Yassir Manaf

Author: Yassir Manaf

Fine-Tuning vs. RAG: How I Actually Choose
Apr 2, 2026, by Yassir Manaf, in AI in Production
Most teams reach for fine-tuning when they need RAG, and RAG when they need fine-tuning. Here’s how I actually make the call.
LLM Agent vs Single Call: How I Decide Before Writing a Line of Code
Mar 31, 2026, by Yassir Manaf, in AI in Production
Most teams reach for agents too early. Here’s the decision framework I use to choose between an LLM agent and a single call — before writing a line of code.
What Building a SaaS for Non-Technical Users Taught Me About Product Assumptions
Mar 29, 2026, by Yassir Manaf, in Builder Stories
I built a SaaS tool for people who’d never used software like it before. Everything I assumed about onboarding, features, and UX turned out to be wrong. Here’s what I learned.
How I Audit an AI System Before It Goes Live
Feb 15, 2026, by Yassir Manaf, in AI in Production
Most AI systems pass the demo but aren’t ready for production. Here’s the six-area audit I run as a fractional Tech Lead — and the gaps I find in every system.
Prompt Injection in Production: 4 Critical Attack Vectors and How to Defeat Them
Jan 13, 2026, by Yassir Manaf, in AI in Production
Prompt injection is easy to miss in testing and dangerous in production. Here’s what it actually looks like in a live LLM system — and the layered defenses that catch it.
RAG in Production: Beyond the Demo
Nov 5, 2025, by Yassir Manaf, in AI in Production
Every RAG demo works. Production is where things fall apart — quietly, in ways that are hard to debug. Here’s what I’ve learned building RAG systems that actually ship.
Structured Outputs with LLMs: Moving Beyond Raw Text
Oct 12, 2025, by Yassir Manaf, in AI in Production
I spent two years writing regex to parse JSON from LLM responses. Structured outputs ended that. Real before/after from a production pipeline — metrics, tradeoffs, and the failure modes that don’t go away.
Multi-Tenant LLM Architecture: Isolation Patterns That Actually Work
Aug 10, 2025, by Yassir Manaf, in AI in Production
The first time a tenant’s prompt leaked into another tenant’s context window, I found out from a support ticket. Here are the multi-tenant isolation patterns that held up in production — and the ones that didn’t.
LLM Caching in Production: The What, When, and How
Jul 13, 2025, by Yassir Manaf, in AI in Production
Caching LLM responses isn’t like caching a REST API. The inputs are fuzzy, the outputs are non-deterministic, and most traditional strategies break. Here’s the caching hierarchy I built in production — and the traps I walked into along the way.
How I Scope and Run Fractional Tech Lead Engagements
Jun 8, 2025, by Yassir Manaf, in Builder Stories
Running a fractional tech lead engagement is not consulting and not freelancing. Here’s how I scope, price, and run them — and what kills them early.