Author: Yassir Manaf
When Vector Databases Are Overkill
Before you spin up Pinecone, read this. Most production RAG systems don’t need a dedicated vector store — and the operational overhead costs more than the query bill.
AI Observability in Production: The Stack I Actually Use
Standard APM tools miss the failures that matter most in AI systems — the ones where infrastructure is healthy but the model is wrong. Here’s the observability stack I built to catch them.
How I Think About Context Windows in Production LLM Apps
Every token you send to an LLM costs money, adds latency, and past a threshold, degrades quality. Here’s how I manage context windows in production — and why bigger isn’t better.
I Built Conversational AI in 2017 — Here’s What I’d Do Differently with LLMs
I built production conversational AI in 2017 using Rasa, spaCy, and hand-coded dialogue flows. Here’s what broke, what held up, and what I’d do differently with LLMs today.
Azure AI Cost Optimization: Where the Money Actually Goes in Production
The Azure OpenAI bill surprises most teams. Not because of the obvious costs — but because of the ones nobody documented.
LLM Output Validation in Production: What Actually Works
Raw LLM output breaks production systems in ways that have nothing to do with hallucination. Here’s the validation stack that actually works.
What Rasa Production NLP Taught Me That LLMs Still Can’t Replace
I built production NLP systems with Rasa before LLMs changed everything. The constraints Rasa imposed — on intent design, training data, and dialogue control — still apply today.
Event-Driven Architecture with Kafka: What the Tutorials Don’t Tell You
Kafka tutorials show the happy path. This is the other one — the production failures, the trade-offs, and the honest answer to when you shouldn’t use Kafka at all.
How I Debug Distributed Systems Without Losing My Mind
Distributed systems fail in ways that are hard to reproduce and harder to explain. Here’s the debugging workflow I’ve built from years of production incidents — no tools survey, just what actually works.
Prompt Engineering Is Not a Skill. It’s a Process.
Most teams treat prompt engineering as a creative act. It’s not. Here’s the repeatable process I use to version, test, and improve LLM prompts in production.