What Is RAG and Why Do Interviewers Ask About It
RAG augments a language model's input with documents retrieved from an external knowledge base. Instead of relying on what the model memorized during training, you fetch relevant context at query time.
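The query-time idea can be sketched in a few lines. This is a toy illustration, not a real retriever: `tokenize`/`retrieve` stand in for an embedding model and vector search, and the resulting prompt would be sent to an LLM.

```python
import re

def tokenize(s: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy relevance: shared-word count (real systems use embedding similarity).
    q = tokenize(query)
    return sorted(corpus, key=lambda doc: -len(q & tokenize(doc)))[:k]

def build_prompt(query: str, context: list[str]) -> str:
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

corpus = [
    "Our refund policy allows returns within 30 days.",
    "Shipping takes 3-5 business days.",
    "Support is available 24/7 via chat.",
]
question = "What is the refund policy?"
prompt = build_prompt(question, retrieve(question, corpus))
```

The point of the sketch: the model never needs to have memorized the refund policy, because the relevant document is fetched and placed in the prompt at query time.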
RAG is one of the most common patterns in production AI systems: it gives an LLM access to proprietary data it never saw during training, without the cost and lead time of fine-tuning.

RAG questions reveal how you think about systems. The pipeline has 6+ stages, each with meaningful tradeoffs. Candidates who walk through the full pipeline and reason about where things break stand out immediately.
The Two Pipelines
Every RAG system has two distinct pipelines with different engineering challenges:
Ingestion is a throughput problem (batch processing, deduplication, freshness). Query is a latency problem (search speed, re-ranking cost, generation time). Confusing the two leads to bad architecture decisions.
Core Concepts
Key Tradeoffs
| Decision | Tradeoff |
|---|---|
| Chunk size | Smaller = more precise retrieval; larger = more context per chunk |
| Retrieved chunks (k) | Higher k = better recall; also higher cost and latency |
| Embedding model size | Larger = better quality; higher latency and storage cost |
| Re-ranking | Better precision; adds ~100–200 ms of latency |
| Chunk overlap | Prevents information loss at boundaries; increases storage by 10–20% |
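The chunk-size and overlap rows above come together in one function. A minimal sketch of fixed-size chunking with overlap (real systems often split on sentence or section boundaries instead of raw character offsets):

```python
def chunk_text(text: str, size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, repeating `overlap` characters
    from the end of each chunk at the start of the next, so a sentence
    cut at a boundary still appears whole in one of the two chunks."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

The storage overhead grows with `overlap / (size - overlap)`, which is the knob behind the 10–20% figure in the table.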
How to Explain This in an Interview
The most common mistake: describing RAG as "just vector search + LLM." Interviewers want to hear about chunking decisions, hybrid retrieval, re-ranking, and evaluation. The second mistake is never discussing failure modes, such as retrieval returning irrelevant chunks, or the model ignoring or contradicting the supplied context.
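On the evaluation point: retrieval quality is usually measured before generation quality, and a standard starting metric is recall@k, the fraction of known-relevant chunks that appear in the top k results. A minimal sketch:

```python
def recall_at_k(relevant: set[str], retrieved: list[str], k: int) -> float:
    """Fraction of relevant chunk ids found in the top-k retrieved ids.
    `retrieved` is assumed to be in rank order, best first."""
    if not relevant:
        return 0.0
    hits = len(relevant & set(retrieved[:k]))
    return hits / len(relevant)
```

Computed over a labeled set of (query, relevant-chunks) pairs, this isolates the retriever: if recall@k is low, no amount of prompt engineering downstream will fix the answers.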
Common Interview Questions
- Design a RAG Pipeline from Scratch — End-to-end pipeline covering all stages
- Chunking Strategies for RAG — Deep dive into document splitting
- Evaluate Retrieval Quality — Metrics for measuring RAG performance
- RAG vs Fine-Tuning — When to use retrieval vs model adaptation
- Design a Hybrid Search System — Combining dense and sparse retrieval
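For the hybrid search question, a detail worth knowing cold is how to merge the dense and sparse result lists. Reciprocal rank fusion (RRF) is a common, tuning-free approach: each document scores 1/(k + rank) in every list it appears in, with k = 60 being the conventional constant. A minimal sketch:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids via reciprocal rank fusion.
    Documents appearing high in multiple lists float to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only looks at ranks, it sidesteps the problem that dense cosine scores and sparse BM25 scores live on incomparable scales.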
What to Practice Next
Browse all RAG & Retrieval interview questions for hands-on practice.
Next module: LLM Evaluation: Metrics, Evals, and Production Monitoring