39 questions · Free · No signup

AI Interview Question Bank

Curated questions on system design, prompt engineering, RAG, LLM evaluation, and AI agents — sourced from real interviews at Google, Meta, Amazon, and more. With walkthroughs, follow-ups, and the kind of detail that actually helps you prep.

All Questions (15 of 39)

AI Agents · Advanced
Google, Meta, Microsoft +1

Design an AI Agent That Can Book Travel End-to-End

Design a multi-step AI agent that books flights, hotels, and transportation — covering tool design, planning loops, error recovery, and user confirmation.

AI Agents · Advanced
Google, Meta, Microsoft +2

Design a Multi-Agent System for Software Development

Design a multi-agent system where specialized agents collaborate on software development — covering orchestration, communication, coordination, and failure modes.

AI System Design · Advanced
Google, Microsoft, Meta

Design an AI-Powered Code Review System

Design a system that uses LLMs to automatically review pull requests — identifying bugs, style issues, and suggesting improvements at scale.

AI System Design · Advanced
Meta, Google, Microsoft

Design a Real-Time Content Moderation Pipeline Using LLMs

Design a scalable content moderation system that uses LLMs to detect harmful content in real time while minimizing false positives and latency.

AI System Design · Advanced
OpenAI, Google, Meta +1

Design a Production LLM Chat System (Design ChatGPT)

Walk through the architecture of a production LLM-powered chat system — covering streaming responses, conversation history management, context window limits, multi-user scaling, and safety.

AI System Design · Advanced
Google, Meta, Microsoft +2

How Would You Architect a Multi-Model AI Gateway?

Design a unified gateway that routes requests across multiple LLM providers, handles fallbacks, enforces rate limits, and tracks costs per team.

AI System Design · Advanced
Microsoft, Google, Amazon

How Do You Architect a Multi-Tenant LLM Deployment with Role-Based Data Access?

Enterprise AI products serve multiple customers from shared infrastructure. Walk through how to design tenant isolation, role-based access control, and data governance for a multi-tenant LLM deployment.

LLM Eval & Ops · Advanced
Google, Meta, Microsoft +2

How Would You Detect and Handle LLM Output Regressions?

Build a system to detect when LLM output quality degrades — covering statistical monitoring, automated quality checks, and incident response.

LLM Eval & Ops · Advanced
Google, Meta, NVIDIA +1

How Do You Optimize LLM Inference for Higher Throughput and Lower Latency?

Walk through the key techniques for optimizing LLM inference performance in production — KV cache management, quantization, continuous batching, and speculative decoding.

LLM Eval & Ops · Advanced
Google, Meta, Microsoft +1

How Do You Handle Model Version Upgrades Without Breaking Production?

A safe, systematic approach to upgrading LLM model versions in production — from pre-upgrade evaluation to canary deployment and rollback.

Prompt Engineering · Advanced
Google, Meta, Microsoft +1

Compare Few-Shot Prompting vs. Fine-Tuning for a Classification Task

Understand when to use few-shot prompting versus fine-tuning for classification — covering cost, data requirements, latency, and when each approach wins.

RAG & Retrieval · Advanced
Google, Meta, Microsoft +1

A Client's RAG System Has Poor Retrieval Accuracy — How Do You Fix It?

A RAG-based system isn't returning accurate results. Walk through a systematic process to diagnose the root cause and improve retrieval quality.

RAG & Retrieval · Advanced
Google, Meta, Microsoft +1

Design a Hybrid Search System Combining Semantic and Keyword Search

Design a search system that combines dense vector search with sparse keyword search — outperforming either approach alone through intelligent score fusion.

RAG & Retrieval · Advanced
Google, Meta, Microsoft +1

How Do You Handle Multi-Hop and Multifaceted Queries in a RAG System?

Single-shot retrieval breaks down for complex questions that require reasoning across multiple documents. Walk through strategies to handle multi-hop and multifaceted queries.

RAG & Retrieval · Advanced
Google, Meta, Microsoft +2

How Do You Choose a Vector Index and Vector Database for a RAG System?

Compare vector index types — HNSW, IVF, PQ, LSH — and explain how to choose the right vector database given scale, latency, filtering, and cost requirements.

Prep for the full interview loop

Know the concepts. Now prove it. Practice GenAI, Coding, System Design, and AI/ML Design interviews with an AI that tells you exactly where you fell short.