AI Interview Question Bank
Curated questions on system design, prompt engineering, RAG, LLM evaluation, and AI agents — sourced from real interviews at Google, Meta, Amazon, and more. With walkthroughs, follow-ups, and the kind of detail that actually helps you prep.
Browse by Category
AI Agents & Tool Use
Autonomous AI agents, function calling, planning architectures, and multi-agent systems.
AI System Design
End-to-end design of AI-powered systems — from architecture to deployment.
LLM Evaluation & Ops
Testing, monitoring, and operating LLMs reliably in production environments.
Prompt Engineering
Designing, evaluating, and optimizing prompts for real-world LLM applications.
RAG & Retrieval
Retrieval-Augmented Generation architectures — combining search with LLMs for grounded, accurate AI.
Browse by Company
Google
AI interview questions reported from Google AI, DeepMind, and Cloud AI engineering roles.
Meta
AI interview questions reported from Meta AI, FAIR, and GenAI engineering roles.
Microsoft
AI interview questions reported from Microsoft Copilot, Azure OpenAI, and AI platform engineering roles.
Amazon
AI interview questions reported from Amazon AWS Bedrock, Alexa AI, and GenAI engineering roles.
OpenAI
AI interview questions reported from OpenAI research, applied AI, and platform engineering roles.
NVIDIA
AI interview questions reported from NVIDIA AI inference, GPU computing, and LLM platform roles.
Anthropic
AI interview questions reported from Anthropic research, safety, and applied AI engineering roles.
All Questions (17 of 39)
How Would You Implement Memory for a Long-Running AI Agent?
Design a memory system for a long-running AI agent — covering in-context working memory, episodic recall, semantic knowledge, and retrieval strategies.
How Do You Decide What Tools to Give an AI Agent?
A framework for deciding which tools to give an AI agent — covering granularity, safety boundaries, observability, and the principle of minimal tool sets.
What Is the Plan-and-Execute Agent Pattern, and When Should You Use It Over ReAct?
Plan-and-Execute separates planning from execution in AI agents. Walk through how it works, how it compares to ReAct, and the tradeoffs in multi-step task completion.
What's the Difference Between OpenAI Function Calling and LangChain Agents?
OpenAI function calling and LangChain agents both let LLMs use tools, but they operate at different abstraction levels. Walk through how each works and when to choose one over the other.
Design a Conversational AI Customer Support System
Design an AI-powered customer support system that handles common queries automatically while escalating complex issues to human agents.
Design a Document Q&A System for a Large Corpus
Design an AI system that answers natural language questions over a large collection of documents, with accurate citations and low hallucination rates.
How Do You Estimate the Cost of Running a Production LLM System?
Walk through how to estimate and model the cost of running an LLM system in production — covering API token costs, self-hosted GPU infrastructure, and the key levers for optimization.
How Do You Build an Eval Suite for an LLM-Powered Feature?
Walk through building a systematic evaluation suite for an LLM feature — from test case design to automated metrics and regression tracking.
How Do You Evaluate a RAG System End-to-End?
RAG evaluation is distinct from general LLM evaluation — it requires measuring both retrieval quality and generation quality independently and together. Walk through the key metrics and frameworks.
What Is Prompt Injection, and How Do You Defend Against It?
Prompt injection is one of the most significant security risks in LLM-powered applications. Walk through the attack types and the layered defenses used in production.
What Strategies Do You Use to Reduce Hallucinations?
Walk through a layered approach to reducing LLM hallucinations — from prompt-level techniques to retrieval grounding and output validation.
How Would You Design a Prompt for Structured Data Extraction?
Design a prompt that reliably extracts structured data (JSON, tables) from unstructured text — handling missing fields, ambiguity, and format errors.
How Do You Handle Chunking Strategies for Different Document Types?
Compare chunking strategies for different document types — PDFs, code, HTML, and tables — and learn when each approach works best.
How Do You Handle Tables, Charts, and Complex Documents in a RAG Pipeline?
Real-world documents contain tables, charts, and complex layouts that naive text extraction mangles. Walk through how to build a robust document processing pipeline for structured and visual content.
Design a RAG Pipeline from Scratch
Walk through designing a production-ready RAG system covering document ingestion, chunking strategies, embedding models, vector search, and LLM generation.
How Would You Evaluate Retrieval Quality in a RAG System?
Walk through metrics and methods for evaluating retrieval quality in a RAG pipeline — from offline metrics to end-to-end answer quality.
How Do Vector Embeddings Work, and How Do You Choose the Right Embedding Model?
Explain what vector embeddings are, how embedding models convert text to vectors, and how you'd benchmark and improve retrieval accuracy for a production RAG system.
Prep for the full interview loop
Know the concepts. Now prove it. Practice GenAI, Coding, System Design, and AI/ML Design interviews with an AI that tells you exactly where you fell short.
Start a mock interview