Intermediate · 5 min read

What's the Difference Between OpenAI Function Calling and LangChain Agents?

OpenAI function calling and LangChain agents both let LLMs use tools, but they operate at different abstraction levels. This article walks through how each works and when to reach for one over the other.

Prep for the full interview loop

Know the concepts. Now prove it. Practice GenAI, Coding, System Design, and AI/ML Design interviews with an AI that tells you exactly where you fell short.

Start a mock interview

Why This Is Asked

This question tests practical depth. Candidates who've only used tutorials often treat LangChain as a magic box. Interviewers want to know whether you understand the underlying mechanism — and whether you can make an informed build-vs-framework decision.

Key Concepts to Cover

  • How function calling works at the API level — JSON schema, model output, execution loop
  • What LangChain adds on top — agent executor, tool abstractions, memory
  • When raw function calling is better — control, simplicity, latency
  • When LangChain (or similar frameworks) add value — rapid prototyping, complex chains
  • Structured output as a related concept — constrained generation

How to Approach This

1. How OpenAI Function Calling Works

Function calling (now called "tool use" in most APIs) is a first-class API feature where:

  1. You declare tools as JSON schemas in the API request
  2. The model decides when and how to call a tool based on the conversation
  3. The model outputs a structured JSON object (not free text) with the tool name and arguments
  4. Your code executes the function and returns the result to the model
  5. The model uses the result to continue the conversation

# Setup: assumes the official OpenAI Python SDK
import json
from openai import OpenAI

client = OpenAI()

# Step 1: Define tools as JSON schemas
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]

# Step 2: Call the API
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# Step 3: Check if the model wants to call a tool
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    # {"location": "Tokyo", "unit": "celsius"}

    # Step 4: Execute the function
    result = get_weather(**args)

    # Step 5: Return result and get final response
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "What's the weather in Tokyo?"},
            response.choices[0].message,
            {"role": "tool", "content": str(result), "tool_call_id": tool_call.id}
        ]
    )

This is the raw mechanism. You control the loop, the execution, and the error handling entirely.
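In production you would wrap steps 2–5 in a loop, since the model may chain several tool calls before answering. A minimal sketch of that loop (the function name, the `registry` mapping, and `max_rounds` are my own; the client and message objects follow the OpenAI SDK shapes used above):

```python
import json

def run_tool_loop(client, model, messages, tools, registry, max_rounds=5):
    """Call the model repeatedly, executing any requested tools, until it
    replies in plain text (or max_rounds is exhausted)."""
    for _ in range(max_rounds):
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools
        )
        message = response.choices[0].message
        if not message.tool_calls:
            return message.content  # final answer, no more tools needed
        messages.append(message)  # keep the assistant's tool request in history
        for call in message.tool_calls:
            args = json.loads(call.function.arguments)
            result = registry[call.function.name](**args)  # look up and run the tool
            messages.append(
                {"role": "tool", "content": str(result), "tool_call_id": call.id}
            )
    raise RuntimeError("model kept requesting tools; giving up")
```

At its core, this loop is what an agent executor automates; owning it yourself is what makes custom retries, timeouts, and logging straightforward.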

2. What LangChain Adds

LangChain's agent abstractions wrap this loop with:

  • AgentExecutor: the outer loop that repeatedly calls the LLM, executes tools, and feeds results back until the agent decides it's done
  • Tool interface: a consistent wrapper around any function (Python function, API call, another chain) that provides name, description, and schema
  • Memory modules: short-term (conversation buffer) and long-term (vector store) memory that persist across turns
  • Chain compositions: combine retrievers, LLMs, and tools into declarative pipelines
  • Callbacks and tracing: LangSmith integration for observability

# LangChain equivalent
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return call_weather_api(location)

llm = ChatOpenAI(model="gpt-4o")

# The prompt needs an agent_scratchpad placeholder for intermediate tool results
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, [get_weather], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather])
result = executor.invoke({"input": "What's the weather in Tokyo?"})

Less code, but you give up direct control of the loop.

3. When to Use Raw Function Calling

  • Predictable, narrow use cases: one or two tools, well-defined task, low failure surface
  • Production latency requirements: every LangChain abstraction layer adds overhead; raw API calls are measurably faster
  • Full control over error handling: you can implement custom retry logic, fallbacks, and validation
  • Avoiding dependency weight: LangChain is a large dependency; for simple tool use, it's overkill
  • When debugging matters: raw function calling is much easier to trace and debug than a LangChain agent executor

4. When Frameworks Like LangChain Add Value

  • Rapid prototyping: connecting many tools quickly without writing orchestration boilerplate
  • Complex pipelines: RAG + tools + memory + chain-of-thought — frameworks provide tested patterns
  • Team unfamiliar with agent patterns: abstractions lower the barrier to entry
  • LangSmith observability: if you're iterating quickly, built-in tracing saves significant time

The honest answer: use LangChain to move fast in early development, then evaluate whether to replace it with direct API calls for the production hot path.

5. Structured Output as a Related Concept

Tool calling is a specific form of constrained generation. A broader technique is structured output / constrained decoding:

  • OpenAI response_format: json_schema: force the model to output a specific JSON schema every time (not tool calling, just structured output)
  • Instructor library: wraps the OpenAI API with Pydantic schema enforcement and automatic retries on validation failure
  • Outlines: lower-level constrained generation that enforces grammar at the token level

Use structured output (not tool calling) when you want deterministic output format but aren't integrating with external APIs — e.g., extracting structured data from text.
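A sketch of that route, using the data-extraction example (the schema dict follows OpenAI's json_schema response_format shape; the `Person` fields and `parse_person` helper are made up for illustration):

```python
import json
from dataclasses import dataclass

# Request-side schema: forces every reply to be a JSON object with these fields.
PERSON_SCHEMA = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
            "additionalProperties": False,
        },
    },
}

@dataclass
class Person:
    name: str
    age: int

def parse_person(raw: str) -> Person:
    """Parse the model's schema-constrained JSON reply into a typed object."""
    data = json.loads(raw)
    return Person(name=data["name"], age=data["age"])

# Pass response_format=PERSON_SCHEMA to client.chat.completions.create(...),
# then feed response.choices[0].message.content to parse_person.
```

Note there is no execution loop here: the schema shapes the output, but nothing gets called, which is exactly the difference from tool calling.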

Common Follow-ups

  1. "How do you handle the case where the model calls a tool with invalid arguments?" Validate the arguments before executing the function. If invalid, return an error message as the tool result and let the model retry with corrected arguments. Use Pydantic for schema validation. Set a max retry count to avoid infinite loops.

  2. "Can you call multiple tools in parallel?" Yes. Modern APIs (OpenAI, Anthropic, Gemini) support parallel tool calls — the model outputs multiple tool_calls in one response. Execute them concurrently (e.g., asyncio.gather) and return all results in a single follow-up message. Significant latency savings for independent tool calls.

  3. "How do you decide which tools to give the model vs. handling logic in code?" Anything deterministic (math, data transformation, lookup tables) should be code — don't burn LLM tokens on arithmetic. Tools should represent capabilities the model needs external data or state for: API calls, database queries, web search, file operations. The model's job is reasoning about which tools to call and how to interpret their results — not replacing logic that's cheaper to implement deterministically.
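The validate-then-retry pattern from the first follow-up can be sketched in a few lines (the helper names and the hand-rolled validator are my own; in practice a Pydantic model would replace the manual checks):

```python
import json

def validate_weather_args(args):
    """Hand-rolled check mirroring the get_weather JSON schema."""
    if not isinstance(args.get("location"), str):
        raise ValueError("'location' must be a string")
    if "unit" in args and args["unit"] not in ("celsius", "fahrenheit"):
        raise ValueError("'unit' must be 'celsius' or 'fahrenheit'")

def safe_tool_result(call, registry, validators):
    """Validate before executing; on failure, return the error text as the
    tool result so the model can correct itself on the next turn."""
    name = call.function.name
    try:
        args = json.loads(call.function.arguments)
        validators[name](args)  # raises ValueError on bad arguments
        content = str(registry[name](**args))
    except (ValueError, KeyError, json.JSONDecodeError) as exc:
        content = f"ERROR: invalid arguments: {exc}"
    return {"role": "tool", "content": content, "tool_call_id": call.id}
```

Cap the number of correction rounds in the outer loop so a persistently confused model cannot retry forever.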
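The parallel execution from the second follow-up might look like this, assuming the tool implementations themselves are async (function names here are illustrative):

```python
import asyncio
import json

async def execute_tool_calls(tool_calls, registry):
    """Run independent tool calls concurrently and collect their results
    as tool messages, preserving the order of the original calls."""
    async def run_one(call):
        args = json.loads(call.function.arguments)
        result = await registry[call.function.name](**args)
        return {"role": "tool", "content": str(result), "tool_call_id": call.id}

    return list(await asyncio.gather(*(run_one(c) for c in tool_calls)))
```

All resulting tool messages go back in a single follow-up request, so total latency is roughly that of the slowest tool rather than the sum.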
