
Tool Use / Function Calling

Extend an LLM's capabilities by giving the model access to external tools and functions it can invoke to take actions or fetch real-time data.

function-calling · tools · api · actions · plugins

Overview

Tool Use (also called Function Calling) lets an LLM go beyond text generation by invoking external tools — APIs, databases, calculators, code interpreters, or any callable function. The LLM decides which tool to use, what arguments to pass, and then incorporates the result into its response.

This pattern transforms LLMs from passive text generators into active agents that can interact with the world.

When to Use

  • LLM needs real-time data (weather, stock prices, search results)
  • Tasks require precise computation (math, date calculations)
  • You want the LLM to take actions (send emails, create records)
  • Multi-step workflows where the LLM needs to gather information incrementally

Architecture

```mermaid
sequenceDiagram
    participant User
    participant LLM
    participant Router as Tool Router
    participant T1 as Calculator
    participant T2 as Weather API
    participant T3 as Database

    User->>LLM: "What's the weather in NYC and convert 72°F to °C?"
    LLM->>Router: call: weather_api(location="NYC")
    Router->>T2: GET /weather?city=NYC
    T2-->>Router: {"temp": "72°F", "condition": "sunny"}
    Router-->>LLM: Result: 72°F, sunny
    LLM->>Router: call: calculator(expr="(72-32)*5/9")
    Router->>T1: calculate
    T1-->>Router: 22.22
    Router-->>LLM: Result: 22.22°C
    LLM-->>User: "NYC is 72°F (22.2°C) and sunny!"
```

How It Works

  1. Define Tools: Describe available functions with names, descriptions, and parameter schemas
  2. LLM Decides: Given a user request, the LLM generates a structured tool call (JSON)
  3. Execute: Your code parses the tool call and executes the actual function
  4. Return Results: Feed the result back to the LLM for final response generation
  5. Iterate: The LLM may chain multiple tool calls before responding
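The five steps above can be sketched as a minimal loop. Here `call_llm` is a stub standing in for a real model call, and every tool and message name is illustrative:

```python
# Step 1: define tools. This hypothetical tool does precise unit conversion.
def fahrenheit_to_celsius(fahrenheit: float) -> float:
    return (fahrenheit - 32) * 5 / 9

TOOLS = {"fahrenheit_to_celsius": fahrenheit_to_celsius}

def call_llm(messages):
    """Stub for a real LLM call.

    A real model would return either a structured tool call (step 2)
    or a final answer, depending on the conversation so far.
    """
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "fahrenheit_to_celsius", "arguments": {"fahrenheit": 72}}
    return {"final": f"72°F is about {float(messages[-1]['content']):.1f}°C."}

def run(user_request: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_steps):  # step 5: allow chained calls, but bounded
        reply = call_llm(messages)
        if "final" in reply:
            return reply["final"]
        fn = TOOLS[reply["tool"]]
        result = fn(**reply["arguments"])                           # step 3: execute
        messages.append({"role": "tool", "content": str(result)})   # step 4: feed back
    return "Stopped: too many tool calls."

print(run("Convert 72°F to °C"))  # → 72°F is about 22.2°C.
```

The `max_steps` cap matters in practice: a confused model can otherwise loop on tool calls indefinitely.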

Implementation

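A minimal sketch of the execute-and-return half of the pattern. The schema follows the JSON-Schema style most function-calling APIs use; `get_weather` and its return shape are assumptions for illustration:

```python
import json

# Tool schema the model sees, in the JSON-Schema style common to function-calling APIs.
TOOL_SPECS = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city. Returns temperature and condition.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]

def get_weather(city: str) -> dict:
    """Hypothetical weather lookup; a real version would call an HTTP API."""
    return {"city": city, "temp_f": 72, "condition": "sunny"}

REGISTRY = {"get_weather": get_weather}

def execute_tool_call(raw_call: str) -> str:
    """Parse the model's JSON tool call, dispatch it, and return a JSON result."""
    call = json.loads(raw_call)       # the model's output is just text
    fn = REGISTRY[call["name"]]       # look up the real function
    result = fn(**call["arguments"])  # execute with the parsed arguments
    return json.dumps(result)         # serialized result goes back to the model

# The model emits something like this as text:
model_output = '{"name": "get_weather", "arguments": {"city": "NYC"}}'
print(execute_tool_call(model_output))
# → {"city": "NYC", "temp_f": 72, "condition": "sunny"}
```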

Gotchas & Best Practices

🚨 Never Trust LLM Output Directly

The LLM generates tool arguments as text. Always validate and sanitize before executing. Never pass LLM output directly to eval(), SQL queries, or shell commands in production.
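One way to honor this rule for a calculator tool is to replace `eval()` with an allow-listed expression walker. This is a sketch, not a hardened sandbox:

```python
import ast
import operator

# Allow-list of operators; anything else in the parsed expression is rejected.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_calculate(expr: str) -> float:
    """Evaluate model-supplied arithmetic without eval()."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("Disallowed expression")
    return walk(ast.parse(expr, mode="eval").body)

print(safe_calculate("(72 - 32) * 5 / 9"))  # ≈ 22.22

try:
    safe_calculate("__import__('os').system('rm -rf /')")  # a Call node, not arithmetic
except ValueError:
    print("rejected")
```

The same idea applies to other tools: parameterized queries instead of string-built SQL, argument lists instead of shell strings.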

🚨 Tool Descriptions Are Critical

Poorly described tools lead to wrong tool selection. Write descriptions like API docs — be precise about what the tool does, what inputs it expects, and what it returns.
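For contrast, here is a vague description next to an API-doc-style one. The tool and its domain are invented for illustration:

```python
# A vague description the model will misuse:
bad = {"name": "search", "description": "Search stuff."}

# A precise description: what it does, what it expects, what it returns,
# and what it is NOT for.
good = {
    "name": "search_orders",
    "description": (
        "Search the orders database by customer email or order ID. "
        "Returns up to 10 matching orders as JSON, newest first. "
        "Use for order-status questions only; does NOT search products."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Customer email or order ID, e.g. 'A-10293'",
            },
            "limit": {"type": "integer", "minimum": 1, "maximum": 10, "default": 5},
        },
        "required": ["query"],
    },
}
```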

⚠️ Limit Tool Count

More tools ≠ better. With too many tools (>15-20), LLMs struggle to pick the right one. Group related tools, use routing, or implement a tool-selection layer.
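A tool-selection layer can be sketched as a two-stage router: pick a group first, then expose only that group's tools to the model. Keyword matching stands in here for what is usually a cheap classifier or a separate LLM call; all group and tool names are hypothetical:

```python
# Hypothetical tool groups; the router narrows the model's choice set.
TOOL_GROUPS = {
    "math": ["calculator", "unit_converter", "date_diff"],
    "data": ["sql_query", "get_report", "export_csv"],
    "comms": ["send_email", "create_ticket", "post_message"],
}

def route(user_request: str) -> list[str]:
    """First-stage router: pick one group, expose only its tools."""
    keywords = {
        "math": ("convert", "calculate", "how many"),
        "data": ("report", "query", "export"),
        "comms": ("email", "ticket", "message"),
    }
    for group, words in keywords.items():
        if any(w in user_request.lower() for w in words):
            return TOOL_GROUPS[group]
    return []  # no group matched: expose nothing rather than everything

print(route("Convert 72°F to °C"))  # → ['calculator', 'unit_converter', 'date_diff']
```

With groups of 3-6 tools each, the model only ever chooses among a handful of options per request.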

💡 Error Handling Is Essential

Tools fail — APIs time out, inputs are invalid. Return clear error messages to the LLM so it can retry with different parameters or explain the failure to the user.
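One pattern is to wrap every tool execution so the model always receives a structured result, success or failure. The ticker-lookup tool is invented for illustration:

```python
import json

def get_stock_price(ticker: str) -> float:
    """Hypothetical price lookup that validates its input."""
    prices = {"AAPL": 187.5}  # stand-in for a real API call
    if ticker not in prices:
        raise KeyError(f"Unknown ticker '{ticker}'")
    return prices[ticker]

def run_tool(fn, **kwargs) -> str:
    """Execute a tool and always return something the model can act on."""
    try:
        return json.dumps({"ok": True, "result": fn(**kwargs)})
    except Exception as exc:
        # A clear, specific error message lets the model retry with
        # different parameters or explain the failure to the user.
        return json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})

print(run_tool(get_stock_price, ticker="AAPL"))  # ok: true, result: 187.5
print(run_tool(get_stock_price, ticker="APPL"))  # ok: false, error names the bad ticker
```

Swallowing the exception and returning an empty string is the failure mode to avoid: the model then has nothing to reason about.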

💡 Include Examples in Tool Descriptions

Adding example inputs/outputs to tool descriptions dramatically improves tool use accuracy. Example: "Calculate math expression. Example: calculator('2 + 2') → '4'"

Variations

  • Single-turn — One tool call per request
  • Multi-turn — LLM chains multiple tool calls iteratively
  • Parallel — Multiple independent tool calls executed simultaneously
  • Nested — Tool results trigger further tool calls
  • Human-in-the-loop — Require approval for sensitive tool calls
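The parallel variation can be sketched with a thread pool: when the model requests several independent tool calls in one turn, they run concurrently instead of back-to-back. The two tools here are stand-ins with simulated latency:

```python
import concurrent.futures
import time

def fetch_weather(city: str) -> str:
    time.sleep(0.1)  # stand-in for network latency
    return f"{city}: sunny"

def fetch_news(topic: str) -> str:
    time.sleep(0.1)
    return f"{topic}: 3 headlines"

# Two independent tool calls the model requested in a single turn.
calls = [(fetch_weather, "NYC"), (fetch_news, "tech")]

with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(fn, arg) for fn, arg in calls]
    results = [f.result() for f in futures]  # collected in submission order

print(results)  # both finish in ~0.1s instead of ~0.2s sequentially
```

This only applies when the calls are truly independent; nested calls, where one result feeds the next, must stay sequential.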

Further Reading