The Promise vs. The Reality
In demos, tool use looks magical. The agent needs data, calls an API, gets a response, and continues smoothly. In production, you discover:
- APIs return unexpected formats
- Rate limits kick in mid-workflow
- The model hallucinates function parameters
- Tool outputs are too large to fit in context
- Error handling becomes a maze
The gap between “works in a demo” and “works reliably” is where most agent projects die.
This post covers the patterns that actually hold up under real use.
Pattern 1: Start With Read-Only Tools
When you’re starting out, resist the urge to give your agent write permissions immediately.
Why? Read-only tools let you test agent reasoning without risking unintended actions. You can observe:
- Does it choose the right tool?
- Does it parse outputs correctly?
- Does it handle missing data gracefully?
Good first tools:
- search_database() — query but don't modify
- fetch_url() — retrieve content
- read_file() — access local files
- get_current_time() — context awareness
Once your agent reliably uses these, add write capabilities one at a time.
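As a concrete starting point, here is a minimal sketch of a read-only tool set. The function names follow the list above; the registry dict is an illustrative way to expose them to an agent loop, not a specific framework's API.

```python
import datetime

def get_current_time() -> str:
    """Returns the current UTC time in ISO 8601 format. Read-only."""
    return datetime.datetime.now(datetime.timezone.utc).isoformat()

def read_file(path: str) -> str:
    """Reads a local text file. Never writes or deletes."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Registry the agent loop can dispatch against; only safe,
# side-effect-free tools go in at first.
READ_ONLY_TOOLS = {
    "get_current_time": get_current_time,
    "read_file": read_file,
}
```

Because nothing here mutates state, you can let the agent call these freely while you watch its tool selection and output parsing.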
Pattern 2: Provide Clear, Minimal Tool Descriptions
The model decides which tool to call based on the description you provide. If your description is vague or overly complex, the agent will misuse it.
Bad:

```
get_weather(location, units, forecast_days, include_humidity, detailed)
"Gets weather information with various options and settings for different formats"
```

Good:

```
get_weather(city: str) -> dict
"Returns current temperature and conditions for the specified city. Example: get_weather('London')"
```
Rules:
- One clear purpose per tool
- Include an example of correct usage
- Avoid optional parameters unless necessary
- State what the tool returns
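Applied together, the four rules look like this in a tool spec. The schema shape below mirrors common function-calling APIs, but the exact field names are an assumption for illustration:

```python
# One purpose, one required parameter, an example, and a stated return shape.
GET_WEATHER_SPEC = {
    "name": "get_weather",
    "description": (
        "Returns current temperature and conditions for the specified city. "
        "Example: get_weather('London')"
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'London'"},
        },
        "required": ["city"],  # no optional knobs unless genuinely needed
    },
    "returns": "dict with 'temperature_c' and 'conditions' keys",
}
```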
Pattern 3: Handle Tool Errors Gracefully
When a tool fails, the agent needs context to recover. Don’t just return “Error.”
Instead of:

```
Error
```

Return:

```
Error: API rate limit exceeded (60 requests/hour).
You have 45 minutes until reset.
Try using cached data or a different data source.
```
This gives the agent options:
- Wait and retry
- Use alternative tools
- Adjust its plan
Good error messages turn failures into learning opportunities for the agent.
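One way to enforce this is a small wrapper around every tool call that converts exceptions into actionable messages instead of raising. The RateLimitError class and its fields are hypothetical; adapt them to whatever your API client actually throws:

```python
class RateLimitError(Exception):
    """Hypothetical exception carrying rate-limit details."""
    def __init__(self, limit: int, reset_minutes: int):
        super().__init__("rate limit exceeded")
        self.limit = limit
        self.reset_minutes = reset_minutes

def call_with_context(tool, *args, **kwargs):
    """Run a tool; on failure, return an error message the agent
    can actually plan around, rather than a bare 'Error'."""
    try:
        return {"ok": True, "result": tool(*args, **kwargs)}
    except RateLimitError as e:
        return {
            "ok": False,
            "error": (
                f"API rate limit exceeded ({e.limit} requests/hour). "
                f"You have {e.reset_minutes} minutes until reset. "
                "Try using cached data or a different data source."
            ),
        }
    except Exception as e:
        return {
            "ok": False,
            "error": f"{type(e).__name__}: {e}. "
                     "Consider an alternative tool or adjusting the plan.",
        }
```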
Pattern 4: Limit Tool Output Size
A common failure mode: the agent calls a tool, gets back 50KB of JSON, and blows through the context window trying to process it.
Solution: Truncate at the tool level
```python
def search_database(query):
    results = db.query(query)
    # Return only the first 10 results, with key fields truncated
    return [
        {
            "id": r.id,
            "title": r.title,
            "summary": r.summary[:200],
        }
        for r in results[:10]
    ]
```
If the agent needs more detail, it can call get_full_record(id) for specific items.
Principle: Tools should return just enough information for the agent to decide next steps, not everything.
Pattern 5: Use Confirmation For Destructive Actions
For any tool that modifies data, sends messages, or costs money, add a human-in-the-loop confirmation step.
Example flow:

1. Agent calls draft_email(to, subject, body)
2. Tool returns: “Draft created. Preview: [content]. Call send_email(draft_id) to send.”
3. Human reviews and approves
4. Agent calls send_email(draft_id)
This pattern:
- Prevents costly mistakes
- Builds trust with users
- Creates an audit trail
- Lets you test agent judgment before full autonomy
Pattern 6: Tool Chaining With Context Passing
Agents often need to use multiple tools in sequence, passing data from one to the next.
Bad approach: Force the agent to remember everything
Good approach: Design tools to accept references
```python
# Step 1: Search returns IDs
results = search_products(query="wireless headphones")
# Returns: [{"id": "prod_123", "name": "..."}]

# Step 2: Get details by ID
details = get_product_details(product_id="prod_123")

# Step 3: Check inventory with the same ID
stock = check_inventory(product_id="prod_123")
```
The agent doesn’t need to hold full product data in context—just pass IDs between tools.
Pattern 7: Progressive Disclosure
Don’t dump all available tools on the agent at once. Start with a core set, and only provide specialized tools when needed.
Example:
Initial tools:
- search()
- summarize()
- ask_user()
If the agent uses search() for code:

- Add search_github()
- Add read_file()
- Add execute_code()
This keeps the decision space manageable and improves tool selection accuracy.
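A simple way to implement this is a function that decides which tool names to expose on each turn. The domain-to-tools mapping and the keyword trigger below are assumptions for illustration; in practice you might key off the agent's recent tool calls instead of the query text:

```python
CORE_TOOLS = {"search", "summarize", "ask_user"}

# Hypothetical trigger map: when a domain keyword appears,
# unlock that domain's specialized tools.
SPECIALIZED = {
    "code": {"search_github", "read_file", "execute_code"},
    "data": {"query_sql", "plot_chart"},
}

def tools_for(query: str, active: set) -> set:
    """Return the tool names to expose for this turn:
    the core set, plus any unlocked specialized sets."""
    exposed = set(active)
    for domain, extra in SPECIALIZED.items():
        if domain in query.lower():
            exposed |= extra
    return exposed
```

Keeping the exposed set small per turn is what actually improves tool selection accuracy: the model picks from five options, not fifty.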
What We’ve Built At RIL
In our Agentic AI bootcamp, we don’t teach tool use as theory. We build working examples:
- Research Agent: Uses web search, content extraction, and synthesis tools to produce structured briefs
- Data Agent: Queries APIs, cleans responses, and generates visualizations
- Code Agent: Searches repositories, reads files, and suggests improvements
Each one follows these patterns. Each one runs reliably.
The Real Test
You know your tool design works when:
- The agent rarely calls the wrong tool
- Errors lead to recovery, not failure
- You can add new tools without breaking existing workflows
- Users trust the agent to act without constant supervision
If your agent is brittle, the problem is usually tool design, not the model.
Start Simple, Then Scale
Your first agent doesn’t need 50 tools. It needs 3-5 well-designed ones that work together cleanly.
Once that’s solid, adding more is straightforward.
Want to see these patterns in action? Join our Agentic AI Bootcamp and build an agent with real tools, real data, and real deployment by the end of the day.
