files-to-prompt sqlite-utils llm llm-gemini llm-anthropic -e md -e py -c | \
llm -m gemini-2.5-pro-exp-03-25 -s '
Provide a detailed design proposal for adding tool
calling support to LLM
Different model plugins will have different ways of
executing tools, but LLM itself should provide both
a Python and a CLI abstraction over these
differences.
Tool calling involves passing tool definitions to a
model, the model then replies with tools it would
like executed, the harness code then executes those
and passes the result back to the model and
continues in a loop until the model stops
requesting tools. This should happen as part of the
existing llm.Conversation mechanism.
At the Python layer tools should be provided as
Python functions that use type hints, which can
then be converted into JSON schema using Pydantic -
similar to how the existing schema= mechanism
works.
For the CLI option tools will be provided as an
option passed to the command, similar to how the
sqlite-utils convert CLI mechanism works.'
Okay, here is a detailed design proposal for adding tool calling support to `llm`.
- Integrate Tool Calling: Enable LLMs that support tool calling (function calling) to request the execution of user-provided tools within a conversation.
- Python API: Allow users to define tools as Python functions with type hints. Automatically generate JSON Schema definitions using Pydantic. Integrate seamlessly with `llm.Conversation`.
- CLI API: Allow users to specify tools via the command line, likely by referencing Python files containing tool functions.
- Abstraction: The core `llm` library should provide a consistent interface for users, abstracting away the specific implementation details of different model providers (OpenAI, Anthropic, Gemini, etc.).
- Plugin System: Model plugins should be responsible for translating between LLM's internal tool representation and the specific API format of the underlying model.
- Conversation Context: Tool calls and their results should become part of the ongoing conversation history sent back to the model.
- Tool Definition: A representation of a tool that can be understood by both the user's code and the LLM (after conversion).
  - Python: A standard Python function with type hints and a docstring.
  - Internal/Schema: A JSON Schema object describing the function's name, description, and parameters (types, descriptions, required fields).
- Tool Call Request: A structured request from the LLM to execute a specific tool with specific arguments.
  - Contains: Tool name, arguments (dictionary), potentially a unique call ID if the API provides one.
- Tool Result: The outcome of executing a requested tool.
  - Contains: Tool name, call ID (if applicable), the return value (JSON serializable), or an error message if execution failed.
Tools will be defined as standard Python functions using type hints and docstrings. Pydantic will be used internally to generate the JSON Schema.
import llm
from pydantic import BaseModel, Field
from typing import Literal

def get_current_weather(
    location: str = Field(..., description="The city and state, e.g. San Francisco, CA"),
    unit: Literal["celsius", "fahrenheit"] = "fahrenheit"
) -> dict:
    """
    Get the current weather in a given location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: The unit to use for the temperature, celsius or fahrenheit
    """
    # Implementation would go here - fetch weather data
    if "tokyo" in location.lower():
        return {"location": location, "temperature": "10", "unit": unit, "forecast": ["sunny", "windy"]}
    elif "san francisco" in location.lower():
        return {"location": location, "temperature": "72", "unit": unit, "forecast": ["cloudy", "foggy"]}
    else:
        return {"location": location, "temperature": "unknown"}

# Example using a Pydantic model for more complex arguments
class GetNthFibonacciRequest(BaseModel):
    n: int = Field(..., description="The index of the Fibonacci number to calculate (0-based)", ge=0, le=100)

def get_nth_fibonacci(req: GetNthFibonacciRequest) -> dict:
    """Calculates the nth Fibonacci number."""
    a, b = 0, 1
    if req.n == 0:
        return {"result": a}
    for _ in range(req.n - 1):
        a, b = b, a + b
    return {"result": b}

# List of tools to be passed to the model
my_tools = [get_current_weather, get_nth_fibonacci]
- Docstrings: The main function docstring becomes the tool's description. Parameter descriptions can be added using `Field(..., description="...")` or potentially extracted from standard docstring formats (e.g., Google style) as a fallback.
- Type Hints: Used by Pydantic to generate the JSON Schema parameter types (`string`, `integer`, `number`, `boolean`, `array`, `object`). `Literal` is used for enums. Pydantic models can be used for complex nested arguments (a sketch of this conversion appears below).
- Return Type: The function should return a JSON-serializable dictionary or other value.
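The proposal leaves the exact conversion helper unspecified; here is a minimal sketch of how `llm.function_to_json_schema()` (the name used in the internal flow below) might work, assuming Pydantic v2 — `create_model` and `model_json_schema()` are real Pydantic APIs, but the output shape here is an assumption:

import inspect
from typing import Any, Dict
from pydantic import create_model

def function_to_json_schema(func) -> Dict[str, Any]:
    # Build a throwaway Pydantic model from the function signature,
    # then ask Pydantic for the JSON Schema of the parameters.
    fields = {}
    for param_name, param in inspect.signature(func).parameters.items():
        annotation = Any if param.annotation is inspect.Parameter.empty else param.annotation
        default = ... if param.default is inspect.Parameter.empty else param.default
        fields[param_name] = (annotation, default)
    params_model = create_model(f"{func.__name__}_params", **fields)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params_model.model_json_schema(),
    }

Called on `get_current_weather` above, this would mark `location` as a required string (the `Field(...)` default carries its description through) and render `unit` as an enum with a default.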
A new `tools` parameter will be added to `Conversation.prompt()` (and `AsyncConversation.prompt()`).
import llm

model = llm.get_model("gpt-4o")  # Or another tool-capable model
conversation = model.conversation()

# The tool calling loop is handled internally by prompt()
response = conversation.prompt(
    "What's the weather like in San Francisco and what is the 5th Fibonacci number?",
    tools=my_tools
)

# The final response from the model after tool execution
print(response.text())
Internal Flow:

1. `conversation.prompt()` receives the `tools` list.
2. It converts each function into its JSON Schema representation using a helper function (e.g., `llm.function_to_json_schema(func)`).
3. It passes the prompt text and the list of tool schemas to the underlying `model.execute()` method.
4. The `model.execute()` implementation (within the plugin) interacts with the specific model API, providing the tool schemas.
5. If the model API returns a request to call a tool, `model.execute()` yields a standardized `ToolCallRequest` object (defined in `llm`).
6. `conversation.prompt()` (or its internal iterator) intercepts the `ToolCallRequest`.
7. It finds the corresponding Python function in the `my_tools` list based on the requested tool name.
8. It validates the arguments provided by the LLM against the function's signature/schema (using Pydantic).
9. It executes the Python function with the validated arguments.
10. It captures the return value (or any exception).
11. It packages the result into a standardized `ToolResult` object.
12. It passes the `ToolResult` back to the `model.execute()` generator/iterator (perhaps via `.send()` or a new model method like `model.continue_execution(tool_results)`; a sketch follows this list).
13. The `model.execute()` implementation sends the tool result back to the model API.
14. The loop continues: the model might respond with another tool call or with the final text response.
15. `conversation.prompt()` yields/returns the final text chunks/`Response` object once the model stops requesting tool calls.
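To make step 12 concrete, here is a hedged sketch of the harness loop that `conversation.prompt()` could run over the `execute()` generator, using the `ToolCallRequest`/`ToolResult` dataclasses defined just below (the function and variable names are illustrative, not part of the proposal):

def _run_tool_loop(execute_gen, tools_by_name):
    # Drive the plugin's execute() generator: collect text chunks,
    # execute any requested tools, and send their results back in.
    chunks = []
    try:
        item = next(execute_gen)
        while True:
            if isinstance(item, ToolCallRequest):
                func = tools_by_name.get(item.name)
                if func is None:
                    result = ToolResult(id=item.id, name=item.name,
                                        error=f"Unknown tool: {item.name}")
                else:
                    try:
                        result = ToolResult(id=item.id, name=item.name,
                                            result=func(**item.arguments))
                    except Exception as ex:
                        result = ToolResult(id=item.id, name=item.name, error=str(ex))
                item = execute_gen.send(result)  # Resume execute() with the result
            else:
                chunks.append(str(item))  # Plain text or Chunk
                item = next(execute_gen)
    except StopIteration:
        pass
    return "".join(chunks)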
`Response` Object Enhancements:

The `Response` object will need attributes to store the history of tool calls and results made during the generation of that specific response.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

@dataclass
class ToolCallRequest:
    id: Optional[str]  # Unique ID for the call, if provided by the model API
    name: str
    arguments: Dict[str, Any]

@dataclass
class ToolResult:
    id: Optional[str]  # Matching ID from the request, if available
    name: str
    result: Optional[Any] = None  # JSON serializable result
    error: Optional[str] = None  # Error message if execution failed

class Response:
    # ... existing attributes ...
    tool_calls: List[Tuple[ToolCallRequest, ToolResult]] = field(default_factory=list)
    # Internal list to track the sequence of calls and results for this response turn.
The `Response.log_to_db()` method will need updating to store this tool call history, likely in new database tables.
The `AsyncConversation.prompt()` method will mirror the synchronous API, accepting the `tools` parameter and handling the tool calling loop, using `async`/`await` for tool execution if the tool functions themselves are async. If tool functions are synchronous, they will be run in a thread pool executor via `asyncio.to_thread`.
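A minimal sketch of that dispatch decision, assuming only that tools are plain callables or coroutine functions:

import asyncio
import inspect

async def _call_tool(func, arguments):
    # Await async tools directly; push sync tools onto a thread
    # so they cannot block the event loop.
    if inspect.iscoroutinefunction(func):
        return await func(**arguments)
    return await asyncio.to_thread(func, **arguments)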
A new `--tool` option will be added to `llm prompt` and `llm chat`. It will accept the path to a Python file.
llm prompt "Weather in SF?" --tool path/to/my_tools.py
llm chat -m gemini-1.5-flash --tool path/to/my_tools.py
The specified Python file (`my_tools.py`) should contain one or more functions intended to be used as tools.
Discovery: By default, `llm` will import the file and make all functions defined directly within that module available as tools. Alternatively, a decorator could be introduced (`@llm.tool`) to explicitly mark functions for exposure, providing better control. Let's start with importing all functions for simplicity.
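A sketch of that import-and-discover step using only standard library machinery (the helper name is hypothetical):

import importlib.util
import inspect

def load_tools_from_file(path):
    # Import the file as a module, then keep only the functions that
    # were defined directly in it (skipping imported names like Field).
    spec = importlib.util.spec_from_file_location("llm_user_tools", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return [
        obj for name, obj in vars(module).items()
        if inspect.isfunction(obj)
        and obj.__module__ == module.__name__
        and not name.startswith("_")
    ]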
1. The `llm prompt` (or `chat`) command parses the `--tool file.py` options.
2. For each file, it imports the module.
3. It inspects the module, finding all function objects.
4. For each function, it generates the JSON Schema (using the same mechanism as the Python API).
5. It passes the list of tool schemas to the selected model via the `prompt()` method (which internally calls `model.execute()`).
6. When the underlying `model.execute()` yields a `ToolCallRequest`, the CLI harness:
   - Finds the corresponding function object within the imported modules.
   - Validates arguments.
   - Executes the function.
   - Captures the result/error.
   - Creates the `ToolResult` object.
   - Sends the result back into the model execution loop.
7. Prints the final text response to the console.
# path/to/my_tools.py
from pydantic import Field
from typing import Literal

def get_current_weather(
    location: str = Field(..., description="The city and state, e.g. San Francisco, CA"),
    unit: Literal["celsius", "fahrenheit"] = "fahrenheit"
) -> dict:
    """Get the current weather in a given location."""
    # Implementation...
    if "tokyo" in location.lower():
        return {"location": location, "temperature": "10", "unit": unit, "forecast": ["sunny", "windy"]}
    # ... rest of implementation
    return {"location": location, "temperature": "unknown"}

# Another tool
def calculate_sum(a: int, b: int) -> dict:
    """Calculates the sum of two integers."""
    return {"sum": a + b}
The `Model` (and `AsyncModel`, `KeyModel`, `AsyncKeyModel`) base classes need modifications to support tool calling.
- Capability Flag:

class Model:
    ...
    supports_tool_calling: bool = False
    # Optional: If the model supports structured output matching a schema
    # distinct from general tool calling (like Gemini function calling vs JSON mode)
    supports_schema: bool = False

  Plugins set `supports_tool_calling = True` if their model supports the tool call/result loop. `supports_schema` remains for models that can only force JSON output matching a schema without the back-and-forth tool loop.
- Modified `execute()` Signature (Conceptual): The existing `execute()` method needs to handle the tool calling loop. A possible way is for it to yield special objects.

from typing import Iterator, Union, List, Dict, Any, Optional, AsyncGenerator

@dataclass
class ToolCallRequest:
    id: Optional[str]
    name: str
    arguments: Dict[str, Any]

@dataclass
class ToolResult:
    id: Optional[str]
    name: str
    result: Optional[Any] = None
    error: Optional[str] = None

# Standardized input/output for execute's tool interaction
ToolInput = Union[str, ToolResult]
ToolOutput = Union[str, Chunk, ToolCallRequest]

class Model(ABC):
    # ... existing methods ...
    @abstractmethod
    def execute(
        self,
        prompt: Prompt,  # Now includes prompt.tool_schemas list
        stream: bool,
        response: Response,
        conversation: Optional[Conversation],
    ) -> Iterator[ToolOutput]:  # Yields text chunks OR ToolCallRequest
        """
        Executes the prompt. If the model supports tools, this method
        is responsible for the loop:
        1. Send prompt + tool schemas to the model API.
        2. Parse response:
           - If text response, yield text chunks.
           - If tool call request, yield ToolCallRequest object.
        3. If ToolCallRequest was yielded, expect a ToolResult to be
           sent back via the generator's .send() method.
        4. Send ToolResult to the model API.
        5. Go back to step 2.
        """
        pass

class AsyncModel(ABC):
    # ... similar changes for async execute ...
    @abstractmethod
    async def execute(
        self,
        prompt: Prompt,
        stream: bool,
        response: AsyncResponse,
        conversation: Optional[AsyncConversation],
    ) -> AsyncGenerator[ToolOutput, ToolInput]:  # async yield, async send
        pass
Explanation of `execute` Changes:

- The `Prompt` object passed to `execute` will now contain an additional (optional) field like `prompt.tool_schemas: List[dict]`.
- `execute` now yields `ToolOutput`, which can be text (`str`), a `Chunk`, or a `ToolCallRequest`.
- The caller (the `Conversation.prompt` implementation) iterates the generator.
- If it receives a `ToolCallRequest`, it executes the tool and uses `generator.send(tool_result)` to send the `ToolResult` back into the `execute` generator.
- The `execute` implementation needs to handle receiving this `ToolResult` (e.g., via the value of a `yield` expression) and continue the API interaction.

This generator-based approach allows the core `llm` library to manage the execution harness while the plugin focuses solely on the API communication details.
The `logs.db` schema needs updating to store tool interactions.

- New Table: `tool_calls`
  - `id`: TEXT PRIMARY KEY (e.g., ULID)
  - `response_id`: TEXT, FK to `responses.id`
  - `request_id`: TEXT (optional, the ID from the API call if provided)
  - `tool_name`: TEXT
  - `arguments`: TEXT (JSON dictionary)
  - `result`: TEXT (JSON representation of the return value)
  - `error`: TEXT (Error message if execution failed)
  - `timestamp_utc`: TEXT (When the tool call was processed)
- Modify `responses` table:
  - Consider adding a boolean `has_tool_calls` column for easier querying, although joining `tool_calls` is also possible.

The `Response.log_to_db()` method will be updated to insert records into the `tool_calls` table based on the content of `response.tool_calls`.
- Tool Not Found: If the LLM requests a tool not provided by the user, the harness returns a `ToolResult` with an appropriate error message.
- Argument Validation Failure: If the LLM provides arguments that don't match the tool's schema, Pydantic raises an error during validation. The harness returns a `ToolResult` with the validation error details (see the combined helper sketched after this list).
- Tool Execution Error: If the user's Python function raises an exception, the harness catches it and returns a `ToolResult` with the exception message/traceback.
- Model Does Not Support Tools: If `tools=` is passed to a model where `supports_tool_calling` is `False`, `prompt()` should raise a `TypeError` or `ValueError`.
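The middle two failure modes could share one helper; this sketch assumes Pydantic v2's `validate_call` for argument validation (the helper itself is illustrative, not part of the proposal):

from pydantic import ValidationError, validate_call

def _execute_tool(func, request):
    # Validate the LLM-supplied arguments against the signature, then
    # run the tool, converting every failure into a ToolResult error
    # rather than crashing the conversation loop.
    try:
        value = validate_call(func)(**request.arguments)
        return ToolResult(id=request.id, name=request.name, result=value)
    except ValidationError as ex:
        return ToolResult(id=request.id, name=request.name,
                          error=f"Invalid arguments: {ex}")
    except Exception as ex:
        return ToolResult(id=request.id, name=request.name,
                          error=f"{ex.__class__.__name__}: {ex}")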
1. Define `ToolCallRequest` and `ToolResult`: Add these dataclasses to `llm.models`.
2. Schema Generation: Implement `llm.function_to_json_schema()` using Pydantic.
3. Update `Prompt`: Add a `tool_schemas: List[dict]` attribute.
4. Update `Response`: Add a `tool_calls: List[Tuple[ToolCallRequest, ToolResult]]` attribute.
5. Modify `Model.execute`: Change signature/yield type as described in section 5. Update documentation.
6. Modify `Conversation.prompt`:
   - Add the `tools` parameter.
   - Implement schema generation.
   - Implement the loop that iterates over `model.execute()`.
   - Handle `ToolCallRequest`: find function, validate args, execute, create `ToolResult`.
   - Use `generator.send(tool_result)` to continue execution.
   - Store `(request, result)` tuples in `response.tool_calls`.
7. Update Async Variants: Implement the async versions of the above.
8. CLI Implementation:
   - Add the `--tool` option to `llm prompt`/`llm chat`.
   - Implement module import and function discovery.
   - Pass discovered tools (as functions) to the `prompt()` call.
9. Database: Implement schema migrations and update `Response.log_to_db()`.
10. Update OpenAI Plugin: Modify the OpenAI plugin's `execute()` methods to handle the new tool calling flow (sending schemas, parsing tool call requests, handling results).
11. Testing: Add comprehensive tests for Python API, CLI, plugin interactions, error handling, and database logging.
12. Documentation: Update user docs (CLI, Python API) and plugin author docs.
This design provides a flexible and extensible way to integrate tool calling, leveraging existing patterns (Pydantic, JSON Schema) while keeping the core logic abstracted from specific model implementations.

Okay, here is a detailed design proposal for adding tool calling support to `llm`.
- Integrate Tool Calling: Enable LLMs that support tool calling (function calling) to request the execution of user-provided tools within a conversation.
- Python API: Allow users to define tools as Python functions with type hints. Automatically generate JSON Schema definitions using Pydantic. Integrate seamlessly with `llm.Conversation`.
- CLI API: Allow users to specify tools via the command line by referencing Python files containing tool functions.
- Abstraction: The core `llm` library should provide a consistent interface for users, abstracting away the specific implementation details of different model providers (OpenAI, Anthropic, Gemini, etc.).
- Plugin System: Model plugins should be responsible for translating between LLM's internal tool representation and the specific API format of the underlying model.
- Conversation Context: Tool calls and their results should become part of the ongoing conversation history sent back to the model.
- Tool Definition: A representation of a tool that can be understood by both the user's code and the LLM (after conversion).
  - Python: A standard Python function with type hints and a docstring.
  - Internal/Schema: A JSON Schema object describing the function's name, description, and parameters (types, descriptions, required fields).
- Tool Call Request: A standardized structure representing a request from the LLM to execute a specific tool with specific arguments.
  - Contains: `id` (Optional[str], unique ID for the call, if provided by the model API), `name` (str, tool name), `arguments` (Dict[str, Any]).
- Tool Result: A standardized structure representing the outcome of executing a requested tool.
  - Contains: `id` (Optional[str], matching ID from the request), `name` (str, tool name), `result` (Optional[Any], JSON serializable return value), `error` (Optional[str], error message if execution failed).
Tools will be defined as standard Python functions using type hints and docstrings. Pydantic will be used internally to generate the JSON Schema. The existing `schema=` mechanism's infrastructure can be leveraged or extended.
# examples/tools.py
import llm
from pydantic import BaseModel, Field
from typing import Literal

def get_current_weather(
    location: str = Field(..., description="The city and state, e.g. San Francisco, CA"),
    unit: Literal["celsius", "fahrenheit"] = "fahrenheit"
) -> dict:
    """
    Get the current weather in a given location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: The unit to use for the temperature, celsius or fahrenheit
    """
    # Implementation would go here - fetch weather data
    print(f"*** Tool: get_current_weather called with location={location}, unit={unit} ***")
    if "tokyo" in location.lower():
        return {"location": location, "temperature": "10", "unit": unit, "forecast": ["sunny", "windy"]}
    elif "san francisco" in location.lower():
        return {"location": location, "temperature": "72", "unit": unit, "forecast": ["cloudy", "foggy"]}
    else:
        return {"location": location, "temperature": "unknown"}

class GetNthFibonacciRequest(BaseModel):
    n: int = Field(..., description="The index of the Fibonacci number to calculate (0-based)", ge=0, le=100)

def get_nth_fibonacci(req: GetNthFibonacciRequest) -> dict:
    """Calculates the nth Fibonacci number."""
    print(f"*** Tool: get_nth_fibonacci called with n={req.n} ***")
    a, b = 0, 1
    if req.n == 0:
        return {"result": a}
    for _ in range(req.n - 1):
        a, b = b, a + b
    return {"result": b}

# Can also define tools that raise errors
def sometimes_errors(succeed: bool = True) -> dict:
    """This tool sometimes raises an error."""
    print(f"*** Tool: sometimes_errors called with succeed={succeed} ***")
    if succeed:
        return {"status": "success"}
    else:
        raise ValueError("You asked for an error!")

# List of tools to be passed to the model
my_tools = [get_current_weather, get_nth_fibonacci, sometimes_errors]
- Docstrings: The main function docstring becomes the tool's description. Parameter descriptions can be added using `Field(..., description="...")` or extracted from standard docstring formats (e.g., Google style - TBD).
- Type Hints: Used by Pydantic to generate the JSON Schema parameter types. Pydantic models can define complex nested arguments.
- Return Type: The function should return a JSON-serializable dictionary or other value. Non-dict values will likely be wrapped in a standard structure like `{"result": value}` (see the sketch below).
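That wrapping rule could be as small as this (an assumption - the proposal does not pin it down):

def _normalize_result(value):
    # Ensure the model always receives a JSON object, even when a
    # tool returns a bare string, number or list.
    return value if isinstance(value, dict) else {"result": value}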
A new `tools` parameter will be added to `Conversation.prompt()` (and `AsyncConversation.prompt()`), accepting a list of callable functions.
import llm

# Assuming my_tools is defined as above
# from examples.tools import my_tools

# Use an OpenAI model that supports tool calling
# Ensure API key is configured via llm keys set openai or OPENAI_API_KEY
model = llm.get_model("gpt-4o")
conversation = model.conversation()

# The tool calling loop is handled internally by prompt()
response = conversation.prompt(
    "What's the weather like in San Francisco and what is the 5th Fibonacci number?",
    tools=my_tools,
    stream=False  # Use stream=False for easier debugging of tool loops
)

# The final response from the model after tool execution
print(response.text())

# You can inspect the tool calls made during the response generation
for request, result in response.tool_calls:
    print(f"Request: {request.name}({request.arguments})")
    if result.error:
        print(f"  Error: {result.error}")
    else:
        print(f"  Result: {result.result}")

# Example with a tool error
response_error = conversation.prompt(
    "Use the sometimes_errors tool and make it fail",
    tools=my_tools,
    stream=False
)
print(response_error.text())
for request, result in response_error.tool_calls:
    print(f"Request: {request.name}({request.arguments})")
    if result.error:
        print(f"  Error: {result.error}")
    else:
        print(f"  Result: {result.result}")
Internal Flow (Revised based on thought process Option 3):

1. `conversation.prompt()` receives the `tools` list (of Python functions).
2. It generates JSON Schema representations for each tool using an internal helper (`_function_to_schema`).
3. It adds these schemas to the `Prompt` object (`prompt.tool_schemas`).
4. The `conversation.prompt()` method (or its internal iterator/async handler `_execute_prompt_internal`) manages the tool-calling loop.
5. It calls `model.execute(prompt, ...)`, which returns a generator.
6. It iterates through the generator returned by `model.execute()`.
7. If `model.execute()` yields `str` or `Chunk`, yield it to the user (if streaming) and accumulate text.
8. If `model.execute()` yields a `ToolCallRequest` object:
   - Look up the corresponding Python function from the `tools` list.
   - Error Handling: If not found, create a `ToolResult` with an error and `.send()` it back.
   - Validate the arguments from `ToolCallRequest.arguments` against the function signature/schema using Pydantic.
   - Error Handling: If validation fails, create a `ToolResult` with the validation error and `.send()` it back.
   - Execute the Python function (using `asyncio.to_thread` if it is a sync function in an async context).
   - Error Handling: If the function raises an exception, catch it, create a `ToolResult` with the error message/traceback and `.send()` it back.
   - If execution succeeds, create a `ToolResult` with the return value.
   - Store the `(ToolCallRequest, ToolResult)` pair in `response._tool_calls_internal`.
   - Use `generator.send(tool_result)` to send the result back into the `model.execute()` generator.
9. The loop continues until `model.execute()` finishes yielding (signifying the model has produced its final text response).
10. The accumulated text and the list of tool calls are finalized on the `Response` object.
from dataclasses import dataclass, field
# ... other imports ...

@dataclass
class ToolCallRequest:
    id: Optional[str] = None  # Unique ID for the call, if provided by the model API
    name: str = ""
    arguments: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ToolResult:
    id: Optional[str] = None  # Matching ID from the request, if available
    name: str = ""
    result: Optional[Any] = None  # JSON serializable result
    error: Optional[str] = None  # Error message if execution failed

class Response:
    # ... existing attributes ...

    # Internal list, populated during generation
    _tool_calls_internal: List[Tuple[ToolCallRequest, ToolResult]] = field(default_factory=list)

    @property
    def tool_calls(self) -> List[Tuple[ToolCallRequest, ToolResult]]:
        """A list of (request, result) tuples for tool calls made during this response generation."""
        self._force()  # Ensure iteration is complete
        return self._tool_calls_internal

    def log_to_db(self, db):
        # ... existing logging ...
        # Add logic to insert into the new tool_calls table
        # using self.tool_calls and the response_id
        if self.tool_calls:
            tool_call_records = []
            for request, result in self.tool_calls:
                tool_call_records.append({
                    "response_id": self.id,  # Assuming response.id is set before log_to_db
                    "request_id": request.id,
                    "tool_name": request.name,
                    "arguments": json.dumps(request.arguments),
                    "result": json.dumps(result.result) if result.result is not None else None,
                    "error": result.error,
                    "timestamp_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                })
            db["tool_calls"].insert_all(tool_call_records, pk="id")  # Need unique ID? Or composite?

# Similar changes needed for AsyncResponse, using await self._force()
The `AsyncConversation.prompt()` method will mirror the synchronous API. The internal loop handler (`_execute_prompt_internal_async`) will use `async for` and `await generator.asend(tool_result)`. It will use `asyncio.to_thread` for executing synchronous tool functions. If a tool function is `async def`, it will be awaited directly.
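A hedged sketch of that async harness, mirroring the synchronous loop sketched earlier and reusing the `_call_tool` dispatch helper from above (all names are illustrative):

async def _run_tool_loop_async(execute_gen, tools_by_name):
    # Same shape as the sync harness, but driven with asend().
    chunks = []
    try:
        item = await execute_gen.__anext__()
        while True:
            if isinstance(item, ToolCallRequest):
                func = tools_by_name.get(item.name)
                try:
                    if func is None:
                        raise KeyError(f"Unknown tool: {item.name}")
                    value = await _call_tool(func, item.arguments)
                    result = ToolResult(id=item.id, name=item.name, result=value)
                except Exception as ex:
                    result = ToolResult(id=item.id, name=item.name, error=str(ex))
                item = await execute_gen.asend(result)
            else:
                chunks.append(str(item))
                item = await execute_gen.__anext__()
    except StopAsyncIteration:
        pass
    return "".join(chunks)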
A new `--tool PATH` option will be added to `llm prompt` and `llm chat`. It can be specified multiple times.
# Use tools defined in my_tools.py
llm prompt "Weather in SF and 5th Fib number?" --tool my_tools.py
# Use tools from multiple files
llm prompt "Combine weather and fib" --tool weather.py --tool math.py
The specified Python file (`my_tools.py`) should contain one or more functions intended to be used as tools.

Discovery: `llm` will import the file and make all functions defined directly within that module available as tools, ignoring those starting with `_`. Docstrings and type hints will be used for schema generation.
1. The `llm prompt` (or `chat`) command parses the `--tool file.py` options.
2. For each file, it imports the module and inspects it to find candidate functions.
3. It generates the JSON Schema for each tool function.
4. It passes the list of functions (not just schemas) to the internal `conversation.prompt()` call via the `tools=` argument.
5. The Python API's internal loop handles execution as described in section 3.2.
6. The CLI prints the final text response. If tool execution occurred, potentially add a note to stderr or use a `--verbose` flag to show tool interactions.
- Capability Flag: Add `supports_tool_calling: bool = False` to `_BaseModel`. Plugins set this to `True`.
- Modified `execute()`: The generator-based approach described in section 3.2 (step 8) seems most robust. `execute` yields `str`, `Chunk`, or `ToolCallRequest`, and receives `ToolResult` via `.send()`.

# In llm/models.py
from typing import Iterator, Union, List, Dict, Any, Optional, AsyncGenerator, Tuple, TypeVar

@dataclass
class ToolCallRequest: ...  # As defined above

@dataclass
class ToolResult: ...  # As defined above

# Define types for generator input/output
ToolOutput = Union[str, Chunk, ToolCallRequest]
ToolInput = Optional[ToolResult]  # Can send None initially, then ToolResult

# Sync Model
class Model(ABC):
    # ...
    @abstractmethod
    def execute(
        self,
        prompt: Prompt,
        stream: bool,
        response: Response,
        conversation: Optional[Conversation],
    ) -> Iterator[ToolOutput]:  # Yields text/Chunk/ToolCallRequest
        # Implementation needs to handle receiving ToolResult via yield value
        tool_result: ToolInput = yield "Initial text or first tool call request"
        while isinstance(tool_result, ToolResult):
            # Process tool_result, call API again
            # ...
            # Yield more text or another ToolCallRequest
            next_output: ToolOutput = "More text or another tool call"
            tool_result = yield next_output  # Yield and wait for next .send()
        # Final processing if needed
        pass

# Async Model
class AsyncModel(ABC):
    # ...
    @abstractmethod
    async def execute(
        self,
        prompt: Prompt,
        stream: bool,
        response: AsyncResponse,
        conversation: Optional[AsyncConversation],
    ) -> AsyncGenerator[ToolOutput, ToolInput]:  # Async yield/send
        # Similar logic using 'yield' and '(yield ...)' for send
        tool_result: ToolInput = yield "Initial text or first tool call request"
        while isinstance(tool_result, ToolResult):
            # Process tool_result, call API again (await)
            # ...
            # Yield more text or another ToolCallRequest
            next_output: ToolOutput = "More text or another tool call"
            tool_result = yield next_output  # Yield and wait for next .asend()
        # Final processing if needed
        # Need 'yield' at least once in an async generator
        if False:
            yield  # pragma: no cover
- The `Prompt` object passed to `execute` will contain `prompt.tool_schemas: List[dict]`.
- Plugins implement the logic to (a hedged sketch follows this list):
  - Format `prompt.tool_schemas` for their specific API.
  - Send the prompt and schemas.
  - Parse the API response.
  - Yield text/`Chunk` for content, or a `ToolCallRequest` for tool calls (parsing name, args, id).
  - Handle the `ToolResult` sent back via the generator (received as the value of a `yield` expression: `value = yield ...`).
  - Format the `ToolResult` for their API and send it back.
  - Repeat until the API provides a final text response.
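To illustrate, here is a hedged sketch of what that loop might look like in an OpenAI-style plugin, using the openai v1 chat completions tool-calling shape; the surrounding `Model` plumbing is elided, and the exact request/response fields should be verified against the SDK rather than taken from this sketch:

import json
from openai import OpenAI

def execute(self, prompt, stream, response, conversation):
    client = OpenAI()
    messages = [{"role": "user", "content": prompt.prompt}]
    tools = [{"type": "function", "function": schema}
             for schema in (prompt.tool_schemas or [])]
    while True:
        kwargs = {"model": self.model_id, "messages": messages}
        if tools:
            kwargs["tools"] = tools
        completion = client.chat.completions.create(**kwargs)
        message = completion.choices[0].message
        if not message.tool_calls:
            yield message.content or ""  # Final text answer - loop is done
            return
        messages.append(message)  # Echo the assistant turn back into history
        for call in message.tool_calls:
            # Hand the request to the harness; .send() resumes us here
            tool_result = yield ToolCallRequest(
                id=call.id,
                name=call.function.name,
                arguments=json.loads(call.function.arguments),
            )
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(
                    {"error": tool_result.error} if tool_result.error
                    else tool_result.result),
            })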
- Add a `tool_calls` table:

CREATE TABLE [tool_calls] (
    [id] INTEGER PRIMARY KEY,  -- Auto-incrementing local ID for the call record
    [response_id] TEXT REFERENCES [responses]([id]),
    [request_id] TEXT,         -- ID from the API if available
    [tool_name] TEXT,
    [arguments] TEXT,          -- JSON
    [result] TEXT,             -- JSON
    [error] TEXT,
    [timestamp_utc] TEXT
);
CREATE INDEX [idx_tool_calls_response_id] ON [tool_calls] ([response_id]);

- Migration: A new migration (`m016_tool_calls.py`?) will be added to `llm/migrations.py` to create this table (a hedged sketch follows below).
- `Response.log_to_db()`: Update this method to iterate through `response.tool_calls` (or `response._tool_calls_internal`) and insert records into the new `tool_calls` table, linking them via `response_id`.
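A possible shape for that migration, assuming the `@migration` decorator pattern used in `llm/migrations.py` and sqlite-utils table helpers (the migration name and decorator usage are assumptions to verify against the real file):

# Inside llm/migrations.py, where the @migration decorator is defined
@migration
def m016_tool_calls(db):
    # db is a sqlite_utils Database instance
    db["tool_calls"].create(
        {
            "id": int,
            "response_id": str,
            "request_id": str,
            "tool_name": str,
            "arguments": str,   # JSON
            "result": str,      # JSON
            "error": str,
            "timestamp_utc": str,
        },
        pk="id",
        foreign_keys=[("response_id", "responses", "id")],
    )
    db["tool_calls"].create_index(["response_id"])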
- Model does not support tools: `conversation.prompt()` raises a `TypeError` if `tools=` is used with an incompatible model.
- Tool Not Found (CLI): The CLI harness fails during function lookup and raises a `click.ClickException`.
- Tool Not Found (Python API): The harness loop creates a `ToolResult(error=...)` and sends it to the model via the generator.
- Argument Validation Failure: Pydantic validation in the harness loop creates a `ToolResult(error=...)` and sends it to the model.
- Tool Execution Error: The harness loop catches the exception, creates a `ToolResult(error=...)` and sends it to the model.
- User Docs:
  - New section explaining the tool calling concept.
  - Python API: How to define tools, use the `tools=` parameter, interpret `response.tool_calls`.
  - CLI: How to use `--tool`, structure of tool files.
- Plugin Author Docs:
  - Explain the `supports_tool_calling` flag.
  - Detail the modified `execute()` signature and the `generator.send()` mechanism.
  - Define the `ToolCallRequest` and `ToolResult` dataclasses.
  - Provide examples for handling the tool loop within `execute()`.
This design provides a robust framework for tool calling, balancing user convenience with the flexibility needed for diverse model APIs, and integrating naturally into the existing `llm` structures.