Note that we added think=False this time (compare with the chat-only version).
A simple chat:
❯ python tool_call_min.py
> What's a lagomorph
(Sending context to Ollama API)
From the model's response:
>> Role: assistant
>> Thinking: None
>> Tool calls: None
>> Content: A lagomorph is a type of mammal, specifically a group of primates, that are known for their long, narrow noses and the ability to chew their teeth.
>
Get it to make a tool call:
❯ python tool_call_min.py
> can you ping fly.io?
(Sending context to Ollama API)
From the model's response:
>> Role: assistant
>> Thinking: None
>> Tool calls: [ToolCall(function=Function(name='ping', arguments={'host': 'fly.io'}))]
>> Content:
Local function output:
ping -c 3 fly.io
PING fly.io (37.16.18.81): 56 data bytes
64 bytes from 37.16.18.81: icmp_seq=0 ttl=55 time=17.132 ms
64 bytes from 37.16.18.81: icmp_seq=1 ttl=55 time=19.061 ms
64 bytes from 37.16.18.81: icmp_seq=2 ttl=55 time=18.845 ms
--- fly.io ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 17.132/18.346/19.061/0.863 ms
(Sending context to Ollama API)
From the model's response:
>> Role: assistant
>> Thinking: None
>> Tool calls: None
>> Content: Yes, I can ping Fly.io. The ping response confirms that Fly.io is reachable and responds with a round-trip time of approximately 18.346 milliseconds.
>
We started with one API call that sent the context, up to and including the user's prompt, to the model.
The model responded with a tool_call instead of content, so we:
- added that response message to the context
- ran the tool
- added a message with the tool role to the context (to let the model know the result of running the tool function), and
- fired the whole thing back to the API.
The model responded to that with just content, so we could display that and prompt the user for another round of input.
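The loop described above can be sketched roughly like this. This is a sketch, not the actual tool_call_min.py: it assumes the official ollama Python package, a locally pulled model (the model name "qwen3" here is a placeholder), and a ping helper matching the output in the transcript; the exact shape of the tool-result message (the 'tool_name' key) follows recent ollama-python examples and may differ across versions.

```python
import subprocess


def ping(host: str) -> str:
    """Run ping locally and return its output for the model to read."""
    cmd = ["ping", "-c", "3", host]
    print(" ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout + result.stderr


TOOLS = {"ping": ping}


def run_tool_call(call, tools=TOOLS) -> dict:
    """Run one tool call locally and wrap its output as a tool-role message."""
    fn = tools[call.function.name]
    output = fn(**call.function.arguments)
    print("Local function output:")
    print(output)
    # The 'tool_name' key follows recent ollama-python examples; older
    # versions may expect a slightly different message shape.
    return {"role": "tool", "tool_name": call.function.name, "content": output}


def main():
    # Imported here so the helpers above can be used without the package.
    from ollama import chat  # requires ollama-python and a running server

    messages = []
    while True:
        messages.append({"role": "user", "content": input("> ")})
        while True:
            print("(Sending context to Ollama API)")
            response = chat(model="qwen3", messages=messages,
                            tools=[ping], think=False)
            messages.append(response.message)  # keep the assistant turn
            if not response.message.tool_calls:
                # Plain content: display it and prompt the user again.
                print(response.message.content)
                break
            for call in response.message.tool_calls:
                messages.append(run_tool_call(call))

# Calling main() starts the REPL shown in the transcripts above.
```

Note that the inner loop keeps round-tripping to the API until the model answers with content instead of tool calls, which is exactly the two-call sequence in the ping transcript.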