simonw/exa.md

Created May 29, 2025 03:56
2025-05-29T03:53:34 conversation: 01jwd2s1fzc7ejf4f1ntjh89az id: 01jwd2s1g533eyegxz9zgda1kt

Model: gpt-4.1-mini

Prompt

What's in LLM 0.26?

Tools

  • web_search: 77c178b811f25dab6e0937ad6e7f8fe9bc572eedf5218e689aad50d6b71eb23e
    Search the web using Exa API for high-quality, relevant results.

      Args:
          query: The search query string to find relevant content.
          num_results: Number of search results to return (default: 3).
          category: Optional content category to filter results by specific type.
          include_domains: Optional list of domains to limit search results to.

    Arguments: {"query": {"type": "string"}, "num_results": {"default": 3, "type": "integer"}, "category": {"anyOf": [{"enum": ["company", "research paper", "news", "pdf", "github", "tweet", "personal site", "linkedin profile", "financial report"], "type": "string"}, {"type": "null"}], "default": null}, "include_domains": {"anyOf": [{"items": {"type": "string"}, "type": "array"}, {"type": "null"}], "default": null}}

Response

Tool calls

  • web_search: call_Zm4JciiQETcTwjKzHSaAMVQ8
    Arguments: {"query": "LLM 0.26 release notes"}

Token usage

188 input, 21 output

2025-05-29T03:53:38

Prompt

-- none --

Tools

  • web_search: 77c178b811f25dab6e0937ad6e7f8fe9bc572eedf5218e689aad50d6b71eb23e
    Search the web using Exa API for high-quality, relevant results.

      Args:
          query: The search query string to find relevant content.
          num_results: Number of search results to return (default: 3).
          category: Optional content category to filter results by specific type.
          include_domains: Optional list of domains to limit search results to.

    Arguments: {"query": {"type": "string"}, "num_results": {"default": 3, "type": "integer"}, "category": {"anyOf": [{"enum": ["company", "research paper", "news", "pdf", "github", "tweet", "personal site", "linkedin profile", "financial report"], "type": "string"}, {"type": "null"}], "default": null}, "include_domains": {"anyOf": [{"items": {"type": "string"}, "type": "array"}, {"type": "null"}], "default": null}}

Tool results

  • web_search: call_Zm4JciiQETcTwjKzHSaAMVQ8
    Title: Large Language Models can run tools in your terminal with LLM 0.26 Author: Simon Willison URL: https://simonwillison.net/2025/May/27/llm-tools/ Published: 2025-05-27T00:00:00.000Z Highlights:

    • I first saw the trick described in the ReAct paper, first published in October 2022 (a month before the initial release of ChatGPT). I built a simple implementation of that in a few dozen lines of Python. It was clearly a very neat pattern! Over the past few years it has become very apparent that tool use is the single most effective way to extend the abilities of language models. It’s such a simple trick: you tell the model that there are tools it can use, and have it output special syntax (JSON or XML or tool_name(arguments), it doesn’t matter which) requesting a tool action, then stop.

    Text: 27th May 2025 LLM 0.26 is out with the biggest new feature since I started the project: support for tools. You can now use the LLM CLI tool and Python library to grant LLMs from OpenAI, Anthropic, Gemini and local models from Ollama access to any tool that you can represent as a Python function. LLM also now has tool plugins, so you can install a plugin that adds new capabilities to whatever model you are currently using. There’s a lot to cover here, but here are the highlights:

    LLM can run tools now! You can install tools from plugins and load them by name with --tool/-T name_of_tool. You can also pass in Python function code on the command-line with the --functions option. The Python API supports tools too: llm.get_model("gpt-4.1").chain("show me the locals", tools=[locals]).text()

    Tools work in both async and sync contexts.

    Here’s what’s covered in this post:

    Trying it out More interesting tools from plugins Ad-hoc command-line tools with --functions Tools in the LLM Python API Why did this take me so long? Is this agents then? What’s next for tools in LLM?

    Trying it out

    First, install the latest LLM. It may not be on Homebrew yet so I suggest using pip or pipx or uv. If you have it already, upgrade it. Tools work with other vendors, but let’s stick with OpenAI for the moment. Give LLM an OpenAI API key:

        llm keys set openai

    Paste key here

    Now let’s run our first tool:

        llm --tool llm_version "What version?" --td

    Here’s what I get:

    llm_version is a very simple demo tool that ships with LLM. Running --tool llm_version exposes that tool to the model—you can specify that multiple times to enable multiple tools, and it has a shorter version of -T to save on typing. The --td option stands for --tools-debug—it causes LLM to output information about tool calls and their responses so you can peek behind the scenes. This is using the default LLM model, which is usually gpt-4o-mini. I switched it to gpt-4.1-mini (better but fractionally more expensive) by running:

        llm models default gpt-4.1-mini

    You can try other models using the -m option. Here’s how to run a similar demo of the llm_time built-in tool using o4-mini:

        llm --tool llm_time "What time is it?" --td -m o4-mini

    Outputs:

        Tool call: llm_time({})
        {
          "utc_time": "2025-05-27 19:15:55 UTC",
          "utc_time_iso": "2025-05-27T19:15:55.288632+00:00",
          "local_timezone": "PDT",
          "local_time": "2025-05-27 12:15:55",
          "timezone_offset": "UTC-7:00",
          "is_dst": true
        }

    The current time is 12:15 PM PDT (UTC−7:00) on May 27, 2025, which corresponds to 7:15 PM UTC.
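
    The flags shown above compose, so a hypothetical invocation that exposes both built-in demo tools at once and picks the model explicitly would look like this:

        llm -T llm_version -T llm_time "What version is this, and what time is it?" --td -m gpt-4.1-mini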

    Models from (tool supporting) plugins work too. Anthropic’s Claude Sonnet 4:

        llm install llm-anthropic -U
        llm keys set anthropic

    Paste Anthropic key here

        llm --tool llm_version "What version?" --td -m claude-4-sonnet

    Or Google’s Gemini 2.5 Flash:

        llm install llm-gemini -U
        llm keys set gemini

    Paste Gemini key here

        llm --tool llm_version "What version?" --td -m gemini-2.5-flash-preview-05-20

    You can even run simple tools with Qwen3:4b, a tiny (2.6GB) model that I run using Ollama:

        ollama pull qwen3:4b
        llm install 'llm-ollama>=0.11a0'
        llm --tool llm_version "What version?" --td -m qwen3:4b

    Qwen 3 calls the tool, thinks about it a bit and then prints out a response:

    More interesting tools from plugins

    This demo has been pretty weak so far. Let’s do something a whole lot more interesting. LLMs are notoriously bad at mathematics. This is deeply surprising to many people: supposedly the most sophisticated computer systems we’ve ever built can’t multiply two large numbers together? We can fix that with tools. The llm-tools-simpleeval plugin exposes the simpleeval “Simple Safe Sandboxed Extensible Expression Evaluator for Python” library by Daniel Fairhead. This provides a robust-enough sandbox for executing simple Python expressions. Here’s how to run a calculation:

        llm install llm-tools-simpleeval
        llm -T simpleeval

    Trying that out:

        llm -T simple_eval 'Calculate 1234 * 4346 / 32414 and square root it' --td

    I got back this—it tried sqrt() first, then when that didn’t work switched to ** 0.5 instead:

        Tool call: simple_eval({'expression': '1234 * 4346 / 32414'})
        165.45208860368976
        Tool call: simple_eval({'expression': 'sqrt(1234 * 4346 / 32414)'})
        Error: Function 'sqrt' not defined, for expression 'sqrt(1234 * 4346 / 32414)'.
        Tool call: simple_eval({'expression': '(1234 * 4346 / 32414) ** 0.5'})
        12.862818066181678

    The result of (1234 * 4346 / 32414) is approximately 165.45, and the square root of this value is approximately 12.86.

    I’ve released four tool plugins so far:

    llm-tools-simpleeval —as shown above, simple expression support for things like mathematics.

    llm-tools-quickjs —provides access to a sandboxed QuickJS JavaScript interpreter, allowing LLMs to run JavaScript code. The environment persists between calls so the model can set variables and build functions and reuse them later on.

    llm-tools-sqlite —read-only SQL query access to a local SQLite database.

    llm-tools-datasette —run SQL queries against a remote Datasette instance!

    Let’s try that Datasette one now:

        llm install llm-tools-datasette
        llm -T 'Datasette("https://datasette.io/content")' --td "What has the most stars?"

    The syntax here is slightly different: the Datasette plugin is what I’m calling a “toolbox”—a plugin that has multiple tools inside it and can be configured with a constructor. Specifying --tool as Datasette("https://datasette.io/content") provides the plugin with the URL to the Datasette instance it should use—in this case the content database that powers the Datasette website. Here’s the output, with the schema section truncated for brevity:

    This question triggered three calls. The model started by guessing the query! It tried SELECT name, stars FROM repos ORDER BY stars DESC LIMIT 1, which failed because the stars column doesn’t exist. The tool call returned an error, so the model had another go—this time calling the Datasette_schema() tool to get the schema of the database. Based on that schema it assembled and then executed the correct query, and output its interpretation of the result:

    The repository with the most stars is “datasette” with 10,020 stars.
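
    The post later notes that the same toolbox form works from the Python API by passing tools=[Datasette("https://datasette.io/content")] to chain(). A minimal sketch of that, assuming the plugin exposes its toolbox class as llm_tools_datasette.Datasette (the import path is a guess):

        import llm
        from llm_tools_datasette import Datasette  # assumed import path

        model = llm.get_model("gpt-4.1-mini")
        response = model.chain(
            "What has the most stars?",
            tools=[Datasette("https://datasette.io/content")],  # toolbox configured via its constructor
        )
        print(response.text())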

    Getting to this point was a real Penny Arcade Minecraft moment for me. The possibilities here are limitless. If you can write a Python function for it, you can trigger it from an LLM.

    Ad-hoc command-line tools with --functions

    I’m looking forward to people building more plugins, but there’s also a much less structured and more ad-hoc way to use tools with the LLM CLI tool: the --functions option. This was inspired by a similar feature I added to sqlite-utils a while ago. You can pass a block of literal Python code directly to the CLI tool using the --functions option, and any functions defined there will be made available to the model as tools. Here’s an example that adds the ability to search my blog:

        llm --functions '
        import httpx

        def search_blog(q):
            "Search Simon Willison blog"
            return httpx.get("https://simonwillison.net/search/", params={"q": q}).content
        ' --td 'Three features of sqlite-utils' -s 'use Simon search'

    This is such a hack of an implementation! I’m literally just hitting my search page and dumping the HTML straight back into the model. It totally works though—it helps that the GPT-4.1 series all handle a million tokens now, so crufty HTML is no longer a problem for them. (I had to add “use Simon search” as the system prompt because without it the model would try to answer the question itself, rather than using the search tool I provided. System prompts for tools are clearly a big topic: Anthropic’s own web search tool has 6,471 tokens of instructions!) Here’s the output I got just now:

    Three features of sqlite-utils are:

    It is a combined CLI tool and Python library for manipulating SQLite databases. It can automatically add columns to a database table if you attempt to insert data that doesn’t quite fit (using the alter=True option). It supports plugins, allowing the extension of its functionality through third-party or custom plugins.

    A better search tool would have more detailed instructions and would return relevant snippets of the results, not just the headline and first paragraph for each result. This is pretty great for just four lines of Python though!

    Tools in the LLM Python API

    LLM is both a CLI tool and a Python library at the same time (similar to my other project sqlite-utils). The LLM Python library grew tool support in LLM 0.26 as well. Here’s a simple example solving one of the previously hardest problems in LLMs: counting the number of Rs in “strawberry”:

        import llm

        def count_char_in_text(char: str, text: str) -> int:
            "How many times does char appear in text?"
            return text.count(char)

        model = llm.get_model("gpt-4.1-mini")
        chain_response = model.chain(
            "Rs in strawberry?",
            tools=[count_char_in_text],
            after_call=print,
        )
        for chunk in chain_response:
            print(chunk, end="", flush=True)

    The after_call=print argument is a way to peek at the tool calls, the Python equivalent of the --td option from earlier. The model.chain() method is new: it’s similar to model.prompt() but knows how to spot returned tool call requests, execute them and then prompt the model again with the results. A model.chain() could potentially execute dozens of responses on the way to giving you a final answer. You can iterate over the chain_response to output those tokens as they are returned by the model, even across multiple responses. I got back this:

        Tool(name='count_char_in_text', description='How many times does char appear in text?', input_schema={'properties': {'char': {'type': 'string'}, 'text': {'type': 'string'}}, 'required': ['char', 'text'], 'type': 'object'}, implementation=<function count_char_in_text at 0x109dd4f40>, plugin=None)
        ToolCall(name='count_char_in_text', arguments={'char': 'r', 'text': 'strawberry'}, tool_call_id='call_DGXcM8b2B26KsbdMyC1uhGUu')
        ToolResult(name='count_char_in_text', output='3', tool_call_id='call_DGXcM8b2B26KsbdMyC1uhGUu', instance=None, exception=None)

    There are 3 letter “r”s in the word “strawberry”.
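
    The post goes on to note that tools can also be async def functions, run concurrently with asyncio.gather() when a model requests several at once. A minimal sketch of how that might look, assuming async models obtained via llm.get_async_model() expose the same chain() interface and that the chain response can be iterated with async for (both assumptions, not confirmed here):

        import asyncio
        import httpx
        import llm

        async def fetch_page(url: str) -> str:
            "Fetch a URL and return its text."
            async with httpx.AsyncClient() as client:
                response = await client.get(url)
                return response.text

        async def main():
            # get_async_model() is documented; chain() on an async model is assumed
            model = llm.get_async_model("gpt-4.1-mini")
            chain_response = model.chain("Summarize https://www.example.com/", tools=[fetch_page])
            async for chunk in chain_response:  # assumed async iteration
                print(chunk, end="", flush=True)

        asyncio.run(main())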

    LLM’s Python library also supports asyncio, and tools can be async def functions as described here. If a model requests multiple async tools at once the library will run them concurrently with asyncio.gather(). The Toolbox form of tools is supported too: you can pass tools=[Datasette("https://datasette.io/content")] to that chain() method to achieve the same effect as the --tool 'Datasette(...)' option from earlier.

    Why did this take me so long?

    I’ve been tracking llm-tool-use for a while. I first saw the trick described in the ReAct paper, first published in October 2022 (a month before the initial release of ChatGPT). I built a simple implementation of that in a few dozen lines of Python. It was clearly a very neat pattern! Over the past few years it has become very apparent that tool use is the single most effective way to extend the abilities of language models. It’s such a simple trick: you tell the model that there are tools it can use, and have it output special syntax (JSON or XML or tool_name(arguments), it doesn’t matter which) requesting a tool action, then stop. Your code parses that output, runs the requested tools and then starts a new prompt to the model with the results.

    This works with almost every model now. Most of them are specifically trained for tool usage, and there are leaderboards like the Berkeley Function-Calling Leaderboard dedicated to tracking which models do the best job of it. All of the big model vendors—OpenAI, Anthropic, Google, Mistral, Meta—have a version of this baked into their API, either called tool usage or function calling. It’s all the same underlying pattern. The models you can run locally are getting good at this too. Ollama added tool support last year, and it’s baked into the llama.cpp server as well.

    It’s been clear for a while that LLM absolutely needed to grow support for tools. I released LLM schema support back in February as a stepping stone towards this. I’m glad to finally have it over the line. As always with LLM, the challenge was designing an abstraction layer that could work across as many different models as possible. A year ago I didn’t feel that model tool support was mature enough to figure this out. Today there’s a very definite consensus among vendors about how this should work, which finally gave me the confidence to implement it. I also presented a workshop at PyCon US two weeks ago about Building software on top of Large Language Models, which was exactly the incentive I needed to finally get this working in an alpha! Here’s the tools section from that tutorial.

    Is this agents then?

    Sigh. I still don’t like using the term “agents”. I worry that developers will think tools in a loop, regular people will think virtual AI assistants voiced by Scarlett Johansson and academics will grumble about thermostats. But in the LLM world we appear to be converging on “tools in a loop”, and that’s absolutely what this is. So yes, if you want to build “agents” then LLM 0.26 is a great way to do that.

    What’s next for tools in LLM?

    I already have an LLM tools v2 milestone with 13 issues in it, mainly around improvements to how tool execution logs are displayed but with quite a few minor issues I decided shouldn’t block this release. There’s a bunch more stuff in the tools label. I’m most excited about the potential for plugins. Writing tool plugins is really fun. I have an llm-plugin-tools cookiecutter template that I’ve been using for my own, and I plan to put together a tutorial around that soon.

    There’s more work to be done adding tool support to more model plugins. I added details of this to the advanced plugins documentation. This commit adding tool support for Gemini is a useful illustration of what’s involved. And yes, Model Context Protocol support is clearly on the agenda as well. MCP is emerging as the standard way for models to access tools at a frankly bewildering speed. Two weeks ago it wasn’t directly supported by the APIs of any of the major vendors. In just the past eight days it’s been added by OpenAI, Anthropic and Mistral! It’s feeling like a lot less of a moving target today. I want LLM to be able to act as an MCP client, so that any of the MCP servers people are writing can be easily accessed as additional sources of tools for LLM.

    If you’re interested in talking more about what comes next for LLM, come and chat to us in our Discord.
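
    The “tools in a loop” pattern described above is vendor-neutral. A schematic sketch of the loop (not LLM’s actual implementation; call_model() and run_tool() are placeholders for whatever API client and tool registry you use):

        def tools_in_a_loop(prompt, tools):
            # Schematic pseudocode for the pattern: prompt the model, run any
            # requested tools, feed the results back in, repeat until it answers.
            messages = [{"role": "user", "content": prompt}]
            while True:
                response = call_model(messages, tools)   # placeholder: model may answer or request tools
                if not response.tool_calls:
                    return response.text                 # final answer, loop ends
                for call in response.tool_calls:
                    result = run_tool(tools, call.name, call.arguments)  # placeholder tool dispatch
                    messages.append({"role": "tool", "name": call.name, "content": result})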


    Title: Changelog - LLM Author: URL: https://llm.datasette.io/en/stable/changelog.html Published: 2016-05-03T00:00:00.000Z Highlights:

    Back to top

    Toggle Light / Dark / Auto color theme

    Toggle table of contents sidebar

    Changelog #

    0.21 (2025-01-31) #

    • New model: o3-mini. #728

    • The o3-mini and o1 models now support a reasoning_effort option which can be set to low, medium or high.

    • llm prompt and llm logs now have a --xl/--extract-last option for extracting the last fenced code block in the response - a complement to the existing --x/--extract option. #717

    0.20 (2025-01-22) #

    • New model, o1. This model does not yet support streaming. #676

    • o1-preview and o1-mini models now support streaming.

    • New models, gpt-4o-audio-preview and gpt-4o-mini-audio-preview. #677

    • llm prompt -x/--extract option, which returns just the content of the first fenced code block in the response. Try llm prompt -x 'Python function to reverse a string'. #681

    • Creating a template using llm ... --save x now supports the -x/--extract option, which is saved to the template. YAML templates can set this option using extract: true.

    • New llm logs -x/--extract option extracts the first fenced code block from matching logged responses.

    • New llm models -q 'search' option returning models that case-insensitively match the search query. #700

    • Installation documentation now also includes uv. Thanks, Ariel Marcus. #690 and #702

    • llm models command now shows the current default model at the bottom of the listing. Thanks, Amjith Ramanujam. #688

    • Plugin directory now includes llm-venice, llm-bedrock, llm-deepseek and llm-cmd-comp.

    • Fixed bug where some dependency version combinations could cause a Client.__init__() got an unexpected keyword argument 'proxies' error. #709

    • OpenAI embedding models are now available using their full names of text-embedding-ada-002, text-embedding-3-small and text-embedding-3-large - the previous names are still supported as aliases. Thanks, web-sst. #654

    0.19.1 (2024-12-05) #

    • Fixed bug where llm.get_models() and llm.get_async_models() returned the same model multiple times. #667

    0.19 (2024-12-01) #

    • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610

    • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.

    • llm logs -u/--usage shows token usage information for logged responses.

    • llm prompt ... --async responses are now logged to the database. #641

    • llm.get_models() and llm.get_async_models() functions, documented here. #640

    • response.usage() and async response await response.usage() methods, returning a Usage(input=2, output=1, details=None) dataclass. #644

    • response.on_done(callback) and await response.on_done(callback) methods for specifying a callback to be executed when a response has completed, documented here. #653

    • Fix for bug running llm chat on Windows 11. Thanks, Sukhbinder Singh. #495
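
    As a minimal sketch of the usage API listed above (the model name is just an example):

        import llm

        model = llm.get_model("gpt-4o-mini")
        response = model.prompt("Say hi")
        print(response.text())
        print(response.usage())  # e.g. Usage(input=..., output=..., details=None)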

    0.19a2 (2024-11-20) #

    0.19a1 (2024-11-19) #

    • response.usage() and async response await response.usage() methods, returning a Usage(input=2, output=1, details=None) dataclass. #644

    0.19a0 (2024-11-19) #

    • Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610

    • llm prompt now takes a -u/--usage flag to display token usage at the end of the response.

    • llm logs -u/--usage shows token usage information for logged responses.

    • llm prompt ... --async responses are now logged to the database. #641

    0.18 (2024-11-17) #

    • Initial support for async models. Plugins can now provide an AsyncModel subclass that can be accessed in the Python API using the new llm.get_async_model(model_id) method. See async models in the Python API docs and implementing async models in plugins. #507

    • OpenAI models all now include async models, so function calls such as llm.get_async_model("gpt-4o-mini") will return an async model.

    • gpt-4o-audio-preview model can be used to send audio attachments to the GPT-4o audio model. #608

    • Attachments can now be sent without requiring a prompt. #611

    • llm models --options now includes information on whether a model supports attachments. #612

    • llm models --async shows available async models.

    • Custom OpenAI-compatible models can now be marked as can_stream: false in the YAML if they do not support streaming. Thanks, Chris Mungall. #600

    • Fixed bug where OpenAI usage data was incorrectly serialized to JSON. #614

    • Standardized on audio/wav MIME type for audio attachments rather than audio/wave. #603

    0.18a1 (2024-11-14) #

    • Fixed bug where conversations did not work for async OpenAI models. #632

    • __repr__ methods for Response and AsyncResponse.

    0.18a0 (2024-11-13) #

    Alpha support for async models. #507

    Multiple smaller changes.

    0.17 (2024-10-29) #

    Support for attachments, allowing multi-modal models to accept images, audio, video and other formats. #578

    The default OpenAI gpt-4o and gpt-4o-mini models can both now be prompted with JPEG, GIF, PNG and WEBP images.

    Attachments in the CLI can be URLs:

    llm-mgpt-4o"describe this image"\
    -ahttps://static.simonwillison.net/static/2024/pelicans.jpg
    
    

    Or file paths:

    llm-mgpt-4o-mini"extract text"-aimage1.jpg-aimage2.jpg
    
    

    Or binary data, which may need to use --attachment-type to specify the MIME type:

    cat image | llm -m gpt-4o-mini "extract text" --attachment-type - image/jpeg
    
    

    Attachments are also available in the Python API:

    model = llm.get_model("gpt-4o-mini")
    response = model.prompt(
     "Describe these images",
     attachments=[
     llm.Attachment(path="pelican.jpg"),
     llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
     ]
    )
    
    

    Plugins that provide alternative models can support attachments, see Attachments for multi-modal models for details.

    The latest llm-claude-3 plugin now supports attachments for Anthropic’s Claude 3 and 3.5 models. The llm-gemini plugin supports attachments for Google’s Gemini 1.5 models.

    Also in this release: OpenAI models now record their "usage" data in the database even when the response was streamed. These records can be viewed using llm logs --json. #591

    0.17a0 (2024-10-28) #

    Alpha support for attachments. #578

    0.16 (2024-09-12) #

    • OpenAI models now use the internal self.get_key() mechanism, which means they can be used from Python code in a way that will pick up keys that have been configured using llm keys set or the OPENAI_API_KEY environment variable. #552. This code now works correctly:
    import llm
    print(llm.get_model("gpt-4o-mini").prompt("hi"))
    
    
    • New documented API methods: llm.get_default_model(), llm.set_default_model(alias), llm.get_default_embedding_model(alias), llm.set_default_embedding_model(). #553

    • Support for OpenAI’s new o1 family of preview models, llm -m o1-preview "prompt" and llm -m o1-mini "prompt". These models are currently only available to tier 5 OpenAI API users, though this may change in the future. #570

    0.15 (2024-07-18) #

    • Support for OpenAI’s new GPT-4o mini model: llm -m gpt-4o-mini 'rave about pelicans in French' #536

    • gpt-4o-mini is now the default model if you do not specify your own default, replacing GPT-3.5 Turbo. GPT-4o mini is both cheaper and better than GPT-3.5 Turbo.

    • Fixed a bug where llm logs -q 'flourish' -m haiku could not combine both the -q search query and the -m model specifier. #515

    0.14 (2024-05-13) #

    • Support for OpenAI’s new GPT-4o model: llm -m gpt-4o 'say hi in Spanish' #490

    • The gpt-4-turbo alias is now a model ID, which indicates the latest version of OpenAI’s GPT-4 Turbo text and image model. Your existing logs.db database may contain records under the previous model ID of gpt-4-turbo-preview. #493

    • New llm logs -r/--response option for outputting just the last captured response, without wrapping it in Markdown and accompanying it with the prompt. #431

    • Nine new plugins since version 0.13:

    • llm-claude-3 supporting Anthropic’s Claude 3 family of models.

    • llm-command-r supporting Cohere’s Command R and Command R Plus API models.

    • llm-reka supports the Reka family of models via their API.

    • llm-perplexity by Alexandru Geana supporting the Perplexity Labs API models, including llama-3-sonar-large-32k-online which can search for things online and llama-3-70b-instruct.

    • llm-groq by Moritz Angermann providing access to fast models hosted by Groq.

    • llm-fireworks supporting models hosted by Fireworks AI.

    • llm-together adds support for the Together AI extensive family of hosted openly licensed models.

    • llm-embed-onnx provides seven embedding models that can be executed using the ONNX model framework.

    • llm-cmd accepts a prompt for a shell command, runs that prompt and populates the result in your shell so you can review it, edit it and then hit enter to execute or ctrl+c to cancel, see this post for details.

    0.13.1 (2024-01-26) #

    • Fix for No module named 'readline' error on Windows. #407

    0.13 (2024-01-26) #

    See also LLM 0.13: The annotated release notes.

    • Added support for new OpenAI embedding models: 3-small and 3-large and three variants of those with different dimension sizes, 3-small-512, 3-large-256 and 3-large-1024. See OpenAI embedding models for details. #394

    • The default gpt-4-turbo model alias now points to gpt-4-turbo-preview, which uses the most recent OpenAI GPT-4 turbo model (currently gpt-4-0125-preview). #396

    • New OpenAI model aliases gpt-4-1106-preview and gpt-4-0125-preview.

    • OpenAI models now support a -o json_object 1 option which will cause their output to be returned as a valid JSON object. #373

    • New plugins since the last release include llm-mistral, llm-gemini, llm-ollama and llm-bedrock-meta.

    • The keys.json file for storing API keys is now created with 600 file permissions. #351

    • Documented a pattern for installing plugins that depend on PyTorch using the Homebrew version of LLM, despite Homebrew using Python 3.12 when PyTorch have not yet released a stable package for that Python version. #397

    • Underlying OpenAI Python library has been upgraded to >1.0. It is possible this could cause compatibility issues with LLM plugins that also depend on that library. #325

    • Arrow keys now work inside the llm chat command. #376

    • LLM_OPENAI_SHOW_RESPONSES=1 environment variable now outputs much more detailed information about the HTTP request and response made to OpenAI (and OpenAI-compatible) APIs. #404

    • Dropped support for Python 3.7.

    0.12 (2023-11-06) #

    • Support for the new GPT-4 Turbo model from OpenAI. Try it using llm chat -m gpt-4-turbo or llm chat -m 4t. #323

    • New -o seed 1 option for OpenAI models which sets a seed that can attempt to evaluate the prompt deterministically. #324

    0.11.2 (2023-11-06) #

    • Pin to version of OpenAI Python library prior to 1.0 to avoid breaking. #327

    0.11.1 (2023-10-31) #

    0.11 (2023-09-18) #

    LLM now supports the new OpenAI gpt-3.5-turbo-instruct model, and OpenAI completion (as opposed to chat completion) models in general. #284

    llm -m gpt-3.5-turbo-instruct 'Reasons to tame a wild beaver:'
    
    

    OpenAI completion models like this support a -o logprobs 3 option, which accepts a number between 1 and 5 and will include the log probabilities (for each produced token, what were the top 3 options considered by the model) in the logged response.

    llm -m gpt-3.5-turbo-instruct 'Say hello succinctly' -o logprobs 3
    
    

    You can then view the logprobs that were recorded in the SQLite logs database like this:

    sqlite-utils"$(llmlogspath)"\
    'select * from responses order by id desc limit 1'|\
    jq'.[0].response_json'-r|jq
    
    

    Truncated output looks like this:

     [
     {
     "text": "Hi",
     "top_logprobs": [
     {
     "Hi": -0.13706253,
     "Hello": -2.3714375,
     "Hey": -3.3714373
     }
     ]
     },
     {
     "text": " there",
     "top_logprobs": [
     {
     " there": -0.96057636,
     "!\"": -0.5855763,
     ".\"": -3.2574513
     }
     ]
     }
     ]
    
    

    Also in this release:

    • The llm.user_dir() function, used by plugins, now ensures the directory exists before returning it. #275

    • New LLM_OPENAI_SHOW_RESPONSES=1 environment variable for displaying the full HTTP response returned by OpenAI compatible APIs. #286

    • The llm embed-multi command now has a --batch-size X option for setting the batch size to use when processing embeddings - useful if you have limited memory available. #273

    • The collection.embed_multi() method also now accepts an optional batch_size=int argument.

    • Fixed two bugs with llm embed-multi --files relating to handling of directories. Thanks, ealvar3z. #274, #280

    0.10 (2023-09-12) #

    The two major features in this release are the llm chat command and support for embedding binary data.

    See Build an image search engine with llm-clip, chat with models with llm chat for more background on these features.

    llm chat #

    The new llm chat command starts an ongoing chat conversation with a model in your terminal. It works with all models supported by LLM and its plugins, including locally installed models such as Llama 2. #231

    This offers a big performance boost for local models, since they don’t need to be freshly loaded into memory for each prompt.

    Here’s an example chat with Llama 2 13B, provided by the llm-mlc plugin.

    llm chat -m mlc-chat-Llama-2-13b-chat-hf-q4f16_1
    
    

    This starts a chat session:

    Type 'exit' or 'quit' to exit
    Type '!multi' to enter multiple lines, then '!end' to finish
    > Who are you?
    Hello! I'm just an AI, here to assist you with any questions you may have.
    My name is LLaMA, and I'm a large language model trained to provide helpful
    and informative responses to a wide range of questions and topics. I'm here
    to help you find the information you need, while ensuring a safe and
    respectful conversation. Please feel free to ask me anything, and I'll do my
    best to provide a helpful and accurate response.
    > Tell me a joke about otters
    Sure, here's a joke about otters:
    
    Why did the otter go to the party?
    
    Because he heard it was a "whale" of a time!
    
    (Get it? Whale, like a big sea mammal, but also a "wild" or "fun" time.
    Otters are known for their playful and social nature, so it's a lighthearted
    and silly joke.)
    
    I hope that brought a smile to your face! Do you have any other questions or
    topics you'd like to discuss?
    > exit
    
    

    Chat sessions are logged to SQLite - use llm logs to view them. They can accept system prompts, templates and model options - consult the chat documentation for details.

    Binary embedding support #

    LLM’s embeddings feature has been expanded to provide support for embedding binary data, in addition to text. #254

    This enables models like CLIP, supported by the new llm-clip plugin.

    CLIP is a multi-modal embedding model which can embed images and text into the same vector space. This means you can use it to create an embedding index of photos, and then search for the embedding vector for “a happy dog” and get back images that are semantically closest to that string.

    To create embeddings for every JPEG in a directory stored in a photos collection, run:

    llm install llm-clip
    llm embed-multi photos --files photos/ '*.jpg' --binary -m clip
    
    

    Now you can search for photos of raccoons using:

    llm similar photos -c 'raccoon'
    
    

    This spits out a list of images, ranked by how similar they are to the string “raccoon”:

    {"id": "IMG_4801.jpeg", "score": 0.28125139257127457, "content": null, "metadata": null}
    {"id": "IMG_4656.jpeg", "score": 0.26626441704164294, "content": null, "metadata": null}
    {"id": "IMG_2944.jpeg", "score": 0.2647445926996852, "content": null, "metadata": null}
    ...
    
    

    Also in this release #

    • The LLM_LOAD_PLUGINS environment variable can be used to control which plugins are loaded when llm starts running. #256

    • The llm plugins --all option includes builtin plugins in the list of plugins. #259

    • The llm embed-db family of commands has been renamed to llm collections. #229

    • llm embed-multi --files now has an --encoding option and defaults to falling back to latin-1 if a file cannot be processed as utf-8. #225

    0.10a1 (2023-09-11) #

    • Support for embedding binary data. #254

    • llm chat now works for models with API keys. #247

    • llm chat -o for passing options to a model. #244

    • llm chat --no-stream option. #248

    • LLM_LOAD_PLUGINS environment variable. #256

    • llm plugins --all option for including builtin plugins. #259

    • llm embed-db has been renamed to llm collections. #229

    • Fixed bug where llm embed -c option was treated as a filepath, not a string. Thanks, mhalle. #263

    0.10a0 (2023-09-04) #

    • New llm chat command for starting an interactive terminal chat with a model. #231

    • llm embed-multi --files now has an --encoding option and defaults to falling back to latin-1 if a file cannot be processed as utf-8. #225

    0.9 (2023-09-03) #

    The big new feature in this release is support for embeddings. See LLM now provides tools for working with embeddings for additional details.

    Embedding models take a piece of text - a word, sentence, paragraph or even a whole article, and convert that into an array of floating point numbers. #185

    This embedding vector can be thought of as representing a position in many-dimensional-space, where the distance between two vectors represents how semantically similar they are to each other within the content of a language model.

    Embeddings can be used to find related documents, and also to implement semantic search - where a user can search for a phrase and get back results that are semantically similar to that phrase even if they do not share any exact keywords.

    LLM now provides both CLI and Python APIs for working with embeddings. Embedding models are defined by plugins, so you can install additional models using the plugins mechanism.

    The first two embedding models supported by LLM are:

    See Embedding with the CLI for detailed instructions on working with embeddings using LLM.

    The new commands for working with embeddings are:

    • llm embed - calculate embeddings for content and return them to the console or store them in a SQLite database.

    • llm embed-multi - run bulk embeddings for multiple strings, using input from a CSV, TSV or JSON file, data from a SQLite database or data found by scanning the filesystem. #215

    • llm similar - run similarity searches against your stored embeddings - starting with a search phrase or finding content related to a previously stored vector. #190

    • llm embed-models - list available embedding models.

    • llm embed-db - commands for inspecting and working with the default embeddings SQLite database.

    There’s also a new llm.Collection class for creating and searching collections of embeddings from Python code, and a llm.get_embedding_model() interface for embedding strings directly. #191
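
    A minimal sketch of the CLI workflow these commands enable (the collection name, item IDs and the ada-002 model alias are illustrative, not taken from the changelog):

        llm embed quotes q1 -m ada-002 -c 'There are only two hard things in computer science'
        llm embed quotes q2 -m ada-002 -c 'A cheesecake wrote an epic saga about SQL'
        llm similar quotes -c 'difficult problems in programming'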

    0.8.1 (2023-08-31) #

    • Fixed bug where first prompt would show an error if the io.datasette.llm directory had not yet been created. #193

    • Updated documentation to recommend a different llm-gpt4all model since the one we were using is no longer available. #195

    0.8 (2023-08-20) #

    • The output format for llm logs has changed. Previously it was JSON - it’s now a much more readable Markdown format suitable for pasting into other documents. #160

    • The new llm logs --json option can be used to get the old JSON format.

    • Pass llm logs --conversation ID or --cid ID to see the full logs for a specific conversation.

    • You can now combine piped input and a prompt in a single command: cat script.py | llm 'explain this code'. This works even for models that do not support system prompts. #153

    • Additional OpenAI-compatible models can now be configured with custom HTTP headers. This enables platforms such as openrouter.ai to be used with LLM, which can provide Claude access even without an Anthropic API key.

    • Keys set in keys.json are now used in preference to environment variables. #158

    • The documentation now includes a plugin directory listing all available plugins for LLM. #173

    • New related tools section in the documentation describing ttok, strip-tags and symbex. #111

    • The llm models, llm aliases and llm templates commands now default to running the same command as llm models list and llm aliases list and llm templates list. #167

    • New llm keys (aka llm keys list) command for listing the names of all configured keys. #174

    • Two new Python API functions, llm.set_alias(alias, model_id) and llm.remove_alias(alias) can be used to configure aliases from within Python code. #154

    • LLM is now compatible with both Pydantic 1 and Pydantic 2. This means you can install llm as a Python dependency in a project that depends on Pydantic 1 without running into dependency conflicts. Thanks, Chris Mungall. #147

    • llm.get_model(model_id) is now documented as raising llm.UnknownModelError if the requested model does not exist. #155

    0.7.1 (2023-08-19) #

    • Fixed a bug where some users would see an AlterError: No such column: log.id error when attempting to use this tool, after upgrading to the latest sqlite-utils 3.35 release. #162

    0.7 (2023-08-12) #

    The new Model aliases commands can be used to configure additional aliases for models, for example:

    llm aliases set turbo gpt-3.5-turbo-16k
    
    

    Now you can run the 16,000 token gpt-3.5-turbo-16k model like this:

    llm -m turbo 'An epic Greek-style saga about a cheesecake that builds a SQL database from scratch'
    
    

    Use llm aliases list to see a list of aliases and llm aliases remove turbo to remove one again. #151

    Notable new plugins #

    Also in this release #

    • OpenAI models now have min and max validation on their floating point options. Thanks, Pavel Král. #115

    • Fix for bug where llm templates list raised an error if a template had an empty prompt. Thanks, Sherwin Daganato. #132

    • Fixed bug in llm install --editable option which prevented installation of .[test]. #136

    • llm install --no-cache-dir and --force-reinstall options. #146

    0.6.1 (2023-07-24) #

    • LLM can now be installed directly from Homebrew core: brew install llm. #124

    • Python API documentation now covers System prompts.

    • Fixed incorrect example in the Prompt templates documentation. Thanks, Jorge Cabello. #125

    0.6 (2023-07-18) #

    • Models hosted on Replicate can now be accessed using the llm-replicate plugin, including the new Llama 2 model from Meta AI. More details here: Accessing Llama 2 from the command-line with the llm-replicate plugin.

    • Model providers that expose an API that is compatible with the OpenAI API format, including self-hosted model servers such as LocalAI, can now be accessed using additional configuration for the default OpenAI plugin. #106

    • OpenAI models that are not yet supported by LLM can also be configured using the new extra-openai-models.yaml configuration file. #107

    • The llm logs command now accepts a -m model_id option to filter logs to a specific model. Aliases can be used here in addition to model IDs. #108

    • Logs now have a SQLite full-text search index against their prompts and responses, and the llm logs -q SEARCH option can be used to return logs that match a search term. #109

    0.5 (2023-07-12) #

    LLM now supports additional language models, thanks to a new plugins mechanism for installing additional models.

    Plugins are available for 19 models in addition to the default OpenAI ones:

    • llm-gpt4all adds support for 17 models that can download and run on your own device, including Vicuna, Falcon and wizardLM.

    • llm-mpt30b adds support for the MPT-30B model, a 19GB download.

    • llm-palm adds support for Google’s PaLM 2 via the Google API.

    A comprehensive tutorial, writing a plugin to support a new model describes how to add new models by building plugins in detail.

    New features #

    • Python API documentation for using LLM models, including models from plugins, directly from Python. #75

    • Messages are now logged to the database by default - no need to run the llm init-db command any more, which has been removed. Instead, you can toggle this behavior off using llm logs off or turn it on again using llm logs on. The llm logs status command shows the current status of the log database. If logging is turned off, passing --log to the llm prompt command will cause that prompt to be logged anyway. #98

    • New database schema for logged messages, with conversations and responses tables. If you have previously used the old logs table it will continue to exist but will no longer be written to. #91

    • New -o/--option name value syntax for setting options for models, such as temperature. Available options differ for different models. #63

    • llm models list --options command for viewing all available model options. #82

    • llm "prompt" --save template option for saving a prompt directly to a template. #55

    • Prompt templates can now specify default values for parameters. Thanks, Chris Mungall. #57

    • llm openai models command to list all available OpenAI models from their API. #70

    • llm models default MODEL_ID to set a different model as the default to be used when llm is run without the -m/--model option. #31

    Smaller improvements #

    • llm -s is now a shortcut for llm --system. #69

    • llm -m 4-32k alias for gpt-4-32k.

    • llm install -e directory command for installing a plugin from a local directory.

    • The LLM_USER_PATH environment variable now controls the location of the directory in which LLM stores its data. This replaces the old LLM_KEYS_PATH and LLM_LOG_PATH and LLM_TEMPLATES_PATH variables. #76

    • Documentation covering Utility functions for plugins.

    • Documentation site now uses Plausible for analytics. #79

    0.4.1 (2023-06-17) #

    • LLM can now be installed using Homebrew: brew install simonw/llm/llm. #50

    • llm is now styled LLM in the documentation. #45

    • Examples in documentation now include a copy button. #43

    • llm templates command no longer has its display disrupted by newlines. #42

    • llm templates command now includes system prompt, if set. #44

    0.4 (2023-06-17) #

    This release includes some backwards-incompatible changes:

    • The -4 option for GPT-4 is now -m 4.

    • The --code option has been removed.

    • The -s option has been removed as streaming is now the default. Use --no-stream to opt out of streaming.

    Prompt templates #

    Prompt templates is a new feature that allows prompts to be saved as templates and re-used with different variables.

    Templates can be created using the llm templates edit command:

    llm templates edit summarize
    
    

    Templates are YAML - the following template defines summarization using a system prompt:

    system: Summarize this text
    
    

    The template can then be executed like this:

    cat myfile.txt | llm -t summarize
    
    

    Templates can include both system prompts, regular prompts and indicate the model they should use. They can reference variables such as $input for content piped to the tool, or other variables that are passed using the new -p/--param option.

    This example adds a voice parameter:

    system: Summarize this text in the voice of $voice
    
    

    Then to run it (via strip-tags to remove HTML tags from the input):

    curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
      strip-tags -m | llm -t summarize -p voice GlaDOS
    
    

    Example output:

    My previous test subject seemed to have learned something new about iMovie. They exported keynote slides as individual images […] Quite impressive for a human.

    The Prompt templates documentation provides more detailed examples.

    Continue previous chat #

    You can now use llm to continue a previous conversation with the OpenAI chat models ( gpt-3.5-turbo and gpt-4). This will include your previous prompts and responses in the prompt sent to the API, allowing the model to continue within the same context.

    Use the new -c/--continue option to continue from the previous message thread:

    llm"Pretend to be a witty gerbil, say hi briefly"
    
    

    Greetings, dear human! I am a clever gerbil, ready to entertain you with my quick wit and endless energy.

    llm"What do you think of snacks?"-c
    
    

    Oh, how I adore snacks, dear human! Crunchy carrot sticks, sweet apple slices, and chewy yogurt drops are some of my favorite treats. I could nibble on them all day long!

    The -c option will continue from the most recent logged message.

    To continue a different chat, pass an integer ID to the --chat option. This should be the ID of a previously logged message. You can find these IDs using the llm logs command.

    Thanks Amjith Ramanujam for contributing to this feature. #6

    New mechanism for storing API keys #

    API keys for language models such as those by OpenAI can now be saved using the new llm keys family of commands.

    To set the default key to be used for the OpenAI APIs, run this:

    llm keys set openai
    
    

    Then paste in your API key.

    Keys can also be passed using the new --key command line option - this can be a full key or the alias of a key that has been previously stored.

    See API key management for more. #13

    New location for the logs.db database #

    The logs.db database that stores a history of executed prompts no longer lives at ~/.llm/log.db - it can now be found in a location that better fits the host operating system, which can be seen using:

    llm logs path
    
    

    On macOS this is ~/Library/Application Support/io.datasette.llm/logs.db.

    To open that database using Datasette, run this:

    datasette"$(llmlogspath)"
    
    

    You can upgrade your existing installation by copying your database to the new location like this:

    cp ~/.llm/log.db "$(llm logs path)"
    rm -rf ~/.llm  # To tidy up the now obsolete directory
    
    

    The database schema has changed, and will be updated automatically the first time you run the command.

    That schema is included in the documentation. #35

    Other changes #

    • New llm logs --truncate option (shortcut -t) which truncates the displayed prompts to make the log output easier to read. #16

    • Documentation now spans multiple pages and lives at https://llm.datasette.io/ #21

    • Default llm chatgpt command has been renamed to llm prompt. #17

    • Removed --code option in favour of new prompt templates mechanism. #24

    • Responses are now streamed by default, if the model supports streaming. The -s/--stream option has been removed. A new --no-stream option can be used to opt-out of streaming. #25

    • The -4/--gpt4 option has been removed in favour of -m 4 or -m gpt4, using a new mechanism that allows models to have additional short names.

    • The new gpt-3.5-turbo-16k model with a 16,000 token context length can now also be accessed using -m chatgpt-16k or -m 3.5-16k. Thanks, Benjamin Kirkbride. #37

    • Improved display of error messages from OpenAI. #15

    0.3 (2023-05-17) #

    • llm logs command for browsing logs of previously executed completions. #3

    • llm "Python code to output factorial 10" --code option which sets a system prompt designed to encourage code to be output without any additional explanatory text. #5

    • Tool can now accept a prompt piped directly to standard input. #11

    0.2 (2023-04-01) #

    • If a SQLite database exists in ~/.llm/log.db all prompts and responses are logged to that file. The llm init-db command can be used to create this file. #2

    0.1 (2023-04-01) #

    • Initial prototype release. #1

    Title: llm Author: URL: https://pypi.org/project/llm/0.26a1/ Published: 2025-05-26T00:00:00.000Z Highlights:

    • If you're not sure which to choose, learn more about installing packages. 98911926c28959953a3eea5109ffa3f48969165672fdc0c924edc78a1c0a3521 d06cfc342e58822bdbf852a0816bc23bd924895afef17119b1e5f723ee8e2bd1 The following attestation bundles were made for llm-0.26a1.tar.gz: Values shown here reflect the state when the release was signed and may no longer be current. Text: Project description

    A CLI tool and Python library for interacting with OpenAI, Anthropic’s Claude, Google’s Gemini, Meta’s Llama and dozens of other Large Language Models, both via remote APIs and with models that can be installed and run on your own machine. Watch Language models on the command-line on YouTube for a demo or read the accompanying detailed notes. With LLM you can:

    Run prompts from the command-line Store prompts and responses in SQLite Generate and store embeddings Extract structured content from text and images … and much, much more

    Quick start First, install LLM using pip or Homebrew or pipx or uv: pip install llm

    Or with Homebrew (see warning note): brew install llm

    Or with pipx: pipx install llm

    Or with uv uv tool install llm

    If you have an OpenAI API key you can run this:

    Paste your OpenAI API key into this

    llm keys set openai

    Run a prompt (with the default gpt-4o-mini model)

    llm "Ten fun names for a pet pelican"

    Extract text from an image

    llm "extract text" -a scanned-document.jpg

    Use a system prompt against a file

    cat myfile.py | llm -s "Explain this code"

    Run prompts against Gemini or Anthropic with their respective plugins: llm install llm-gemini llm keys set gemini

    Paste Gemini API key here

    llm -m gemini-2.0-flash 'Tell me fun facts about Mountain View' llm install llm-anthropic llm keys set anthropic

    Paste Anthropic API key here

    llm -m claude-4-opus 'Impress me with wild facts about turnips'

    You can also install a plugin to access models that can run on your local device. If you use Ollama:

    Install the plugin

    llm install llm-ollama

    Download and run a prompt against the Llama 3.2 model

    ollama pull llama3.2:latest llm -m llama3.2:latest 'What is the capital of France?'

    To start an interactive chat with a model, use llm chat: llm chat -m gpt-4.1

    Chatting with gpt-4.1 Type 'exit' or 'quit' to exit Type '!multi' to enter multiple lines, then '!end' to finish Type '!edit' to open your default editor and modify the prompt. Type '!fragment <my_fragment> [<another_fragment> ...]' to insert one or more fragments > Tell me a joke about a pelican Why don't pelicans like to tip waiters? Because they always have a big bill!

    More background on this project:

    llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs The LLM CLI tool now supports self-hosted language models via plugins LLM now provides tools for working with embeddings Build an image search engine with llm-clip, chat with models with llm chat You can now run prompts against images, audio and video in your terminal using LLM Structured data extraction from unstructured content using LLM schemas Long context support in LLM 0.24 using fragments and template plugins

    See also the llm tag on my blog. Contents

    Setup

    Installation Upgrading to the latest version Using uvx A note about Homebrew and PyTorch Installing plugins API key management

    Saving and using stored keys Passing keys using the --key option Keys in environment variables

    Configuration

    Setting a custom default model Setting a custom directory location Turning SQLite logging on and off

    Usage

    Executing a prompt

    Model options Attachments System prompts Tools Extracting fenced code blocks Schemas Fragments Continuing a conversation Tips for using LLM with Bash or Zsh Completion prompts

    Starting an interactive chat Listing available models Setting default options for models

    OpenAI models

    Configuration OpenAI language models Model features OpenAI embedding models OpenAI completion models Adding more OpenAI models

    Other models

    Installing and using a local model OpenAI-compatible models

    Extra HTTP headers

    Tools

    How tools work LLM’s implementation of tools

    Schemas

    Schemas tutorial

    Getting started with dogs Extracting people from a news article

    Using JSON schemas Ways to specify a schema Concise LLM schema syntax Saving reusable schemas in templates Browsing logged JSON objects created using schemas

    Templates

    Getting started with --save Using a template Listing available templates Templates as YAML files

    System prompts Fragments Options Schemas Additional template variables Specifying default parameters Configuring code extraction Setting a default model for a template

    Template loaders from plugins

    Fragments

    Using fragments in a prompt Using fragments in chat Browsing fragments Setting aliases for fragments Viewing fragments in your logs Using fragments from plugins Listing available fragment prefixes

    Model aliases

    Listing aliases Adding a new alias Removing an alias Viewing the aliases file

    Embeddings

    Embedding with the CLI

    llm embed llm embed-multi llm similar llm embed-models llm collections list llm collections delete

    Using embeddings from Python

    Working with collections Retrieving similar items SQL schema

    Writing plugins to add new embedding models

    Embedding binary content LLM_RAISE_ERRORS

    Embedding storage format

    Plugins

    Installing plugins

    Listing installed plugins Running with a subset of plugins

    Plugin directory

    Local models Remote APIs Fragments and template loaders Embedding models Extra commands Just for fun

    Plugin hooks

    register_commands(cli) register_models(register) register_tools(register) register_template_loaders(register) register_fragment_loaders(register)

    Developing a model plugin

    The initial structure of the plugin Installing your plugin to try it out Building the Markov chain Executing the Markov chain Adding that to the plugin Understanding execute() Prompts and responses are logged to the database Adding options Distributing your plugin GitHub repositories Publishing plugins to PyPI Adding metadata What to do if it breaks

    Advanced model plugins

    Tip: lazily load expensive dependencies Models that accept API keys Async models Supporting schemas Attachments for multi-modal models Tracking token usage

    Utility functions for plugins

    llm.user_dir() llm.ModelError Response.fake()

    Python API

    Basic prompt execution

    Tools System prompts Attachments Schemas Fragments Model options Passing an API key Models from plugins Accessing the underlying JSON Token usage Streaming responses

    Async models

    Tool functions can be sync or async Tool use for async models

    Conversations

    Conversations using tools

    Listing models Running code when a response has completed Other functions

    set_alias(alias, model_id) remove_alias(alias) set_default_model(alias) get_default_model() set_default_embedding_model(alias) and get_default_embedding_model()

    Logging to SQLite

    Viewing the logs

    -s/–short mode Logs for a conversation Searching the logs Filtering past a specific ID Filtering by model Filtering by prompts that used specific fragments Filtering by prompts that used specific tools Browsing data collected using schemas

    Browsing logs using Datasette Backing up your database SQL schema

    Related tools

    strip-tags ttok Symbex

    CLI reference

    llm –help

    llm prompt –help llm chat –help llm keys –help llm logs –help llm models –help llm templates –help llm schemas –help llm tools –help llm aliases –help llm fragments –help llm plugins –help llm install –help llm uninstall –help llm embed –help llm embed-multi –help llm similar –help llm embed-models –help llm collections –help llm openai –help

    Contributing

    Updating recorded HTTP API interactions and associated snapshots Debugging tricks Documentation Release process

    Changelog

    Download files Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

    Source Distribution

    Built Distribution

    File details Details for the file llm-0.26a1.tar.gz. File metadata

    Download URL: llm-0.26a1.tar.gz

    Upload date: May 26, 2025

    Size: 79.6 kB Tags: Source

    Uploaded using Trusted Publishing? Yes Uploaded via: twine/6.1.0 CPython/3.12.9

    File hashes

    Hashes for llm-0.26a1.tar.gz

    Algorithm Hash digest

    SHA256 98911926c28959953a3eea5109ffa3f48969165672fdc0c924edc78a1c0a3521

    MD5 6510815a53a90f805cc6202b224295e1

    BLAKE2b-256 d06cfc342e58822bdbf852a0816bc23bd924895afef17119b1e5f723ee8e2bd1

    See more details on using hashes here.

    Provenance

    The following attestation bundles were made for llm-0.26a1.tar.gz:

    Publisher:

    publish.yml on simonw/llm

    Attestations:

    Values shown here reflect the state when the release was signed and may no longer be current.

    Statement:

    Statement type: https://in-toto.io/Statement/v1 Predicate type: https://docs.pypi.org/attestations/publish/v1 Subject name: llm-0.26a1.tar.gz Subject digest: 98911926c28959953a3eea5109ffa3f48969165672fdc0c924edc78a1c0a3521 Sigstore transparency entry: 219927996 Sigstore integration time: May 26, 2025

    Source repository:

    Permalink: simonw/llm@15a497635d0f3402defcf53bda8f13b4b63e21b5

    Branch / Tag: refs/tags/0.26a1

    Owner: https://github.com/simonw Access: public

    Publication detail:

    Token Issuer: https://token.actions.githubusercontent.com Runner Environment: github-hosted Publication workflow:

    publish.yml@15a497635d0f3402defcf53bda8f13b4b63e21b5

    Trigger Event: release

    File details Details for the file llm-0.26a1-py3-none-any.whl. File metadata

    Download URL: llm-0.26a1-py3-none-any.whl

    Upload date: May 26, 2025

    Size: 76.8 kB Tags: Python 3

    Uploaded using Trusted Publishing? Yes Uploaded via: twine/6.1.0 CPython/3.12.9

    File hashes

    Hashes for llm-0.26a1-py3-none-any.whl

    Algorithm Hash digest

    SHA256 b8e0b24e34dc49b6870107b420378ee83d1bd785eae7230bc36bcb2be85af4ff

    MD5 7ad9ab8289084666e9f05409c78fdca8

    BLAKE2b-256 63f65c62686867772beed7c1c1617ecae66b8819913987ecec46bb62114192d4

    See more details on using hashes here.

    Provenance

    The following attestation bundles were made for llm-0.26a1-py3-none-any.whl:

    Publisher:

    publish.yml on simonw/llm

    Attestations:

    Values shown here reflect the state when the release was signed and may no longer be current.

    Statement:

    Statement type: https://in-toto.io/Statement/v1 Predicate type: https://docs.pypi.org/attestations/publish/v1 Subject name: llm-0.26a1-py3-none-any.whl Subject digest: b8e0b24e34dc49b6870107b420378ee83d1bd785eae7230bc36bcb2be85af4ff Sigstore transparency entry: 219927998 Sigstore integration time: May 26, 2025

    Source repository:

    Permalink: simonw/llm@15a497635d0f3402defcf53bda8f13b4b63e21b5

    Branch / Tag: refs/tags/0.26a1

    Owner: https://github.com/simonw Access: public

    Publication detail:

    Token Issuer: https://token.actions.githubusercontent.com Runner Environment: github-hosted Publication workflow:

    publish.yml@15a497635d0f3402defcf53bda8f13b4b63e21b5

    Trigger Event: release


Response

LLM 0.26 was released on May 27, 2025, and the biggest new feature in this version is official support for tools. Here's a summary of what's new and notable in LLM 0.26:

  • LLM can now run tools. You can grant LLMs from OpenAI, Anthropic and Gemini, as well as local models, access to any tool you can represent as a Python function.
  • Tool plugins are introduced, allowing installation of plugins that add new capabilities to any model you use.
  • Tools can be installed from plugins and loaded by name with the --tool/-T option.
  • You can pass Python function code on the command line with the --functions option to add ad-hoc command-line tools (see the CLI sketch after this list).
  • The Python API supports tools, including async and sync contexts.
  • Demonstrations with built-in tools like llm_version and llm_time show how tools can be invoked.
  • Examples of plugin tools include:
    • llm-tools-simpleeval for safe Python expression evaluation (useful for math calculations).
    • llm-tools-quickjs for running sandboxed JavaScript code.
    • llm-tools-sqlite for read-only SQL queries on SQLite databases.
    • llm-tools-datasette for querying remote Datasette instances.
  • Tools let LLMs work around weaknesses such as unreliable arithmetic by delegating calculations to deterministic tool execution.
  • Using --functions, you can define Python code inline to create ad-hoc tools, for example for web searching or other custom logic.
  • The Python API method model.chain() supports multi-step tool use: the model can request tool calls across several turns, with each tool's output fed back into the conversation (see the Python sketch below).
  • The new tools feature aligns with industry patterns called "tool use" or "function calling" adopted by major vendors like OpenAI, Anthropic, Google, and more.
  • LLM 0.26 also introduces improvements for plugins, schemas, fragments, templates, embeddings, and more.
  • Plugins support models from various providers, including OpenAI, Anthropic, Google Gemini, and local models via Ollama.

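The --tool/-T and --functions options above can be sketched as follows. This is a minimal illustration assuming LLM 0.26 with an OpenAI API key configured; llm_version is one of the built-in tools mentioned in the release notes and --td is the tools-debug flag, while the multiply function is a made-up example rather than part of LLM itself.

    # Call a built-in tool by name; --td shows each tool call and its result
    llm --tool llm_version "What version of LLM is this?" --td

    # Define an ad-hoc tool inline with --functions (multiply is a hypothetical example)
    llm --functions '
    def multiply(a: int, b: int) -> int:
        """Multiply two integers."""
        return a * b
    ' "What is 34234 * 213345?" -m gpt-4.1-mini --td

    # Plugin tools are installed like any other LLM plugin, e.g.:
    llm install llm-tools-sqlite

If the model decides to call multiply, LLM runs the Python function locally and feeds the result back into the conversation before the model writes its final answer.
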
In short, LLM 0.26 focuses on making language models more extensible and capable by integrating tool use into both the CLI and the Python API, opening up new workflows and abilities.
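
On the Python side, a minimal sketch of the chain() pattern might look like the following. The upcase function and the model name are illustrative assumptions; passing tools=[...] to model.chain() and reading response.text() follow the 0.26 API described in the release notes.

    import llm

    def upcase(text: str) -> str:
        """Convert text to upper case."""  # the docstring is used as the tool description
        return text.upper()

    model = llm.get_model("gpt-4.1-mini")

    # chain() keeps prompting the model, executing any requested tool calls
    # (here: upcase) and feeding their results back, until a final answer is produced
    response = model.chain(
        "Convert 'pelican' to upper case using the tool",
        tools=[upcase],
    )
    print(response.text())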

For a full detailed explanation and examples, see this article by Simon Willison on LLM 0.26 tools: https://simonwillison.net/2025/May/27/llm-tools/

Token usage

21,528 input, 501 output
