# Prompting with LLM
Let's start by running some prompts using the LLM command-line interface.
## Setting a default model
LLM defaults to [gpt-4o-mini](https://platform.openai.com/docs/models/gpt-4o-mini). A month ago OpenAI [released the GPT-4.1 series](https://openai.com/index/gpt-4-1/). They're a big step up from GPT-4o - in particular, they have a one million token context window which means you can feed them a *lot* more data (gpt-4o-mini was limited to 128,000 tokens).
Let's switch to [gpt-4.1-mini](https://platform.openai.com/docs/models/gpt-4.1-mini) as our new default model:
```bash
llm models default gpt-4.1-mini
```
## Running a prompt
The LLM command-line tool takes a prompt as its first argument:
```bash
llm 'Ten pun names for a teashop run by a pelican and a walrus'
```
## What did that do for us?
Let's run a prompt the manual way, using `curl` and the OpenAI API:
```bash
curl https://api.openai.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $(llm keys get openai)" \
  -d '{
    "model": "gpt-4.1-mini",
    "messages": [
      {"role": "user", "content": "Ten pun names for a teashop run by a pelican and a walrus"}
    ]
  }'
```
Now try that again with `"stream": true` to see what the streaming response looks like:
```bash
curl https://api.openai.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $(llm keys get openai)" \
  -d '{
    "model": "gpt-4.1-mini",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Ten pun names for a teashop run by a pelican and a walrus"}
    ]
  }'
```
Every API provider has a similar, albeit slightly different, way of doing this. LLM and its plugins provide wrappers around those APIs so you don't need to think about those differences.
## Continuing the conversation
The `llm -c` flag stands for `--continue` - it lets you continue your most recent conversation:
```bash
llm -c 'Three more with darker overtones'
```
## Seeing it in the logs
LLM logs every prompt and response to a SQLite database. You can see the location of that database by running:
```bash
llm logs path
```
The `llm logs` command shows logged conversations. Use `-c` for the most recent conversation:
```bash
llm logs -c
```
The output looks something like this:
```
# 2025-05-14T13:54:58 conversation: 01jv7h7jcf20b4hbg3jnh57syh id: 01jv7h7ens68awxrk17p2pq356
Model: **gpt-4.1-mini**
## Prompt
Ten pun names for a teashop run by a pelican and a walrus
## Response
Sure! Here are ten punny teashop name ideas featuring a pelican and a walrus:
1. **The Pelitea & Wally Brew**
2. **Beak & Tusks Tea House**
...
```
As you can see, the output is in Markdown format. I frequently share my conversation logs by pasting that into a [GitHub Gist](https://gist.github.com).
Add the `-u` (short for `--usage`) flag to see how many tokens were used in the conversation:
```bash
llm logs -c -u
```
You can also get output in JSON using the `--json` flag:
```bash
llm logs -c --json
```
Every conversation has an ID. If you know the ID of a conversation you can retrieve its logs using `--cid ID`.
```bash
llm logs --cid 01jv7h7jcf20b4hbg3jnh57syh
```
The `-s` option stands for `--short` and provides a more compact view, useful for finding conversation IDs:
```bash
llm logs -s
```
Add `-q` to search:
```bash
llm logs -s -q 'pelican'
```
And `-n 0` to see **every** match:
```bash
llm logs -s -q 'pelican' -n 0
```
## Browsing the logs with Datasette
[Datasette](https://datasette.io/) is my open source tool for exploring SQLite databases. Since LLM logs to SQLite you can explore that database in your web browser using Datasette like this:
```bash
datasette "$(llm logs path)"
```
This will start a local web server which you can visit at `http://localhost:8001/`
On Codespaces you should first run this command to install a plugin to make Datasette work better in that environment:
```bash
datasette install datasette-codespaces
# Then
datasette "$(llm logs path)"
```
## Using different models
Use the `-m` option to specify a different model. You can see a list of available models by running:
```bash
llm models list
```
Add the `--options` flag to learn more about them, including what options they support and what capabilities they have:
```bash
llm models list --options
```
Let's get some pun names for a teashop from the more powerful `o4-mini`:
```bash
llm 'Ten pun names for a teashop run by a pelican and a walrus' -m o4-mini
```
[o4-mini](https://platform.openai.com/docs/models/o4-mini) is a reasoning model, so there's a delay at the start while it "thinks" about the problem.
## Piping in content
The best thing about having a command-line tool for interacting with models is you can pipe things in!
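For example, you can pipe a file straight into a prompt. A quick sketch, assuming you have a local `main.py` to hand:
```bash
cat main.py | llm 'Explain what this code does'
```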
## Using system prompts
A **system prompt** is a special kind of prompt that has higher weight than the rest of the prompt. It's useful for providing instructions about *what to do* with the rest of the input.
```bash
cat requirements.txt | llm -s 'convert this to pyproject.toml'
```
## Prompting with an image
LLM supports **attachments**, which are files that you can attach to a prompt. Attachments can be specified as filepaths or as URLs.
Let's describe a photograph:
```bash
llm -a https://static.simonwillison.net/static/2025/two-pelicans.jpg 'Describe this image' -u
```
That `-u` causes the token usage to be displayed. You can paste that token line into https://www.llm-prices.com/ and select the model to get a price estimate.
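Attachments can also be local files. A minimal sketch, assuming you have saved that photo locally as `two-pelicans.jpg`:
```bash
llm -a two-pelicans.jpg 'Describe this image' -u
```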
## Using fragments
The `-f` option can be used to specify a **fragment** - an extra snippet of text to be added to the prompt. Like attachments, these can be filepaths or URLs.
Fragments are mainly useful as a storage optimization: the same fragment will be stored just once in the database no matter how many prompts you use it with.
Here's our `requirements.txt` example again, this time with a fragment:
```bash
llm -f requirements.txt 'convert this to pyproject.toml'
```
The `-e` option can be used with `llm logs` to expand any fragments:
```bash
llm logs -c -e
```
## Fragment plugins
The most exciting thing about fragments is that they can be customized with **plugins**.
Install the [llm-fragments-github](https://github.com/simonw/llm-fragments-github) plugin like this:
```bash
llm install llm-fragments-github
```
This adds several new fragment types, including `github:` which can be used to fetch the full contents of a repository and `issue:` which can load an issue thread.
```bash
llm -f issue:https://github.com/simonw/llm/issues/898 -s 'summarize this issue'
```
Or let's suggest some new features for that plugin:
```bash
llm -f github:simonw/llm-fragments-github -s 'Suggest new features for this plugin' -u
```
This is a good point for a digression to talk about **long context** and why it's such an important trend.
# Building a text to SQL tool
We're going to build something *genuinely useful*. Weirdly enough, this is a "hello world" exercise for prompt engineering.
Ask a question of your database in English, get a response from a custom SQL query written by the LLM.
## Prototyping against the logs database
We're going to use the LLM logs database itself, and prototype against it using the `sqlite-utils` CLI tool:
```bash
sqlite-utils schema "$(llm logs path)"
```
Let's write that to a file:
```bash
sqlite-utils schema "$(llm logs path)" > schema.sql
```
Now we can feed it to LLM and write our first query:
```bash
llm -f schema.sql \
-s "reply with sqlite SQL" \
"how many conversations are there?"
```
I got back this:
````
```sql
SELECT COUNT(*) AS conversation_count FROM conversations;
```
````
As you can see, the LLM decided to wrap it in a fenced code block.
We could ask it not to, but we can also use the `--extract` flag to extract the SQL from the response:
```bash
llm -f schema.sql \
-s "reply with sqlite SQL" \
--extract \
"how many conversations are there?"
```
Let's run that query in the most diabolical way possible:
```bash
sqlite-utils "$(llm logs path)" "$(llm -f schema.sql \
  -s 'reply with sqlite SQL' \
  --extract \
  'how many conversations are there?')"
```
## Turning that into a Python function
Let's upgrade our hacky CLI prototype into a Python function.
```python
import sqlite_utils
import llm

model = llm.get_model("gpt-4.1-mini")


def text_to_sql(db: sqlite_utils.Database, question: str) -> str:
    """Convert a prompt to SQL using the LLM."""
    prompt = "Schema:\n\n{}\n\nQuestion:\n\n{}".format(
        db.schema, question
    )
    return model.prompt(
        prompt,
        system="reply with SQLite SQL, not in markdown, just the SQL",
    ).text()


db = sqlite_utils.Database(llm.user_dir() / "logs.db")
sql = text_to_sql(db, "how many conversations are there?")
print(sql)

# Now execute it
result = db.query(sql)
print(list(result))
```
## Upgrading that to a CLI tool
Now that we have this working, let's turn it into a small CLI tool using `argparse` from the Python standard library:
```python
import argparse
from pathlib import Path

import sqlite_utils
import llm

# pick your model
model = llm.get_model("gpt-4.1-mini")


def text_to_sql(db: sqlite_utils.Database, question: str) -> str:
    """Convert an English question into a SQLite SQL statement."""
    prompt = "Schema:\n\n{}\n\nQuestion:\n\n{}".format(db.schema, question)
    resp = model.prompt(
        prompt,
        system="reply with SQLite SQL, not in markdown, just the SQL",
    )
    return resp.text().strip()


def main():
    parser = argparse.ArgumentParser(
        description="Turn a natural-language question into SQL (and optionally run it)."
    )
    parser.add_argument(
        "question",
        help="The question to ask of your SQLite database, in plain English.",
    )
    parser.add_argument(
        "--db",
        "-d",
        default=str(llm.user_dir() / "logs.db"),
        help="Path to the SQLite database file. [default: %(default)s]",
    )
    parser.add_argument(
        "--execute",
        "-x",
        action="store_true",
        help="Execute the generated SQL and print the results instead of just showing the SQL.",
    )
    args = parser.parse_args()

    db_path = Path(args.db)
    if not db_path.exists():
        parser.error(f"Database file not found: {db_path!r}")

    db = sqlite_utils.Database(db_path)
    sql = text_to_sql(db, args.question)

    if args.execute:
        try:
            rows = list(db.query(sql))
        except Exception as e:
            print("ERROR running SQL:", e)
            print("SQL was:", sql)
            raise SystemExit(1)
        # print each row as a dict
        for row in rows:
            print(row)
    else:
        print(sql)


if __name__ == "__main__":
    main()
```
Here's a fun note: the above block just said "FILL ME" and then I ran this command:
```bash
llm -m o4-mini -f text-to-sql.md -s 'Write the code for the FILL ME bit'
```
## Ways to make this better
This is the most basic version of this, but it works pretty well!
Some ways we could make this better:
- **Examples**. The single most powerful prompt engineering trick is to give the LLM illustrative examples of what you are trying to achieve. A small number of carefully selected examples of questions and the expected SQL answer can radically improve the results.
- **Column values**. A common failure case for text to SQL is when the question is e.g. "How many schools are in California?" and the model queries for `where state = 'California'` when it should have queried for `where state = 'CA'`. Feeding in some example values from each column can be all the model needs to get it right.
- **Data documentation**. Just like a real data analyst, the more information you can feed the model the better.
- **Loop on errors**. If the SQL query fails to run, feed the error back to the LLM and have it try again. You can use `EXPLAIN ...` for a cheap validation of the query without running the whole thing - see the sketch after this list.
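Here's a minimal sketch of that last idea, reusing the `text_to_sql()` function and `model` defined earlier; the helper name, retry count and prompt wording are my own assumptions, not anything built into LLM:
```python
def text_to_sql_with_retry(db, question, max_attempts=3):
    """Generate SQL, validate it with EXPLAIN, and ask the model to fix any errors."""
    sql = text_to_sql(db, question)
    for _ in range(max_attempts):
        try:
            # EXPLAIN compiles the query without executing it - a cheap validity check
            db.execute("EXPLAIN " + sql)
            return sql
        except Exception as ex:
            # Feed the error back to the model and ask for a corrected query
            sql = model.prompt(
                "Schema:\n\n{}\n\nThis SQL:\n\n{}\n\nFailed with this error: {}\n\n"
                "Question:\n\n{}".format(db.schema, sql, ex, question),
                system="reply with SQLite SQL, not in markdown, just the SQL",
            ).text().strip()
    return sql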
(tools)=
# Tool usage
This feature of LLM is *so new* that I haven't even formally announced it yet. I got the alpha together just in time for the conference.
LLM systems can be expanded by giving them the ability to **execute tools**.
Ever heard the complaints about how LLMs are computers that can't do math, and language tools that can't count the number of Rs in the word strawberry?
Tools can solve those problems, and so many more.
LLM can treat any Python function as a tool. It can grant tool access to a range of models - currently models from OpenAI, Anthropic and Gemini are supported by their relevant alpha plugins.
## Passing Python functions to an LLM
Let's solve the hardest problem in LLMs:
```bash
llm --functions '
def count_char_in_string(char: str, string: str) -> int:
"""Count the number of times a character appears in a string."""
return string.lower().count(char.lower())
' 'Count the number of Rs in the word strawberry' --td
```
The `--td` stands for `--tools-debug` - it shows you what's going on.
Use `--ta/--tools-approve` to manually approve every tool call.
Run `llm logs -c` to see how tool calls are logged.
## Multiple tools can run in a row
Here's a simple example of that:
```bash
llm --functions '
def lookup_population(country: str) -> int:
"Returns the current population of the specified fictional country"
return 123124
def can_have_dragons(population: int) -> bool:
"Returns True if the specified population can have dragons, False otherwise"
return population > 10000
' 'Can the country of Crumpet have dragons?' --td
```
## Tools in Python
In Python code a list of functions can be passed to the `tools=` argument of a model's `chain()` method:
```python
model = llm.get_model("gpt-4.1-mini")
def lookup_population(country: str) -> int:
    "Returns the current population of the specified fictional country"
    return 123124

def can_have_dragons(population: int) -> bool:
    "Returns True if the specified population can have dragons, False otherwise"
    return population > 10000

chain_response = model.chain(
    "Can the country of Crumpet have dragons? Answer with only YES or NO",
    tools=[lookup_population, can_have_dragons],
    stream=False,
    key=API_KEY,
)
print(chain_response.text())
```
The `chain()` method handles running multiple prompts in a row, passing the tool results on to the next prompt.
## Running calculations with simpleeval
We can do mathematics using [danthedeckie/simpleeval](https://github.com/danthedeckie/simpleeval):
```bash
llm install simpleeval # Puts it in the right environment
llm --functions 'from simpleeval import simple_eval' \
  'Calculate 42342 * 21123' --td
```
## Let's build some tools!
Open exercise: who can come up with the most interesting tool?
`httpx` is already available in LLM's environment.
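As a starting point, here's a minimal sketch of a tool that uses `httpx` to fetch a web page; the `fetch_url` name and the example question are my own, not part of LLM:
```bash
llm --functions '
import httpx

def fetch_url(url: str) -> str:
    """Fetch a URL and return the first 2000 characters of its text."""
    return httpx.get(url, follow_redirects=True).text[:2000]
' 'What is the title of the page at https://datasette.io/?' --td
```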
Here's a prompt for a "chat with your database" script, asking the model to write the whole thing:

> Based on this documentation implement a short Python script for the "chat with your database" example, it should take input from the user a line at a time and execute a new execute_sql() tool to try and answer their questions, displaying the results as pretty-printed JSON to them

And here's the response that came back:
Here's a self-contained script called `chat_with_db.py` that will:

- Open your LLM logs SQLite database (by default `~/.llm/logs.db`).
- Read questions from stdin, one line at a time.
- Send each question to the LLM, giving it an `execute_sql(sql: str)` tool.
- Pretty-print the JSON results of whatever SQL the model chooses to run.
```python
#!/usr/bin/env python3
import json
import sqlite_utils
import llm
from pathlib import Path


def main():
    # 1. locate the SQLite DB
    db_path = Path(llm.user_dir()) / "logs.db"
    if not db_path.exists():
        raise FileNotFoundError(f"Database not found at {db_path!r}")
    db = sqlite_utils.Database(db_path)

    # 2. pick your model
    model = llm.get_model("gpt-4.1-mini")

    # 3. wrap execute_sql into a tool the LLM can call
    def execute_sql(sql: str):
        """
        Tool for the LLM: run arbitrary SQL against our SQLite DB
        and return the rows as a list of dicts.
        """
        try:
            return list(db.query(sql))
        except Exception as e:
            return {"error": str(e)}

    # 4. Prepare a system prompt with the schema
    system_prompt = f"""You are a helpful assistant for a SQLite database.
Here is the schema of the database you have access to:
{db.schema}
When given a user question, think about the SQL you need,
then call the execute_sql(sql: str) tool with exactly that SQL.
Finally return only the JSON result of that tool call (no markdown)."""

    print(f"Connected to {db_path}")
    print("ASK A QUESTION (Ctrl-D to exit)")

    while True:
        try:
            question = input(">> ").strip()
        except EOFError:
            print("\nGoodbye.")
            break
        if not question:
            continue

        # 5. call the model with our tool
        chain_response = model.chain(
            question,
            system=system_prompt,
            tools=[execute_sql],
            stream=False,
        )

        # 6. drill into the tool call result if we can
        result = None
        if hasattr(chain_response, "tool_calls") and chain_response.tool_calls:
            # take the last tool result
            result = chain_response.tool_calls[-1].result
        else:
            # fallback: try parsing the raw assistant text as JSON
            text = chain_response.text().strip()
            try:
                result = json.loads(text)
            except Exception:
                # last resort: just show whatever the assistant said
                print(text)
                continue

        # 7. pretty-print JSON
        print(json.dumps(result, indent=2, default=str))
        print()


if __name__ == "__main__":
    main()
```
Save that as `chat_with_db.py`, make it executable (`chmod +x chat_with_db.py`), install your dependencies (`pip install llm sqlite-utils`) and then run:
```bash
./chat_with_db.py
```
Now type questions like:
```
>> how many conversations were logged yesterday?
```
… and the script will show you the results as nicely-formatted JSON.
The token usage for that run:
```
3,870 input, 4,314 output, {"completion_tokens_details": {"reasoning_tokens": 3584}}
```