@danclaytondev
Created December 9, 2024 16:54
Testing Ollama v0.5 to see if the JSON format is injected into the prompt

Testing the new structured outputs example from the Ollama v0.5 blog post

I wanted to see whether the JSON schema passed to Ollama's new structured outputs feature is injected into the prompt to help the model output valid JSON. It doesn't look like it is.

Start the server with debug logging enabled, which makes it log the fully rendered prompt:

OLLAMA_DEBUG=1 ollama serve

Then, in a new shell:
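These examples assume the llama3.2 model is already available locally; if it isn't, pull it first:

ollama pull llama3.2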

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Tell me about Canada."}],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "capital": {
        "type": "string"
      },
      "languages": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "name",
      "capital", 
      "languages"
    ]
  }
}'
{
  "model": "llama3.2",
  "created_at": "2024-12-09T16:45:04.380923Z",
  "message": {
    "role": "assistant",
    "content": "{ \"capital\": \"Ottawa\", \"languages\": [\"English\",\"français\"] ,\"name\": \"Canada\" }"
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 1551747417,
  "load_duration": 27405625,
  "prompt_eval_count": 30,
  "prompt_eval_duration": 858000000,
  "eval_count": 29,
  "eval_duration": 663000000
}
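Note that the structured answer comes back as a JSON string inside message.content. If you have jq installed, you can extract and pretty-print it; assuming the request body above is saved to a file called request.json:

curl -s http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d @request.json | jq -r '.message.content' | jq .

The first jq pulls out the raw content string, and the second parses and pretty-prints it as JSON.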

And if we look at the logs in the ollama serve shell:

time=2024-12-09T16:41:53.293Z level=DEBUG source=routes.go:1464 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nTell me about Canada.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
time=2024-12-09T16:41:53.294Z level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=0 prompt=30 used=0 remaining=30
[GIN] 2024/12/09 - 16:41:57 | 200 |    5.3341375s |       127.0.0.1 | POST     "/api/chat"
time=2024-12-09T16:41:57.533Z level=DEBUG source=sched.go:466 msg="context for request finished"
time=2024-12-09T16:41:57.533Z level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/Users/daniel.clayton/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff duration=5m0s
time=2024-12-09T16:41:57.533Z level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/Users/daniel.clayton/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff refCount=0

There is certainly nothing in the prompt about the JSON schema, yet the model still produced valid JSON matching it. Presumably the schema is enforced at sampling time (e.g. via grammar-constrained decoding) rather than through the prompt.
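For comparison, Ollama's older JSON mode takes the string "json" in the format field instead of a schema, and the docs recommend also telling the model in the prompt to respond in JSON when using it. A request like the one below (a sketch, not something I ran here) could be checked against the debug log in the same way:

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Tell me about Canada. Respond using JSON."}],
  "stream": false,
  "format": "json"
}'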

Comparing to tool calling

Using the tool-calling example from the Ollama tool support blog post:

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in Toronto?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The name of the city"
            }
          },
          "required": ["city"]
        }
      }
    }
  ]
}'
{
  "model": "llama3.2",
  "created_at": "2024-12-09T16:49:27.254675Z",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_current_weather",
          "arguments": {
            "city": "Toronto"
          }
        }
      }
    ]
  },
  "done": false
}
{
  "model": "llama3.2",
  "created_at": "2024-12-09T16:49:27.27598Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 991354584,
  "load_duration": 19719709,
  "prompt_eval_count": 166,
  "prompt_eval_duration": 673000000,
  "eval_count": 15,
  "eval_duration": 297000000
}

Now, looking at the logs from the ollama serve shell again:

time=2024-12-09T16:49:26.304Z level=DEBUG source=routes.go:1464 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\n\nWhen you receive a tool call response, use the output to format an answer to the orginal user question.\n\nYou are a helpful assistant with tool calling capabilities.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n{\"type\":\"function\",\"function\":{\"name\":\"get_current_weather\",\"description\":\"Get the current weather for a city\",\"parameters\":{\"type\":\"object\",\"required\":[\"city\"],\"properties\":{\"city\":{\"type\":\"string\",\"description\":\"The name of the city\"}}}}}\n\nWhat is the weather in Toronto?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

This time we can see plenty of extra instructions, plus the full function definition, injected into the prompt to help the model produce a tool call.
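As an aside, completing the tool-calling loop means appending the assistant's tool call and a tool-role message carrying the function's result, then calling /api/chat again. A rough sketch (the weather string is made up):

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.2",
  "stream": false,
  "messages": [
    {"role": "user", "content": "What is the weather in Toronto?"},
    {"role": "assistant", "content": "", "tool_calls": [{"function": {"name": "get_current_weather", "arguments": {"city": "Toronto"}}}]},
    {"role": "tool", "content": "-2C and snowing"}
  ]
}'

The model should then answer the original question in plain text using the tool output.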
