Stream Ollama (openai) chat completion API on CLI with HTTPie and jq

Explanation

This command sends a request to Ollama's OpenAI-compatible Chat Completion API to generate high-level documentation for the file src/arch.js (the @ prefix is HTTPie syntax for embedding a file's contents as the field value). The request uses the llama3-gradient model and asks for the response in Markdown format.

The messages array contains two elements:

  • The first element is a system message containing the instructions the model should follow.
  • The second element is a user message whose content is the file to document (read from src/arch.js).

The options object sets a repeat penalty of 1.5, a temperature of 0.2, and a cap of 2048 generated tokens (num_predict). The stream option is set to true so the API returns the response incrementally as server-sent events.
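For reference, the HTTPie nested-JSON syntax used in the command below builds a request body roughly like this (field order may differ):

{
  "model": "llama3-gradient",
  "messages": [
    {
      "role": "system",
      "content": "Create a high level documentation of the file provided by the user. Do not repeat yourself. Respond in Markdown"
    },
    {
      "role": "user",
      "content": "<contents of src/arch.js>"
    }
  ],
  "options": {
    "repeat_penalty": 1.5,
    "temperature": 0.2,
    "num_predict": 2048
  },
  "stream": true
}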

The output of the API is piped through several commands that turn the raw server-sent-event stream into plain text (a sample transformation is shown after this list):

  • cut -c 7- removes the first 6 characters of each line, i.e. the "data: " prefix that server-sent events put in front of every JSON chunk.
  • sed 's/\[DONE\]//' removes the literal [DONE] marker that terminates the stream.
  • jq --stream -r -j 'fromstream(1|truncate_stream(inputs))[0].delta.content' reassembles each streamed JSON chunk and prints the first choice's delta.content as raw text (-r) with no trailing newlines (-j), so the generated Markdown appears as one continuous stream.
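For example, a single line of the raw response looks like this (the shape follows OpenAI's streaming format; the id and text are made up for illustration):

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","model":"llama3-gradient","choices":[{"index":0,"delta":{"content":"# Arch"},"finish_reason":null}]}

After cut -c 7- strips the "data: " prefix, jq pulls out the delta content, so this line contributes just the text:

# Arch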

Usage

To use this command, replace @src/arch.js with the path to the file for which you want to generate documentation, and adjust the options object to customize the behavior of the API. Note that HTTPie's --stream flag is what makes HTTPie print the response as it arrives instead of buffering it until the request completes.

http --stream localhost:11434/v1/chat/completions \
  model=llama3-gradient \
  messages[0][role]=system \
  messages[0][content]="Create a high level documentation of the file provided by the user. Do not repeat yourself. Respond in Markdown" \
  messages[1][role]=user \
  messages[1][content]=@src/arch.js \
  options[repeat_penalty]:=1.5 \
  options[temperature]:=0.2 \
  options[num_predict]:=2048 \
  stream:=true | \
  cut -c 7- | \
  sed 's/\[DONE\]//' | \
  jq --stream -r -j 'fromstream(1|truncate_stream(inputs))[0].delta.content'
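
To keep a copy of the generated documentation while still watching it stream, you can (for example) pipe the final jq stage through tee; the output filename here is just an illustration:

  jq --stream -r -j 'fromstream(1|truncate_stream(inputs))[0].delta.content' | \
  tee arch-docs.md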