@dougbtv
Last active May 9, 2025 14:02
Quick Python script to try structured output (guided decoding) with vLLM
from openai import OpenAI
import argparse

# Point the OpenAI client at a locally running vLLM server.
# vLLM does not validate the API key by default, so any value works.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="-",
)

parser = argparse.ArgumentParser(description="Query local vLLM")
parser.add_argument("prompt", help="Prompt to send to the model")
args = parser.parse_args()

# Minimal Jinja chat template, useful for base models (like
# Mistral-7B-v0.1) that don't ship with one of their own.
chat_template = (
    "{% for message in messages %}"
    "{% if message['role'] == 'system' %}"
    "<|system|>{{ message['content'] }}\n"
    "{% elif message['role'] == 'user' %}"
    "<|user|>{{ message['content'] }}\n"
    "{% elif message['role'] == 'assistant' %}"
    "<|assistant|>{{ message['content'] }}\n"
    "{% endif %}"
    "{% endfor %}"
    "<|assistant|>"
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-v0.1",
    messages=[
        {
            "role": "user",
            "content": "Classify if this is rad or bogus as if you're "
            "some ski bro from Vermont, dude: " + args.prompt,
        }
    ],
    # vLLM-specific extensions: supply the chat template and constrain
    # the output to exactly one of the listed strings (guided decoding).
    extra_body={
        "chat_template": chat_template,
        "guided_choice": ["rad", "bogus"],
    },
)

print(completion.choices[0].message.content)
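The script constrains output with `guided_choice`, which limits the reply to one of two bare strings. vLLM's OpenAI-compatible server also accepts a `guided_json` extra-body parameter that constrains output to a JSON schema instead. The sketch below shows what that schema and the client-side handling could look like; the field names (`verdict`, `confidence`) and the sample reply are hypothetical, invented for illustration, not produced by a real model call.

```python
import json

# Hypothetical JSON schema you could pass as "guided_json" in
# extra_body (in place of "guided_choice") to get structured output.
verdict_schema = {
    "type": "object",
    "properties": {
        "verdict": {"type": "string", "enum": ["rad", "bogus"]},
        "confidence": {"type": "number"},
    },
    "required": ["verdict", "confidence"],
}

# With guided decoding the server only emits tokens that keep the
# output valid against the schema, so json.loads() on the message
# content is safe. A constrained reply might look like this:
reply = '{"verdict": "rad", "confidence": 0.87}'
parsed = json.loads(reply)

# The enum constraint guarantees the verdict is one of the two labels.
assert parsed["verdict"] in verdict_schema["properties"]["verdict"]["enum"]
print(parsed["verdict"])
```

To use it for real, replace `"guided_choice": ["rad", "bogus"]` in the script above with `"guided_json": verdict_schema` and parse `completion.choices[0].message.content` with `json.loads`.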