@synthbot-anon
Last active December 24, 2023 23:15

Installation

pip install --upgrade git+https://github.com/ThatOneDevGuy/guidance_instructor

Overview

This lets you reliably get structured outputs from open-source LLMs using guidance + pydantic.

  • guidance is a library for restricting LLM outputs based on regular expressions and context-free grammars.
  • pydantic is an OpenAPI-friendly library for defining data classes with strong typing and validation.
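
To make the idea of restricting outputs to a pattern more concrete, here is a minimal sketch that uses only Python's standard `re` module, not guidance itself. It turns the values of an enum into a regular expression and checks a few candidate strings against it. The `enum_to_pattern` helper and the candidate strings are hypothetical, and this is not necessarily how guidance or guidance_instructor build their constraints. guidance applies such constraints while the model is generating, whereas this sketch only filters finished strings.

```python
import re
from enum import Enum


class FruitEnum(str, Enum):
    pear = "pear"
    banana = "banana"
    apple = "apple"


def enum_to_pattern(enum_cls):
    """Build a regex alternation that matches exactly one enum value.

    Hypothetical helper for illustration only.
    """
    return "|".join(re.escape(member.value) for member in enum_cls)


pattern = enum_to_pattern(FruitEnum)

# Hypothetical candidate outputs; only exact enum values pass the full match.
candidates = ["apple", "apples", "Banana", "pear"]
accepted = [text for text in candidates if re.fullmatch(pattern, text)]

print(pattern)   # pear|banana|apple
print(accepted)  # ['apple', 'pear']
```

A constrained generator goes one step further than this filter: it never produces a string like "apples" in the first place, which is what makes the structured output reliable.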

Example

from enum import Enum
from typing import Optional
from typing_extensions import Annotated

import guidance
from guidance_instructor import generate_object
from pydantic import BaseModel

# Load a chat-finetuned mistral model
model = guidance.models.LlamaCppChat("/bulk/models/openhermes-2.5-mistral-7b.Q5_K_M.gguf")

# Create a sample pydantic class. This nests a FruitEnum inside a SimpleClass.
class FruitEnum(str, Enum):
    pear = "pear"
    banana = "banana"
    apple = "apple"

class SimpleClass(BaseModel):
    name: Annotated[str, "Put the pony's name here."]
    species: Annotated[str, "Put the pony's species here."]
    favorite_fruit: Optional[FruitEnum]

# Send the 'user' message to the LLM. In guidance, messages and generations get appended
# to some pseudo-string that begins with the underlying model. The `with guidance.user()`
# notation is how guidance abstracts different roles that the LLM recognizes.
with guidance.user():
    lm = model + "Extract the following into an object: Applejack is a cute earth pony mare with three apples for a cutie mark, which happen to be her favorite fruit."

# Read the 'assistant' message from the LLM
with guidance.assistant():
    # This returns an "lm" object, which can be continued for further generations, plus an
    # "applejack" object, which contains a SimpleClass representation of Applejack's information as
    # described in the user instruction above.
    lm, applejack = generate_object(lm, SimpleClass)

print(applejack)
# prints: {'name': 'Applejack', 'species': 'earth pony', 'favorite_fruit': 'apple'}
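
The printed result shows favorite_fruit as the plain string 'apple'. The sketch below, which uses only the standard library, shows one way to get back to the typed enum and why declaring FruitEnum with a `str` mixin keeps that convenient. The `result` dict is a hypothetical copy of the printed output above; the exact type returned by generate_object is not assumed here.

```python
import json
from enum import Enum


class FruitEnum(str, Enum):
    pear = "pear"
    banana = "banana"
    apple = "apple"


# Hypothetical result, copied from the printed output in the example.
result = {"name": "Applejack", "species": "earth pony", "favorite_fruit": "apple"}

# Looking a member up by value recovers the typed enum from the plain string.
fruit = FruitEnum(result["favorite_fruit"])

# Because of the str mixin, the member compares equal to its string value.
is_equal = fruit == "apple"

# For the same reason, it serializes to JSON as an ordinary string.
encoded = json.dumps({"favorite_fruit": fruit})

print(fruit is FruitEnum.apple)  # True
print(is_equal)                  # True
print(encoded)                   # {"favorite_fruit": "apple"}
```

An unknown value such as FruitEnum("mango") raises ValueError, so the lookup also acts as a cheap check that the string is one of the allowed fruits.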