

@Straafe
Created November 14, 2023 21:15
Claude 2's attempt to create basic documentation for ooba's OpenAI-based API scripts

Here is some documentation for the OpenAI-compatible API endpoints:

Completions

/v1/completions

Generates text completions for the provided prompt.

Parameters:

  • prompt (required): The prompt to generate completions for, as a string or list of strings.

  • model: Unused parameter. To change the model, use the /v1/internal/model/load endpoint.

  • stream: If true, will stream back partial responses as text is generated.

  • max_tokens: The maximum number of tokens to generate.

  • temperature: Sampling temperature, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

  • top_p: Nucleus sampling, an alternative to temperature sampling: only the tokens making up the top top_p probability mass are considered.

  • echo: If true, the prompt will be included in the completion.

  • stop: Up to 4 sequences where generation will stop if any are matched.

See GenerationOptions in typing.py for other generation parameters.

Returns:

  • id: ID of the completion.

  • choices: List containing the generated completions.

  • usage: Number of prompt tokens, completion tokens, and total tokens used.
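
As a minimal sketch of calling this endpoint with Python's requests library, assuming the API server is listening at http://127.0.0.1:5000 (adjust the address to your setup):

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed local address of the API server

response = requests.post(
    f"{BASE_URL}/v1/completions",
    json={
        "prompt": "Once upon a time",
        "max_tokens": 64,
        "temperature": 0.7,
        "stop": ["\n\n"],
    },
)
result = response.json()
print(result["choices"][0]["text"])  # the generated continuation
print(result["usage"])               # prompt/completion/total token counts
```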

/v1/chat/completions

Generates chat message completions based on a provided chat history.

Parameters:

  • messages (required): Chat history as a list of messages with role (user, assistant) and content.

  • model: Unused parameter. To change the model, use the /v1/internal/model/load endpoint.

  • stream: If true, will stream back partial responses as text is generated.

  • mode: One of instruct, chat, or chat-instruct. Controls whether the reply follows the instruction template, stays in character, or both.

  • instruction_template: Name of the instruction template file to use.

  • character: Name of the character file to use for the assistant.

See ChatCompletionRequest in typing.py for other parameters.

Returns:

Same structure as /v1/completions (id, choices, usage), except each choice contains a message with role and content rather than plain text.
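
A rough sketch of a chat request, again assuming a local server at http://127.0.0.1:5000:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

history = [
    {"role": "user", "content": "Explain nucleus sampling in one sentence."},
]

response = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={"messages": history, "mode": "instruct", "max_tokens": 128},
)
reply = response.json()["choices"][0]["message"]["content"]
print(reply)
```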

Models

/v1/models

Lists the currently available models.

/v1/models/{model_id}

Gets information about the specified model.
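
A quick sketch of both model endpoints (the model ID shown is a placeholder):

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

# List all available models
print(requests.get(f"{BASE_URL}/v1/models").json())

# Get information about one model by ID (ID below is hypothetical)
print(requests.get(f"{BASE_URL}/v1/models/my-model").json())
```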

Billing

/v1/billing/usage

Gets usage statistics for billing purposes.

Parameters:

  • start_date: Start date for usage stats, in YYYY-MM-DD format.

  • end_date: End date for usage stats, in YYYY-MM-DD format.

Returns:

  • total_usage: Total token usage during the specified period.
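
A minimal sketch, assuming the dates are passed as query parameters:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

usage = requests.get(
    f"{BASE_URL}/v1/billing/usage",
    params={"start_date": "2023-11-01", "end_date": "2023-11-14"},
).json()
print(usage["total_usage"])
```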

Audio

/v1/audio/transcriptions

Transcribes an audio file using Whisper.

Parameters:

  • file (required): The audio file to transcribe.

  • language: Language spoken in the audio.

  • model: Whisper model to use, tiny or base.

Returns:

  • text: Transcription text.
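
A sketch of a transcription request, assuming the audio is uploaded as multipart form data (the file name is hypothetical):

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

with open("sample.wav", "rb") as audio:  # placeholder file
    response = requests.post(
        f"{BASE_URL}/v1/audio/transcriptions",
        files={"file": audio},
        data={"language": "en", "model": "base"},
    )
print(response.json()["text"])
```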

Images

/v1/images/generations

Generates images using Stable Diffusion.

Parameters:

  • prompt (required): The text prompt to generate images for.

  • size: Size of images to generate, like 512x512.

  • n: Number of images to generate.

Returns:

  • data: List of generated images.
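
A sketch of an image generation request; the exact contents of each entry in data (URL vs. base64 payload) depend on the image backend, so this only checks how many images came back:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

response = requests.post(
    f"{BASE_URL}/v1/images/generations",
    json={"prompt": "a watercolor lighthouse at dusk", "size": "512x512", "n": 1},
)
images = response.json()["data"]
print(f"received {len(images)} image(s)")
```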

Embeddings

/v1/embeddings

Gets sentence embeddings for the provided input text.

Parameters:

  • input (required): Input text to get embeddings for, as string or list of strings.

  • encoding_format: float or base64.

Returns:

  • object: list

  • data: List of embeddings, one for each input.
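
A minimal sketch, assuming each entry in data carries its vector under an embedding key (as in the OpenAI response format):

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

response = requests.post(
    f"{BASE_URL}/v1/embeddings",
    json={"input": ["first sentence", "second sentence"]},
)
vectors = [item["embedding"] for item in response.json()["data"]]
print(len(vectors), "embeddings of dimension", len(vectors[0]))
```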

Moderations

/v1/moderations

Checks input text for harmful content.

Parameters:

  • input (required): Input text to moderate.

Returns:

  • results: List of moderation results, one for each input text.
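
A quick sketch of a moderation request:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

result = requests.post(
    f"{BASE_URL}/v1/moderations",
    json={"input": "Some text to check for harmful content."},
).json()
print(result["results"])
```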

Encoding/Decoding

/v1/internal/encode

Encodes text into tokens.

Parameters:

  • text (required): Text to encode.

/v1/internal/decode

Decodes tokens into text.

Parameters:

  • tokens (required): Tokens to decode.

/v1/internal/token-count

Gets the number of tokens for text.

Parameters:

  • text (required): Text to get token count for.
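
A sketch covering all three token endpoints; the response field names are not documented here and are assumed to mirror the request schema, so check typing.py for the exact shapes:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

text = "Hello world"

encoded = requests.post(f"{BASE_URL}/v1/internal/encode", json={"text": text}).json()
print(encoded)  # expected to contain the token IDs

decoded = requests.post(
    f"{BASE_URL}/v1/internal/decode", json={"tokens": encoded.get("tokens", [])}
).json()
print(decoded)  # expected to contain the reconstructed text

count = requests.post(f"{BASE_URL}/v1/internal/token-count", json={"text": text}).json()
print(count)    # expected to contain the token count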

Internal

/v1/internal/model/info

Gets information about the currently loaded model.

/v1/internal/model/load

Loads a new model. Can be used to switch models on the fly.

Parameters:

  • model_name (required): Name of model to load.

  • args: Dict of args to pass to model loading.

  • settings: Dict of settings to update after loading.

/v1/internal/stop-generation

Stops any ongoing text generation.
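
A sketch of switching models and then checking what is loaded; the model name, args, and settings are placeholders:

```python
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

# Load a different model on the fly (name and options are illustrative only)
requests.post(
    f"{BASE_URL}/v1/internal/model/load",
    json={
        "model_name": "my-model",
        "args": {},      # loader arguments, if any
        "settings": {},  # settings to update after loading, if any
    },
)

# Confirm which model is now loaded
print(requests.get(f"{BASE_URL}/v1/internal/model/info").json())

# Interrupt an in-progress generation if needed
requests.post(f"{BASE_URL}/v1/internal/stop-generation")
```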

In summary, the main endpoints are /v1/completions and /v1/chat/completions for text and chat generation, and /v1/models for listing models. The /v1/internal endpoints can be used to switch models, encode/decode tokens, and more.

Straafe commented Nov 14, 2023

Tried again, this time feeding it the entire repo, and it came up with these:

Here are the API endpoints I identified in the code along with their associated parameters and python file/function references:

POST /v1/completions

Parameters:

  • prompt (str|List[str]) - Prompt text for text completion. See completions_common() in extensions/openai/completions.py.

  • stream (bool) - Whether to stream back partial responses. See completions_common() in extensions/openai/completions.py.

POST /v1/chat/completions

Parameters:

  • messages (List[dict]) - Chat history. See chat_completions_common() in extensions/openai/completions.py.

  • stream (bool) - Whether to stream back partial responses. See chat_completions_common() in extensions/openai/completions.py.
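
Since both completion endpoints accept stream, here is a sketch of consuming the streamed response, assuming it follows the OpenAI-style server-sent events format (data: lines with JSON chunks and a [DONE] sentinel):

```python
import json
import requests

BASE_URL = "http://127.0.0.1:5000"  # assumed

with requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={"messages": [{"role": "user", "content": "Hello!"}], "stream": True},
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload.strip() == b"[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
print()
```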

POST /v1/images/generations

Parameters:

  • prompt (str) - Prompt text for image generation. See generations() in extensions/openai/images.py.

POST /v1/embeddings

Parameters:

  • input (List[str]) - Text to encode. See embeddings() in extensions/openai/embeddings.py.

POST /v1/moderations

Parameters:

  • input (str|List[str]) - Text to moderate. See moderations() in extensions/openai/moderations.py.

GET /v1/models

Lists models. See list_models() in extensions/openai/models.py.

GET /v1/models/{model}

Gets model info. See model_info_dict() in extensions/openai/models.py.

POST /v1/internal/model/load

Loads a model. See _load_model() in extensions/openai/models.py.

POST /v1/internal/encode

Encodes text. See token_encode() in extensions/openai/tokens.py.

POST /v1/internal/decode

Decodes tokens. See token_decode() in extensions/openai/tokens.py.

POST /v1/internal/token-count

Gets token count for text. See token_count() in extensions/openai/tokens.py.

GET /v1/internal/model/info

Gets current model info. See get_current_model_info() in extensions/openai/models.py.

Let me know if you need any clarification or have additional questions!
