@AIAnytime
Created October 30, 2023 08:09
Notes on top-p, top-k, and temperature sampling
Nucleus sampling is a technique used in large language models to control the randomness and diversity of generated text. Instead of sampling from the full predicted distribution, it samples only from the smallest set of most likely tokens whose cumulative probability exceeds a threshold p.
The key parameters are:
Temperature: Controls randomness; higher values flatten the next-token distribution and increase diversity.
Top-p (nucleus): The cumulative probability cutoff for token selection. Lower values mean sampling from a smaller, more top-weighted nucleus.
Top-k: Sample from the k most likely next tokens at each step. Lower k focuses on higher probability tokens.
In general:
Higher temperature will make outputs more random and diverse.
Lower top-p values reduce diversity and focus on more probable tokens.
Lower top-k also concentrates sampling on the highest probability tokens for each step.
So temperature increases variety, while top-p and top-k reduce it and focus sampling on the model's top predictions. Tuning these parameters is a balance between diversity and relevance for each application; the sketch below shows how the three interact.
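Here is a minimal sketch of how the three filters compose, assuming a NumPy array of raw logits. The function name and signature are illustrative, not from any particular library:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    rng = rng or np.random.default_rng()

    # Temperature: scale logits before softmax; <1.0 sharpens the
    # distribution, >1.0 flattens it.
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()

    # Sort token ids from most to least likely.
    order = np.argsort(probs)[::-1]

    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]

    # Top-p (nucleus): keep the smallest prefix of the sorted tokens
    # whose cumulative probability reaches p.
    cumulative = np.cumsum(probs[order])
    order = order[: int(np.searchsorted(cumulative, top_p)) + 1]

    # Renormalize over the surviving nucleus and sample from it.
    nucleus = probs[order] / probs[order].sum()
    return int(rng.choice(order, p=nucleus))
```

For example, sample_next_token(np.array([2.0, 1.0, 0.1]), temperature=0.7, top_k=40, top_p=0.9) sharpens the distribution, then trims it to the top 40 tokens, then to the 90% nucleus before sampling.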
OpenAI recommends altering either temperature or top-p from the default, but not both. Top-k is not exposed in the OpenAI API.
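A hedged example with the openai Python package as it looked around the time of this note (pre-1.0 SDK); the model choice and prompt are placeholders, and only one of the two parameters is changed, per the recommendation above:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Explain nucleus sampling in one sentence."}
    ],
    temperature=0.4,  # tune this...
    # top_p=0.9,      # ...or this, but not both
)
print(response["choices"][0]["message"]["content"])
```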
Nucleus sampling parameters alone cannot stop a model from hallucinating, but they can keep the output on a low-perplexity path. When the temperature is set high, alternate token choices can be selected that are a poor fit:
Prompt: The cause of most astronaut deaths in one word?

Top next-token candidates (prefixes as displayed) and their probabilities:
Acc = 87.53%
M = 5.81%
Expl = 2.98%
F = 0.60%
Mis = 0.34%
You can see that one misstep can send the conversation on a whole new course.
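With these numbers, a nucleus cutoff of top-p = 0.9 would keep only the two most likely candidates, discarding the long tail that a high temperature could otherwise reach into. A quick sketch of the cutoff, using the truncated token labels shown above:

```python
# Worked top-p example over the distribution above.
probs = {"Acc": 0.8753, "M": 0.0581, "Expl": 0.0298, "F": 0.0060, "Mis": 0.0034}

top_p, cumulative, nucleus = 0.9, 0.0, []
for token, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    nucleus.append(token)
    cumulative += p
    if cumulative >= top_p:  # smallest prefix reaching the cutoff
        break

print(nucleus)  # ['Acc', 'M'] -> cumulative 0.9334 >= 0.9
```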
Try a temperature of 0.4 unless you really want surprising writing; anyone chatting about computer code will appreciate the predictability.