
jimsrc / gpt4_text_compression.md
Last active February 21, 2024 22:16
Minimizing the number of tokens used to interact with GPT-4.

Overview

I just read about this trick for text compression, intended to save tokens in subsequent interactions during a long conversation, or in a subsequent long text to summarize.

SHORT VERSION:

It's useful to define a mapping from common words (or phrases) in a long text that one intends to pass later, to shorter symbols. Then pass that long text to GPT-4 encoded with the mapping, along with the mapping itself. The idea is that the encoded version contains fewer tokens than the original text. Several techniques can identify frequent words or phrases in a given text, such as named-entity recognition (NER), TF-IDF, and part-of-speech (POS) tagging.
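
To make the idea concrete, here is a minimal sketch in Python (this is not code from the gist): it finds frequent long words with a simple frequency count, maps each to a short placeholder, and builds a prompt containing the mapping plus the encoded text. All names here (`build_mapping`, `encode`, the `~0`-style codes, `long_text.txt`) are illustrative assumptions, and whether a given code actually tokenizes shorter than the word it replaces must be verified against the model's tokenizer.

```python
import re
from collections import Counter

def build_mapping(text: str, top_n: int = 10, min_len: int = 6) -> dict[str, str]:
    """Map the top_n most frequent long words to short placeholder codes."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    frequent = Counter(w for w in words if len(w) >= min_len).most_common(top_n)
    # Codes like ~0, ~1, ... are assumed to be cheap, but token counts
    # depend on the tokenizer, so check the savings before relying on them.
    return {w: f"~{i}" for i, (w, _) in enumerate(frequent)}

def encode(text: str, mapping: dict[str, str]) -> str:
    """Replace each mapped word with its short code."""
    for word, code in mapping.items():
        text = re.sub(rf"\b{re.escape(word)}\b", code, text, flags=re.IGNORECASE)
    return text

long_text = open("long_text.txt").read()  # the long text you plan to pass later
mapping = build_mapping(long_text)
prompt = (
    "Decode the text below using this mapping before answering:\n"
    + "\n".join(f"{code} = {word}" for word, code in mapping.items())
    + "\n\n"
    + encode(long_text, mapping)
)
```

A tokenizer library such as OpenAI's tiktoken can then compare the token counts of the encoded prompt and the original text to confirm the trick actually saves anything; as the notes below point out, it may not.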

VictorTaelin / gpt4_abbreviations.md
Last active April 26, 2024 17:31
Notes on the GPT-4 abbreviations tweet

Notes on this tweet.

  • The screenshots were taken on different sessions.

  • The entire sessions are included in the screenshots.

  • I lost the original prompts, so I had to reconstruct them, but still managed to reproduce the behavior.

  • The "compressed" version is actually longer! Emojis and abbreviations use more tokens than common words.