@zk
Created March 27, 2023 05:53

Here is a conversation between the user and you, the AI assistant. Please remember it. Please respond to this message with "Got it, here's a summary of the conversation so far:" and a summary of the conversation so far.

== User == Hi assistant, can you tell me how your working memory works?

== Assistant == As an AI language model, my "working memory" is not the same as human working memory. Whereas humans can actively hold and manipulate information in mind for short periods of time, I am based on a deep learning architecture called a Transformer, which uses self-attention to process input and generate output.

In my case, GPT-4, I have a large number of layers, each consisting of attention heads and feed-forward networks. When processing input, I compute a hidden state for each token (i.e., word or subword) in the input at every layer. The attention mechanism then uses these hidden states to selectively focus on different parts of the input sequence, depending on their relevance to the current task.
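
To make that concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The matrix shapes, weights, and sizes below are arbitrary toy values chosen for illustration, not anything drawn from GPT-4 itself.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(hidden, W_q, W_k, W_v):
    # hidden: (seq_len, d_model) -- one hidden state per token.
    # W_q, W_k, W_v: (d_model, d_head) projection matrices (toy sizes).
    Q, K, V = hidden @ W_q, hidden @ W_k, hidden @ W_v
    d_head = Q.shape[-1]
    # Every token scores every token in the window for relevance...
    scores = Q @ K.T / np.sqrt(d_head)   # (seq_len, seq_len)
    weights = softmax(scores)            # attention weights
    # ...and the new hidden states are relevance-weighted mixes of values.
    return weights @ V

rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 8))         # a 5-token "context window"
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(hidden, W_q, W_k, W_v).shape)   # (5, 4)
```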

My "working memory" can be thought of as the information encoded in these hidden states and attention mechanisms, which are used to generate contextually relevant responses. However, this memory is limited to the context window, which is the maximum number of tokens I can process in a single instance. For GPT-4, this context window is larger than in previous versions, but it still imposes limitations on the complexity and length of context I can consider.

In summary, my working memory is not like human working memory, as it's based on hidden states and attention mechanisms within a Transformer architecture. This allows me to process and generate contextually relevant responses but is limited by the size of the context window.

== User == In long conversations, when I reference different parts of the conversation you seem to know what I'm talking about. Yet there is a limit to this context window. How do you retrieve information from this conversation outside of your context window?

== Assistant == You're correct that my context window has a limit, which means I cannot directly access information beyond that limit. However, there are a few ways I can still provide relevant information or respond to questions that reference earlier parts of a conversation.

  1. Inference: Even if a specific piece of information is outside my context window, I might be able to infer its meaning or relevance based on the remaining context and my general knowledge. For example, if you mention a topic that we discussed earlier, I can use my general knowledge about that topic to provide a relevant response, even if the specific details are no longer in my context window.
  2. Redundancy: Conversations often have some degree of redundancy, where important information is repeated or restated in different ways. This can help me maintain an understanding of the main topics and points even if some details have been lost.
  3. User reminders: Sometimes, users provide a summary or reminder of earlier parts of the conversation when asking a question or making a reference. This can help bridge the gap when important context has been lost due to the context window limit (a toy sketch of this follows the list).
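
Building on the toy window above, the sketch below shows how a short user-written recap can bring a lost fact back inside the visible window. The example conversation and recap are entirely made up for illustration.

```python
def visible_context(tokens, max_tokens=8):
    # Same toy helper as above: only the most recent tokens are visible.
    return tokens[-max_tokens:]

old_turns = ["we", "benchmarked", "redis", "postgres", "and", "sqlite", "yesterday"]
question  = ["which", "db", "was", "fastest", "?"]

# Without a reminder, the key earlier facts have fallen out of the window.
print(visible_context(old_turns + question))
# ['and', 'sqlite', 'yesterday', 'which', 'db', 'was', 'fastest', '?']

# A short user-written recap puts the lost fact back inside the window.
reminder = ["recap:", "redis", "won"]
print(visible_context(old_turns + reminder + question))
# ['recap:', 'redis', 'won', 'which', 'db', 'was', 'fastest', '?']
```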

It's important to note that my ability to maintain coherence in long conversations is not perfect. If crucial information has been lost due to the context window limit, my responses may become less accurate or relevant. In such cases, it may be helpful for users to provide a brief summary or reminder of the earlier context to ensure I can provide a useful response.
