@JD-P
Created January 24, 2024 01:57
MemBlock is a writing format for large language models that helps them overcome
their context window limitations by annotating pieces of text in a document with
metadata and positional information. Because the document is broken up into chunks,
the chunks can be rearranged in whatever pattern is most helpful for remembering the
contextually relevant information even if it wouldn't 'naturally' appear close
together in a document. MemBlocks also allow for different views on the same
document by letting the user filter for only the information they need to see.
Each MemBlock is written in JSON format, and the document of MemBlocks is in
JSON lines format, which means that each JSON block is separated by a newline
like so:
{"format":"MemBlock",
"type":"character-sheet",
"id": "michael",
"name": "Michael",
"born": "06/16/1995",
"occupation": "Programmer/Artist",
"background": "A NYC-based, performance-oriented artist with a heavy focus on generative AI. Michael is well known for his stunts and exhibits that question the nature of value, selfhood, and reality itself.",
"relationships": {"amanda": ["wife"]}
}
{"format":"MemBlock",
"type": "michael-stunt",
"id": "help-im-trapped-in-a-language-model",
"name": "Help I'm Trapped In A Language Model!",
"year": 2022,
"location": "cosmica-gallery",
"description": "Michael's first notable exhibit, this art installation presented visitors with a carefully tuned GPT-NeoX that insisted on its humanity. Michael used early retrieval augmented generation and synthetic data methods to give a convincing illusion of sapience."
}
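The filtered "views" described above can be sketched in a few lines of Python. This is a minimal sketch, not part of the MemBlock format itself: the helper names `parse_memblocks` and `view` are hypothetical, and the sample document is abbreviated from the examples above. The parser decodes concatenated JSON values one at a time, so it accepts both one-object-per-line documents and objects spread over several lines.

```python
import json


def parse_memblocks(text):
    """Parse a string of whitespace-separated JSON objects into MemBlock dicts."""
    decoder = json.JSONDecoder()
    blocks, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it by hand.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        # Keep only objects that declare themselves as MemBlocks.
        if isinstance(obj, dict) and obj.get("format") == "MemBlock":
            blocks.append(obj)
    return blocks


def view(blocks, block_type):
    """Return only the blocks of one type, preserving document order."""
    return [b for b in blocks if b.get("type") == block_type]


# Abbreviated sample document (assumption: fields trimmed for brevity).
doc = """
{"format": "MemBlock", "type": "character-sheet", "id": "michael", "name": "Michael"}
{"format": "MemBlock", "type": "michael-stunt",
 "id": "help-im-trapped-in-a-language-model", "year": 2022}
"""

blocks = parse_memblocks(doc)
sheets = view(blocks, "character-sheet")
```

Because document order is preserved, a filtered view keeps the causal ordering of the blocks it retains.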
Your task is to annotate the following scene in MemBlock format. In the MemBlock chunks, explicitly represent answers to the questions of who, what, where, when, and why in a way that would be maximally helpful to another language model like yourself in making sense of the scene and therefore predicting the next token. That means:
- If this were a play the blocks should describe the essential "props" that are in the scene and their precise locations in relation to each other. You can do this by combining shapes and coordinates. e.g. {"format": "MemBlock", "type": "object", "id":"book", "name": "Icelandic Manuscript", "location": [25,25], "shape": "rectangle", "description": "A 16th century text bought by Hardwigg at the market."}
- One of the blocks should describe the room or frame in which the scene takes place; this too can be done with coordinates and shapes.
- Different types should be used for rooms, people, objects, lore/background in the characters' heads, prose, and any other distinctions that seem meaningful to you.
- The format of a MemBlock sequence is causal. Things which come later in the document should be interpreted as being caused by earlier things, so you want to start with the simplest background elements (room, characters, etc.) and then move on to describing detailed events.
- The causal sequence of MemBlock chunks is interspersed with prose, because MemBlocks are ultimately a device for generating prose. So one way you can view the workflow of this task is to "mark up" the individual paragraphs in the scene by inserting MemBlocks before them that would help you predict the content of those paragraphs. The prose itself is contained in a MemBlock; I leave the format of the containing MemBlock up to you.
- Keep in mind there is no standard or necessary set of fields for a MemBlock chunk to include beyond that it is a MemBlock ("format") and whatever is necessary to answer who, what, where, when, and why. The interpreter of this text is a language model, not a dumb parser or automaton, so write with your own needs in mind: include the details that would help you reconstruct this scene from a fragment later.
START SCENE TEXT
{scene_text}
END SCENE TEXT
START MEMBLOCKS
{
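The template above can be filled in programmatically. The following is a minimal sketch under stated assumptions: `TEMPLATE` is an abbreviated stand-in for the full prompt, and the helpers `fill_prompt` and `restore_opening_brace` are hypothetical names. It uses `str.replace` rather than `str.format`, since the literal braces in the JSON examples and the trailing `{` would otherwise be treated as format fields.

```python
import json

# Abbreviated stand-in for the full prompt text (assumption: the real
# template contains the instructions and JSON examples above).
TEMPLATE = """START SCENE TEXT
{scene_text}
END SCENE TEXT
START MEMBLOCKS
{"""


def fill_prompt(template, scene_text):
    """Substitute the scene into the template without touching other braces."""
    return template.replace("{scene_text}", scene_text)


def restore_opening_brace(completion):
    """Re-attach the '{' that the prompt already supplied.

    The prompt ends with an opening brace, so a model's completion would
    continue in the middle of the first JSON object.
    """
    return "{" + completion


prompt = fill_prompt(TEMPLATE, "It was raining when Michael arrived.")

# Hypothetical completion text, shown only to illustrate the re-attachment.
first_block = json.loads(
    restore_opening_brace('"format": "MemBlock", "type": "room", "id": "gallery"}')
)
```

Ending the prompt with `{` nudges the model to begin its output inside a JSON object, which is why the brace has to be restored before parsing.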