A description of the new Client Cache for server developers

Client Blob Cache

What's the Client Blob Cache?

The Client Blob Cache is a new Bedrock optimization: it allows blocks and biomes to be cached on Clients to avoid resending identical chunks over and over. Chunks and biomes make up the vast majority of network traffic in many common cases (e.g. login, teleport, or dimension switches), but at the same time they rarely change. Allowing the Client to reuse chunks it has seen in the past can save a ton of traffic and latency!

The Client Cache is a Content Addressed Storage (a bit like git) that stores Blobs and retrieves them by their full hashes (BlobIds). This means that the cache doesn't actually know about Chunks - in the future, we might start using it for more types of data, like skins or data-driven entities.

A nice thing we get from the CAS approach is that the cache is persistent: returning players will be able to reuse content that was sent to them in previous sessions, or even in sessions on different servers, as long as that content matches exactly what the server is attempting to send right now.
The Client enforces the correctness of all BlobIds by verifying that its independently computed hash matches what the server is sending, so 3rd party servers cannot corrupt your content.
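
To make the CAS idea concrete, here's a minimal sketch of such a store in Python; the class and the use of the third-party xxhash package are illustrative, not part of the protocol:

import xxhash

class BlobCache:
    """Toy content-addressed store: blobs are keyed by their XXHash64."""

    def __init__(self):
        self.blobs = {}  # blobId (int) -> blob (bytes)

    def put(self, blob):
        # the id of a blob is always just the hash of its content (seed 0)
        blobId = xxhash.xxh64(blob, seed=0).intdigest()
        self.blobs[blobId] = blob
        return blobId

    def put_verified(self, blobId, blob):
        # what the Client does with server-sent blobs: recompute the hash
        # and refuse the blob if it doesn't match the claimed BlobId
        if xxhash.xxh64(blob, seed=0).intdigest() != blobId:
            raise ValueError("blob content doesn't match its BlobId")
        self.blobs[blobId] = blob

    def get(self, blobId):
        return self.blobs.get(blobId)  # None on a cache miss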

The protocol involves a bit of back and forth: when the Server tells the client to reuse a known BlobId, it starts a Cache Transaction. This means that the Server must keep the Blobs it referred to around, ready to send them to the client if a Cache Miss Request is received, as in the sketch below.
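
For example, a server might track a transaction with something like this (all names made up; the point is that referenced blobs must stay alive until the client has ACKed them or the misses have been answered):

class CacheTransaction:
    """Blobs referenced by one LevelChunkPacket, kept until resolved."""

    def __init__(self, usedBlobs):
        self.pending = dict(usedBlobs)  # blobId -> blob, not yet ACKed/MISSed

    def resolve(self, blobId):
        # called on an ACK, or on a MISS once the blob has been re-sent
        self.pending.pop(blobId, None)

    def isDone(self):
        # once empty, the transaction can be discarded and the
        # blob refcounts decremented
        return not self.pending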

Best practices to improve cache usage for Server Owners and Map Makers

The Client Cache always saves a little bit of bandwidth, but there are a few things you can do to maximize the savings by making cache-friendly worlds. The most important thing to keep in mind is that blocks are sent in grid-aligned 16x16x16 cubes (SubChunks), and that a cube isn't sent at all if the client has already seen an identical one. The best case for the cache is, for example, a world made of nothing but stone up to y=64 and just air above.
So a few guidelines are:

  • If your underground isn't playable, use /fill or an editor to make every part of the underground uniformly stone. Ores, Gravel, Caves and dirt/*ite formations cause underground chunks to be unique and unshareable.
  • If your map has obvious borders that the player can't see through, use an editor to remove all blocks past that border. Chunks made of just air don't require any sending at all. Or use an infinite sea, infinite grass plane, etc. Anything works as long as it's uniform.
  • If you don't care about biomes, pick one biome and use it across the entire world; this way, biomes don't have to be sent.
  • If you run several servers/minigames, try reusing parts of your world between them and make sure they're aligned in all the worlds. This way, if someone plays Minigame A, they already have a lot of that level when they join Minigame B.

Supporting the Cache Protocol

The protocol is actually kind of complicated once you get into the details, and it must be implemented with a lot of caution around race conditions, invalidating blobs too early or too late, using too much memory, throttling sends, etc.

Ok, can I just disable it for now?

Yes, the protocol lets the Server shut caching off entirely, so there's no extra work needed to upgrade to R12 until you're ready to support the full protocol.

Just set LevelChunkPacket's new cacheEnabled bool to false and the cache will be forced off from the server side, falling back to the old data format.
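
In pseudocode similar to the example later in this document, that means writing the packet like this (serializeChunkData standing in for however you already build the old payload):

stream.writeChunkPos(chunk.pos)
stream.writeUnsignedVarInt(len(chunk.subchunks))
stream.writeBool(False)  # cacheEnabled = false: no BlobIds follow
stream.writeBytes(serializeChunkData(chunk))  # the old full-data format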

Implementing the Cache Protocol

ClientCacheStatusPacket

It's not part of a transaction; the Client sends it once, at login, to communicate whether it supports the cache. The client cannot turn the cache on or off during the session or send this packet more than once. If a client declares it doesn't support the cache, sending BlobIds to it is an error. Platforms like Switch don't support the cache, so this case must be supported!
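
A server-side handler might look like this sketch (the handler and field names are illustrative):

def onClientCacheStatus(clientId, packet):
    # remember the client's choice for the whole session; it can't change later
    clients[clientId].supportsCache = packet.enabled

def shouldUseCache(clientId):
    # never send BlobIds to clients that didn't declare support (e.g. Switch)
    return clients[clientId].supportsCache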

For now, the game only supports using cached content in LevelChunkPacket.
Each LevelChunkPacket starts a Cache Transaction composed of the 3 packets below, always sent/received in this order.

LevelChunkPacket

It was previously FullChunkDataPacket. It got renamed because it doesn't necessarily contain any data anymore :)
It's used to start a Chunk Transaction and contains a few new fields:

  • bool cacheEnabled: this lets the server turn off the cache for this chunk even if the Client signaled it supports it. Set it to true to enable reusing the cache and the other new fields.
  • varint subChunkCount: how many SubChunks exist in this Chunk.
  • varint blobCount: how many BlobIds follow. Since the trailing biome blob is included (see below), this is subChunkCount + 1. Added for future changes.
  • uint64 blobId 1..n: after blobCount, there are n 64-bit numbers that represent the BlobIds. ID 0 is for SubChunk 0, ID 1 is for SubChunk 1, and so on. You have to fill these in by hashing the content of the blobs with XXHash64 with seed 0. No other hashing is supported and the client will cross-check and refuse blobs if their content doesn't match the XXHash64 hash.
    Note: when serializing a SubChunk into a blob, it must be serialized in its persistent (disk) format, e.g. its palette must be a list of valid Block NBTs, not RuntimeIds! This is very important because the cache is persistent and the Client directly caches the blobs sent by the server, so a blob can't contain RuntimeIds that change from session to session.
    The last ID in this sequence is the biome blob.
  • Border Blocks and Block Entities follow the ids in the old format (Biomes are covered by the biome blob).

So the python-ish pseudocode for building a LevelChunkPacket and storing the blobs could be:

usedBlobs = dict()  # blobId -> blob, kept alive for the transaction
blobIds = []        # ids in packet order: one per SubChunk, biomes last

for subchunk in chunk.subchunks:
    # serialize in the persistent (disk) format, never with RuntimeIds!
    blob = serialize(subchunk)
    blobId = XXHash64(blob)  # seed 0

    # add the blob to the current "transaction" set of required blobs;
    # identical subchunks collapse into a single entry here
    blobIds.append(blobId)
    usedBlobs[blobId] = blob

# do the same for biomes - the biome blob is always the last id
biomes = serialize(chunk.biomes)
biomesId = XXHash64(biomes)
blobIds.append(biomesId)
usedBlobs[biomesId] = biomes

# now write out the packet
stream.writeChunkPos(chunk.pos)                  # position
stream.writeUnsignedVarInt(len(chunk.subchunks)) # number of subchunks

stream.writeBool(True)  # enable the cache

stream.writeUnsignedVarInt(len(blobIds))  # blobCount
for blobId in blobIds:
    stream.writeUint64(blobId)

# add the old stuff too
writeBorderBlocks(stream, chunk)
writeBlockEntities(stream, chunk)

# Keep the transaction object alive to keep track of how many transactions are active
# and to be able to decide when to delete a blob because everyone is done with it
server.trackTransaction(clientId, usedBlobs)

After this is sent, the client will respond with a ClientCacheBlobStatusPacket.

ClientCacheBlobStatusPacket

ClientCacheBlobStatusPacket is sent periodically by the client to update the server on which blobs it was able to retrieve from the cache (ACK) and which blobs it is lacking (MISS). Note that for performance reasons this packet is not sent for each LevelChunkPacket - instead, the client batches the ACKs and MISSes into two big sets and sends them once in a while, e.g. each tick.

When the Server receives one of these, it should go through each blob in the MISS list, fetch it from its storage, add it to a ClientCacheMissResponsePacket, and send it over to the Client.

You probably also want to use the ACKs/MISSes in this packet to decrement the blob data refcounts to find out which blobs have been confirmed by everyone and don't need to be kept around anymore.

Each ClientCacheBlobStatusPacket can only contain up to 4095 Ids, so packets bigger than that can be rejected.
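
Putting it together, a handler could look like this sketch (the decoded missIds/ackIds fields and the storage helpers are made-up names):

MAX_IDS_PER_PACKET = 4095

def onClientCacheBlobStatus(clientId, packet):
    if len(packet.missIds) + len(packet.ackIds) > MAX_IDS_PER_PACKET:
        disconnect(clientId, "oversized ClientCacheBlobStatusPacket")
        return

    # answer the MISSes as soon as possible (see the next section)
    response = ClientCacheMissResponsePacket()
    for blobId in packet.missIds:
        response.addBlob(blobId, server.blobStorage.fetch(blobId))
    server.send(clientId, response)

    # both ACKed and answered blobs resolve their transactions; when a
    # blob's refcount hits zero it no longer needs to be kept around
    for blobId in packet.ackIds + packet.missIds:
        server.resolveBlob(clientId, blobId)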

ClientCacheMissResponsePacket

This packet is just a list of <blobId, blob> pairs. Any missing blob should just be thrown into one of these packets ASAP and sent.
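
Writing it out could look like the sketch below; the exact encoding of the blob payload (here assumed to be a length-prefixed byte array) should be checked against your protocol library:

stream.writeUnsignedVarInt(len(missingBlobs))
for blobId, blob in missingBlobs.items():
    stream.writeUint64(blobId)
    stream.writeUnsignedVarInt(len(blob))  # assumed length prefix
    stream.writeBytes(blob)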

Extra: How to throttle cache transactions

Throttling chunks is really important to keep latency under control for clients: if the server sends several MBs of blobs at once, they will hog the connection for several seconds until they all get through. High-priority packets like movement or block updates get queued behind all that, causing heavy lag on bad connections.

You should count the active transactions for each client, and only send new LevelChunkPackets if there aren't too many active transactions. In Vanilla, depending on the connection quality, we only allow between 1 and 8 concurrent transactions.
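
A sketch of that policy, with a made-up per-client transaction counter:

MAX_ACTIVE_TRANSACTIONS = 4  # Vanilla allows 1..8 depending on connection quality

def trySendChunk(clientId, chunk):
    client = clients[clientId]
    if client.activeTransactions >= MAX_ACTIVE_TRANSACTIONS:
        client.chunkQueue.append(chunk)  # retry when a transaction completes
        return
    client.activeTransactions += 1  # decrement when the transaction resolves
    sendLevelChunk(clientId, chunk)  # starts a new Cache Transaction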

Don't throttle ClientCacheMissResponsePacket - on the contrary, try to send it as soon as possible. Once a LevelChunkPacket is sent, the client needs the missing Blobs right away, so it's critical to keep the delay between the LevelChunkPacket and the ClientCacheMissResponsePacket containing the requested blobs to a minimum.

@HeroFluxx
Pretty cool stuff :)

@lukeeey commented Apr 19, 2019

What would be cooler is if Adler32 was removed :)

I hear it's been a few years now...

@geNAZt commented Apr 19, 2019

Little question: what seed do you use for xxHash?

@Tomcc (Author) commented Jul 10, 2019

@geNAZt the seed is 0!

@dktapps commented Jun 22, 2020

Note that the blockstate NBT must be varint NBT, NOT classic little-endian NBT.
