# Client Blob Cache

## What's the Client Blob Cache?
The Client Blob Cache is a new Bedrock optimization: it allows blocks and biomes to be cached on Clients to avoid resending identical chunks over and over. Chunks and biomes make up the vast majority of network traffic in many common cases (e.g. login, teleports or dimension switches), but at the same time they rarely change. Allowing the Client to reuse chunks it has seen in the past can save a ton of traffic and latency!
The Client Cache is a Content Addressed Storage (a bit like `git`) that stores `Blobs` and retrieves them based on their full hashes (`BlobIds`). This means that the cache doesn't actually know about Chunks: in the future, we might start using it for more types of data, like skins or data-driven entities.
A nice thing we get from the CAS approach is that the cache is persistent: returning players will be able to reuse content that was sent to them in previous sessions, or even in sessions on different servers, as long as that content matches exactly what the server is attempting to send right now.
The Client enforces the correctness of all `BlobIds` by verifying that its independently computed hash matches what the server is sending, so third-party servers cannot corrupt your content.
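The store-then-verify behavior above can be sketched as a tiny content-addressed store. This is only an illustration: the id function here is SHA-256-based for convenience, while the real protocol mandates XXHash64 with seed 0, and `ClientBlobCache`/`blob_id` are hypothetical names.

```python
import hashlib


def blob_id(blob: bytes) -> int:
    # Illustrative id function; the real protocol uses XXHash64 with seed 0.
    return int.from_bytes(hashlib.sha256(blob).digest()[:8], "little")


class ClientBlobCache:
    """Toy content-addressed store: blobs are stored and fetched by their id."""

    def __init__(self):
        self._blobs = {}

    def get(self, bid: int):
        # returns None on a cache miss
        return self._blobs.get(bid)

    def store(self, claimed_id: int, blob: bytes) -> bool:
        # The client recomputes the hash itself: a server claiming a bogus
        # id for a blob can't poison the persistent cache.
        if blob_id(blob) != claimed_id:
            return False
        self._blobs[claimed_id] = blob
        return True
```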
The protocol involves a bit of back and forth: when the Server tells the client to reuse a known `BlobId`, it starts a Cache Transaction. This means that it must keep the `Blobs` it referred to around and be ready to send them to the client if a Cache Miss Request is received.
## Best practices to improve cache usage for Server Owners and Map Makers
The Client Cache always saves a little bit of bandwidth, but there are a few things you can do to make it save as much bandwidth as possible by building cache-friendly worlds.
The most important thing to keep in mind is that blocks are sent in grid-aligned 16x16x16 cubes (`SubChunks`), and that these cubes aren't sent if an identical cube has already been seen by the client.
This means that the best case for the cache is, for example, a world made of just stone up to y=64 and then just air.
So a few guidelines are:
- If your underground isn't playable, use `/fill` or an editor to make every part of the underground uniformly stone. Ores, gravel, caves and dirt/*ite formations cause underground chunks to be unique and unshareable.
- If your map has obvious borders that the player can't see through, use an editor to remove all blocks past that border. Chunks made of just air don't require any sending at all. Or use an infinite sea, infinite grass plane, etc. Anything works as long as it's uniform.
- If you don't care about biomes, pick one biome and use it across the entire world; this way, biomes don't have to be sent.
- If you run several servers/minigames, try reusing parts of your world between them and make sure they're aligned in all the worlds. This way, if someone plays Minigame A, they already have a lot of that level when they join Minigame B.
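A rough way to reason about these guidelines: the savings come entirely from duplicate SubChunks. As a sketch (world iteration and serialization are out of scope here, so blobs are given directly as bytes, and `cache_friendliness` is a hypothetical helper), you could estimate the share of SubChunk sends a warm cache can skip:

```python
import hashlib
from collections import Counter


def cache_friendliness(subchunk_blobs) -> float:
    """Fraction of SubChunk sends that are duplicates of an already-seen
    blob, i.e. sends a client with a warm cache would not need."""
    counts = Counter(hashlib.sha256(b).digest() for b in subchunk_blobs)
    total = sum(counts.values())
    unique = len(counts)
    return (total - unique) / total if total else 0.0


# A toy "world" column: 3 identical all-stone SubChunks, one all-air
# SubChunk, and one unique SubChunk with caves and ores in it.
blobs = [b"stone" * 1024] * 3 + [b"air" * 1024] + [b"cave-with-ores"]
print(cache_friendliness(blobs))  # 0.4: 2 of the 5 sends are duplicates
```

Uniform undergrounds, flat borders and single-biome worlds all push this ratio up.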
## Supporting the Cache Protocol
The protocol gets fairly complicated once you get into the details, and it must be implemented with a lot of caution around race conditions, invalidating blobs too early or too late, using too much memory, throttling sends, etc.
### Ok, can I just disable it for now?
Yes, the protocol lets the Server shut caching off entirely, so there's no extra work needed to upgrade to R12 until you're ready to support the full protocol. Just set the new `cacheEnabled` bool to `false` and the cache will be forced off from the server side, falling back to the old data format.
## Implementing the Cache Protocol
### ClientCacheStatusPacket

This packet is not part of a transaction; it is sent by the Client once, at login, to communicate whether it supports the cache. The client cannot turn the cache on or off during the session or send this packet more than once. If a client declares that it doesn't support the cache, sending `BlobIds` to it is an error. Platforms like Switch don't support the cache, so this must be supported!
For now, the game only supports using cached content in `LevelChunkPacket`s. Each one starts a Cache Transaction, composed of 3 packets that are always sent/received in the order below.

### LevelChunkPacket
It was previously `FullChunkDataPacket`; it got renamed because it doesn't necessarily contain any data anymore :)
It's used to start a Chunk Transaction and contains a few new fields:
- `bool cacheEnabled`: this lets the server turn off the cache for this chunk even if the Client signaled that it supports it. Set it to `true` to enable reusing the cache and the other new fields.
- `varint subChunkCount`: how many SubChunks exist in this Chunk.
- `varint blobCount`: must be the same as `subChunkCount`; added for future changes.
- `uint64 blobId` 1..n: after `blobCount`, there are n 64-bit numbers that represent the `BlobIds`. ID 0 is for SubChunk 0, ID 1 is for SubChunk 1, and so on. You have to fill these in by hashing the content of the blobs with XXHash64 with seed `0`. No other hashing is supported, and the client will cross-check and refuse blobs if their content doesn't match the XXHash64 hash.
Note: when serializing a `SubChunk` into a blob, it must be serialized in its persistent (disk) format, e.g. its palette must be a list of valid Block NBTs, not `RuntimeIds`! This is very important because the cache is persistent and the Client caches the blobs exactly as the server sends them, so a blob can't contain `RuntimeIds` that change from session to session.
- The last ID in this sequence is the biome blob.
- Border Blocks and Block Entities follow the ids in the old format.
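Since only XXHash64 with seed 0 is accepted, it can be useful to check that your server produces the same `BlobIds` the client will compute. The sketch below is a pure-Python transcription of the published XXH64 algorithm (slow, but handy for cross-checking; a native binding would be used in practice):

```python
# XXHash64 primes and 64-bit mask, from the published algorithm.
P1 = 0x9E3779B185EBCA87
P2 = 0xC2B2AE3D27D4EB4F
P3 = 0x165667B19E3779F9
P4 = 0x85EBCA77C2B2AE63
P5 = 0x27D4EB2F165667C5
MASK = 0xFFFFFFFFFFFFFFFF


def _rotl(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK


def _round(acc, lane):
    acc = (acc + lane * P2) & MASK
    return (_rotl(acc, 31) * P1) & MASK


def xxh64(data: bytes, seed: int = 0) -> int:
    """XXHash64 of `data`; the cache protocol always uses seed 0."""
    n, i = len(data), 0
    if n >= 32:
        # long inputs: four parallel accumulators over 32-byte stripes
        a1 = (seed + P1 + P2) & MASK
        a2 = (seed + P2) & MASK
        a3 = seed & MASK
        a4 = (seed - P1) & MASK
        while i + 32 <= n:
            a1 = _round(a1, int.from_bytes(data[i:i + 8], "little")); i += 8
            a2 = _round(a2, int.from_bytes(data[i:i + 8], "little")); i += 8
            a3 = _round(a3, int.from_bytes(data[i:i + 8], "little")); i += 8
            a4 = _round(a4, int.from_bytes(data[i:i + 8], "little")); i += 8
        h = (_rotl(a1, 1) + _rotl(a2, 7) + _rotl(a3, 12) + _rotl(a4, 18)) & MASK
        for acc in (a1, a2, a3, a4):  # merge the accumulators
            h = ((h ^ _round(0, acc)) * P1 + P4) & MASK
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    # consume the remaining 8-, 4- and 1-byte tails
    while i + 8 <= n:
        h ^= _round(0, int.from_bytes(data[i:i + 8], "little"))
        h = (_rotl(h, 27) * P1 + P4) & MASK
        i += 8
    if i + 4 <= n:
        h ^= (int.from_bytes(data[i:i + 4], "little") * P1) & MASK
        h = (_rotl(h, 23) * P2 + P3) & MASK
        i += 4
    while i < n:
        h ^= (data[i] * P5) & MASK
        h = (_rotl(h, 11) * P1) & MASK
        i += 1
    # final avalanche
    h ^= h >> 33
    h = (h * P2) & MASK
    h ^= h >> 29
    h = (h * P3) & MASK
    return h ^ (h >> 32)
```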
So the python-ish pseudocode for building a `LevelChunkPacket` and storing the blobs could be:

```python
usedBlobs = dict()
for subchunk in chunk:
    blob = serialize(subchunk)
    blobId = XXHash64(blob)
    # add the blob to the current "transaction" set of required blobs
    usedBlobs[blobId] = blob

# do the same for biomes
biomes = serialize(chunk.biomes)
biomesId = XXHash64(biomes)
usedBlobs[biomesId] = biomes

# now write out the packet
stream.writeChunkPos(chunk.pos)             # position
stream.writeUnsignedVarInt(len(chunk))      # number of subchunks
stream.writeBool(True)                      # enable the cache
stream.writeUnsignedVarInt(len(usedBlobs))  # blob count
for id in usedBlobs:
    stream.writeUint64(id)

# add the old stuff too
writeBorderBlocks(stream, chunk)
writeBlockEntities(stream, chunk)

# Keep the transaction object alive to keep track of how many transactions are
# active and to be able to decide when to delete a blob because everyone is
# done with it
server.trackTransaction(clientId, usedBlobs)
```
After this is sent, the client will respond with a `ClientCacheBlobStatusPacket`.

### ClientCacheBlobStatusPacket

This packet is sent periodically by the client to update the server on which blobs it was able to retrieve from the cache (ACK) and which blobs it is lacking (MISS). Note that for performance reasons this packet is not sent for each `LevelChunkPacket`; instead, the client batches the ACKs and MISSes into two big sets and sends them once in a while, e.g. each tick.
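The batching on the client side could look like this sketch (all names are hypothetical, and it assumes the 4095-id cap applies to the packet as a whole):

```python
class BlobStatusBatcher:
    """Client-side sketch: batch blob ACKs/MISSes and flush them once per
    tick instead of replying to every single LevelChunkPacket."""

    MAX_IDS = 4095  # a ClientCacheBlobStatusPacket may hold at most 4095 ids

    def __init__(self, send_packet):
        self.send_packet = send_packet  # callable taking (acks, misses)
        self.acks = set()
        self.misses = set()

    def on_blob_id(self, blob_id, found_in_cache):
        # record the lookup result; duplicates collapse into the sets
        (self.acks if found_in_cache else self.misses).add(blob_id)

    def flush(self):
        # call once per tick; split into multiple packets to honor MAX_IDS
        while self.acks or self.misses:
            acks = set(list(self.acks)[:self.MAX_IDS])
            room = self.MAX_IDS - len(acks)
            misses = set(list(self.misses)[:room])
            self.acks -= acks
            self.misses -= misses
            self.send_packet(acks, misses)
```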
When the Server receives one of these, it should go through each blob in the MISS list, fetch it from its storage, add it to a `ClientCacheMissResponsePacket` and send it over to the Client.

You probably also want to use the ACKs/MISSes in this packet to decrement the blob refcounts, to find out which blobs have been confirmed by everyone and don't need to be kept around anymore.
A `ClientCacheBlobStatusPacket` can only contain up to 4095 Ids, so packets bigger than that can be rejected.
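The server-side refcount bookkeeping could be sketched like this (class and method names are hypothetical; the returned dict stands in for the `<blobId, blob>` pairs of a `ClientCacheMissResponsePacket`):

```python
class BlobTracker:
    """Server-side sketch: refcount blobs across active cache transactions
    so a blob can be freed once every interested client has resolved it."""

    def __init__(self):
        self.blobs = {}      # blobId -> blob bytes
        self.refcounts = {}  # blobId -> number of pending transactions

    def track(self, used_blobs):
        # called when a LevelChunkPacket referencing these blobs is sent
        for blob_id, blob in used_blobs.items():
            self.blobs[blob_id] = blob
            self.refcounts[blob_id] = self.refcounts.get(blob_id, 0) + 1

    def on_blob_status(self, acks, misses):
        # MISSed blobs go straight back out in a miss response
        response = {bid: self.blobs[bid] for bid in misses if bid in self.blobs}
        # both ACKs and MISSes resolve this client's interest in the blob
        for bid in acks | misses:
            if bid in self.refcounts:
                self.refcounts[bid] -= 1
                if self.refcounts[bid] == 0:
                    del self.refcounts[bid]
                    del self.blobs[bid]
        return response
```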
### ClientCacheMissResponsePacket

This packet is just a list of `<blobId, blob>` pairs. Any missing blob should be put into one of these packets and sent ASAP.
### Extra: How to throttle cache transactions
Throttling chunks is really important to keep latency under control for clients: if the server sends several MBs of blobs at once, they will hog the connection for several seconds until they all get through. High-priority packets like movement or block updates will be queued after all of that, causing heavy lag on bad connections.
You should count the active transactions for each client, and only send new `LevelChunkPacket`s if there aren't too many active transactions. In Vanilla, depending on the connection quality, we only allow between 1 and 8 concurrent transactions.
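A per-client throttler along these lines could be sketched as follows (`send_fn` and the chunk objects are placeholders for your own send path):

```python
class ChunkSendThrottler:
    """Sketch: hold new LevelChunkPackets back while too many cache
    transactions are still in flight for this client."""

    def __init__(self, send_fn, max_active=4):
        # Vanilla allows 1..8 concurrent transactions based on connection quality
        self.send_fn = send_fn
        self.max_active = max_active
        self.active = 0
        self.pending = []  # chunks waiting to be sent

    def queue_chunk(self, chunk):
        self.pending.append(chunk)
        self._pump()

    def on_transaction_done(self):
        # every blob of one LevelChunkPacket was ACKed or delivered
        self.active -= 1
        self._pump()

    def _pump(self):
        # start transactions only while we're under the concurrency cap
        while self.active < self.max_active and self.pending:
            self.send_fn(self.pending.pop(0))
            self.active += 1
```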
Don't try to throttle `ClientCacheMissResponsePacket`; on the contrary, try to send it as soon as possible. Once a `LevelChunkPacket` is sent, the client needs the missing `Blobs` as soon as possible, so it's critical to keep the delay between the `LevelChunkPacket` and the `ClientCacheMissResponsePacket` containing the requested blobs to a minimum.