@Tomcc
Last active February 21, 2024 21:41
Block Changes in Beta 1.2.13

Block Storage & Network Protocol Changes

Paletted chunks and the removal of BlockIDs bring a few big changes to how blocks are represented. In Bedrock, Blocks used to be represented by their 8-bit ID and 4-bit data; this means that we could only represent 256 blocks and 16 variants for each block. As it happens, we ran out of IDs in Update Aquatic, so we had to do something about it :)

After this change, we can represent infinite types of Blocks and infinite BlockStates, just like in Java. BlockStates are what is sent over the network, as they are roughly equivalent to the old ID+data information, i.e. they're all the information attached to a block.

BlockStates are serialized in two ways:

PersistentID: a PersistentID is an NBT tag containing the BlockState name and its variant number; for example

{
    "name":"minecraft:log",
    "val":14
}

In the future it will contain the actual properties of the BlockState explicitly, like in Java. This format is pretty expensive to serialize and send over the network though, so it isn't used in the protocol yet (and possibly never will be), but it might still be useful information for map makers :)

RuntimeID: a RuntimeID is an alternate ID that is used to refer to a BlockState within a game session. It's an unsigned int that is assigned to each new BlockState we discover when loading a world; worlds with different Behavior packs can (in the future) represent the same BlockState with different RuntimeIDs.

Because of this, you shouldn't ever write RuntimeIDs to disk or make them persist across game sessions, because they are expected to change. A future version of the protocol will let the server send a GlobalPalettePacket, which will let you tell the Client which block is which and have it install new blocks.

For the time being however, RuntimeIDs are actually static ( :( ) so you will have to use a lookup table to figure out which is which! The lookup table is appended at the end.
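As a rough illustration of what such a lookup table could look like in server code, here is a minimal sketch. The class name `RuntimeIdTable` and its methods are hypothetical, and the rows have to come from the real table; nothing here is taken from the game's code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-way lookup between RuntimeIDs and (name, val) pairs. */
final class RuntimeIdTable {
    private final Map<Integer, String> stateById = new HashMap<>();
    private final Map<String, Integer> idByState = new HashMap<>();

    // Joins name and val into one map key, e.g. "minecraft:log#14".
    private static String key(String name, int val) {
        return name + "#" + val;
    }

    /** Adds one row of the table; both directions must be unique. */
    void register(int runtimeId, String name, int val) {
        String state = key(name, val);
        if (stateById.containsKey(runtimeId) || idByState.containsKey(state)) {
            throw new IllegalArgumentException("duplicate entry for " + state);
        }
        stateById.put(runtimeId, state);
        idByState.put(state, runtimeId);
    }

    /** Returns the RuntimeID for a BlockState, or -1 if it is not in the table. */
    int toRuntimeId(String name, int val) {
        Integer id = idByState.get(key(name, val));
        return id == null ? -1 : id;
    }

    /** Returns "name#val" for a RuntimeID, or null if it is unknown. */
    String toState(int runtimeId) {
        return stateById.get(runtimeId);
    }
}
```

Since RuntimeIDs are only meaningful within a session, a table like this is best rebuilt at startup rather than saved alongside the world.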

Typically, RuntimeIDs are serialized as varints.
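For readers who haven't met varints before, here is a minimal sketch of the usual encoding: 7 payload bits per byte, least-significant group first, with the top bit marking that more bytes follow. The `VarInts` class is hypothetical, and the ZigZag mapping for signed values is an assumption based on the common protobuf-style scheme, not something this document specifies.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical helper for LEB128-style varints. */
final class VarInts {
    private VarInts() {}

    /** Writes value as an unsigned varint, least-significant group first. */
    static void writeUnsigned(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // continuation bit set
            value >>>= 7;
        }
        out.write(value);
    }

    /** Reads an unsigned varint of at most 5 bytes (32 bits). */
    static int readUnsigned(ByteBuffer in) {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.get() & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("varint is longer than 5 bytes");
    }

    /** ZigZag mapping (assumed for signed varints): 0, -1, 1, -2 become 0, 1, 2, 3. */
    static int zigZagEncode(int n) {
        return (n << 1) ^ (n >> 31);
    }

    /** Inverse of zigZagEncode. */
    static int zigZagDecode(int n) {
        return (n >>> 1) ^ -(n & 1);
    }
}
```

Small values take a single byte, which is why palettes full of low RuntimeIDs stay compact on the wire.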

Item changes

We didn't do the same ID removal for Item, because Item's ID was a short already and so it wasn't as much of a concern; what we did was to move new blocks to negative IDs. So, every block past 256 will be -1, -2, -3 and so on.

This means that you can't compare Block IDs and Item IDs directly anymore, even if you keep Block IDs around!
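A small sketch of the mapping this implies, assuming that block ID 256 becomes item ID -1, 257 becomes -2, and so on (that reading of "every block past 256" is an assumption, as is the hypothetical `BlockItemIds` class):

```java
/** Hypothetical conversion between block IDs and the item IDs of block items. */
final class BlockItemIds {
    private BlockItemIds() {}

    /** Blocks 0..255 keep their ID; later blocks count down from -1 (assumed). */
    static int blockIdToItemId(int blockId) {
        if (blockId < 0) {
            throw new IllegalArgumentException("block IDs are never negative");
        }
        return blockId <= 255 ? blockId : 255 - blockId;
    }

    /** Inverse mapping; item IDs above 255 belong to non-block items. */
    static int itemIdToBlockId(int itemId) {
        if (itemId > 255) {
            throw new IllegalArgumentException("item " + itemId + " is not a block");
        }
        return itemId < 0 ? 255 - itemId : itemId;
    }
}
```

Going through a conversion like this in one place avoids comparing the two kinds of ID directly.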

Changed packets

This is a (hopefully exhaustive) list of packets that changed and that your server should handle to talk with this version of Minecraft. If you're wondering where FullChunkDataPacket is, no worries! That one still supports the old format so you don't have to implement block palettes right away. Other packets aren't backward compatible though, so you'll have to fix those:

  • UpdateBlockPacket: BlockID -> RuntimeID
  • LevelSoundEventPacket: BlockID -> Data (which is a RuntimeID)
  • LevelEventPacket: the Data varint now represents a RuntimeID for ParticlesDestroyBlock, ParticlesCrackBlock, ParticlesCropEaten and ParticlesDestroyArmorStand.
  • SetEntityDataPacket: Endermen and FallingBlock use RuntimeID to represent their BlockState.

FullChunkDataPacket

FullChunkDataPacket is an exception because it still understands the old format, as we needed to implement that anyway to read old worlds. Anyhow, you should start considering moving to the new format because until you do, you won't be able to send Blocks past 256 to the client!

Also, there's a small penalty to converting old chunks to new chunks on the client, so you should do that for speed as well.

The new SubChunk format

While the LevelChunk format itself is unchanged, what changed is the internal format of each SubChunk. The version is always a byte and there are a few valid SubChunkFormat versions:

  • 1.2 format (versions 0,2,3,4,5,6,7) The old format. It's just [version:byte][16x16x16 blocks][16x16x16 data]. There can be light information, but it will be discarded. Note how values 0,2,3,4,5,6,7 all mean this format! We had to do this because tools like MCEdit put those values in that field and we needed to keep those worlds working.

  • Palettized format (version 1) This format was briefly used in the beta after palettization came in. The SubChunk just contains one block storage, so the format is [version:byte][block storage]

  • Palettized format with storage list (version 8) This is the final Update Aquatic format, added to allow for several block storages. The additional block storage is used only for water for now, but we made the format generic.
    The format is [version:byte][num_storages:byte][block storage1]...[blockStorageN]
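The version dispatch described in the list above can be sketched roughly as follows. `SubChunkHeader` is a hypothetical class; it only reads the header bytes and reports how many block storages follow, leaving the storages themselves to the caller.

```java
import java.nio.ByteBuffer;

/** Hypothetical reader for the leading bytes of a SubChunk. */
final class SubChunkHeader {
    final boolean paletted;  // false means the old 1.2 blocks + data layout
    final int storageCount;  // number of block storages that follow (0 for the old layout)

    private SubChunkHeader(boolean paletted, int storageCount) {
        this.paletted = paletted;
        this.storageCount = storageCount;
    }

    /** Consumes the version byte and, for version 8, the storage count byte. */
    static SubChunkHeader read(ByteBuffer in) {
        int version = in.get() & 0xFF;
        switch (version) {
            case 0: case 2: case 3: case 4: case 5: case 6: case 7:
                return new SubChunkHeader(false, 0);
            case 1:
                return new SubChunkHeader(true, 1); // exactly one storage, no count byte
            case 8:
                return new SubChunkHeader(true, in.get() & 0xFF);
            default:
                throw new IllegalArgumentException("unknown SubChunk version " + version);
        }
    }
}
```

Rejecting unknown versions instead of guessing keeps a reader from silently misparsing formats added later.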

Block Storage format

A Block Storage is now its own type of object (different from SubChunk) with its own format and version! Don't put weird numbers in the version field this time :)

  • 1 bit: whether the chunk is serialized for Runtime or for Persistence: always 1 when over the network.
  • 7 bits: the internal format, eg. the type of palette or compression that the SubChunk is using. This one is used to select which subclass of BlockStateStorage to create. Valid types that the client understands are:
enum class Type : uint8_t {
	Paletted1 = 1,   // 32 blocks per word
	Paletted2 = 2,   // 16 blocks per word
	Paletted3 = 3,   // 10 blocks and 2 bits of padding per word
	Paletted4 = 4,   // 8 blocks per word
	Paletted5 = 5,   // 6 blocks and 2 bits of padding per word
	Paletted6 = 6,   // 5 blocks and 2 bits of padding per word
	Paletted8 = 8,   // 4 blocks per word
	Paletted16 = 16, // 2 blocks per word
};

A word is just a 4 byte unsigned int.
The padded formats are kind of nasty to implement, but the good news is that you don't need to implement all of them (if you do, I suggest using a template). The server can pick whatever format it likes and the Client will understand it. Of course, if you waste space in the palettes you won't get the best compression.

  • 4096 / blocksPerWord words + 1 optional padding word: The actual blocks. Each block takes a fixed amount of bits, plus eventual padding. The padding byte is only present if the words have padding, eg. for sizes 3,5 and 6.
  • 1 varint: The palette size N
  • (for network) N varints: The palette entries, as RuntimeIDs. You should use the RuntimeID table to convert those to actual blocks.
  • (for persistence) N NBT tags: The palette entries, as PersistentIDs. You should read the "name" and "val" fields to figure out what blocks it represents.
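To make the arithmetic concrete, here is a sketch of decoding the header byte and sizing the word array. It assumes the bit order reported later in this thread (the Runtime flag in the lowest bit, the palette type in the upper 7 bits) and treats "optional padding word" as rounding the word count up. `BlockStorageHeader` is a hypothetical name.

```java
/** Hypothetical view of the first byte of a Block Storage. */
final class BlockStorageHeader {
    final boolean runtime;   // true when serialized for the network
    final int bitsPerBlock;  // the palette type: 1, 2, 3, 4, 5, 6, 8 or 16

    private BlockStorageHeader(boolean runtime, int bitsPerBlock) {
        this.runtime = runtime;
        this.bitsPerBlock = bitsPerBlock;
    }

    /** Assumed layout: bit 0 is the Runtime flag, bits 1..7 are the type. */
    static BlockStorageHeader parse(int headerByte) {
        int value = headerByte & 0xFF;
        int bits = value >> 1;
        switch (bits) {
            case 1: case 2: case 3: case 4: case 5: case 6: case 8: case 16:
                return new BlockStorageHeader((value & 1) != 0, bits);
            default:
                throw new IllegalArgumentException("unsupported palette type " + bits);
        }
    }

    /** Whole blocks that fit in one 32-bit word; leftover bits are padding. */
    int blocksPerWord() {
        return 32 / bitsPerBlock;
    }

    /** Words needed for 4096 blocks, rounded up for the padded types. */
    int wordCount() {
        int perWord = blocksPerWord();
        return (4096 + perWord - 1) / perWord;
    }
}
```

For example, type 5 packs 6 blocks per word, so 4096 blocks need 683 words, the last of which is only partly used.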
@Tomcc
Author

Tomcc commented Mar 5, 2018

@tyronx they are infinite. They are serialized as a varint, so the actual network protocol can carry as big a number as you want.
How you want to represent the varints you read/write in your code is up to you. I wrote unsigned int not 32-bit unsigned int :)

We chose to use uint32 because that allows for 4 billion blockstates. Those would take at least 400 GB of RAM just for the blockstate definitions. I think that it's future proof enough :)

ghost commented Mar 5, 2018

RuntimeID is already used to reference entity runtimes, and could get quite confusing. Is there any chance we could get this referred to as BlockRuntimeID, or something similar?

Also regarding the old format - if it's to stay for backwards compatibility, do you think we could have it removed in a future version? The Minecraft development community tends to be quite lazy in upgrading this sort of thing, so it would be a good nudge in the right direction. Perhaps we could get a deadline for the old format to be removed or disabled (e.g. 1.2.15)? Just a thought.

ghost commented Mar 5, 2018

Also, a question, and suggestion, regarding block state data:

The bit packing that is done for this is in a very different order to that of Java Edition.
For example, piston facing block state in JE vs BE:

BlockFace  BE  JE
DOWN       0   0
UP         1   1
NORTH      2   3
SOUTH      3   2
WEST       4   5
EAST       5   4

This sort of thing is a really big pain for mod makers and server developers. It means we have to keep long and unwanted conversion tables when migrating worlds from one platform to another. Is there any chance for Update Aquatic that the Bedrock and Java teams could work together in standardising this packed block state data?
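For what it's worth, the piston table above boils down to swapping 2 with 3 and 4 with 5, so the conversion can be a single small array. This is just a sketch with a hypothetical `PistonFacing` class, and it covers only the six values listed in the table:

```java
/** Hypothetical converter for the piston facing values in the table above. */
final class PistonFacing {
    private PistonFacing() {}

    // Index: Bedrock value. Entry: Java value. The table is its own inverse.
    private static final int[] SWAPPED = {0, 1, 3, 2, 5, 4};

    static int bedrockToJava(int bedrockFacing) {
        return SWAPPED[bedrockFacing];
    }

    static int javaToBedrock(int javaFacing) {
        return SWAPPED[javaFacing];
    }
}
```

Other blocks may pack their state differently, so each one would need its own table.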

@Tomcc
Author

Tomcc commented Mar 5, 2018

@oJBR in our code it's already called BlockRuntimeID. This doc is a spec, it doesn't say how you should name things in your code, either, other than in the packet fields.

About blockstate data: we've just converted what was already there... I'm aware that that sucks, but changing that is just as big of a pain for us. Plus, that pain might not be motivated because Java tech is not developed together with us for the most part and can/will diverge on internal details like this one. So encouraging the assumption that you can just move bits over 1:1 is going to cause problems for you or a lot of work for us, eventually.
You should never expect internal implementation details like that to stay consistent because it's just not practical for us to have to consider every single data structure in the game a public part of the interface that can't be changed without people breaking.
I think the best way to do it is to just convert everything by default, and assume that 1:1 conversions are temporary.

tl;dr: if you do modding/reverse engineering, changing internals is a part of life. We do try to make it as un-painful as possible though!

@extremeheat

On Java Edition, chunks are sent over the network with their own section local palette that correspond to a global palette, are there plans to add in local palettization when there's non-static BlockState IDs?

Also, any estimate on if the block properties like JE1.13 are coming any time soon to prepare for ahead of time? So we're not going to have to immediately rework everything again? :)

@Tomcc
Author

Tomcc commented Mar 7, 2018

@extremeheat we do send the subchunks over the network with their own global palette, check the FullChunkDataPacket section :) Or did I misunderstand?

Block Properties won't come in before Update Aquatic I think, but this time I'll try to give some heads up ahead of time. Also, that change only affects NBT, which can be upgraded silently, so you probably won't have to do anything about it. The old NBTs with val in them will keep working correctly.

@MickenNinja

So... will we be able to add in blocks with Addons?

@7kasper
7kasper commented Apr 9, 2018

Hey @Tomcc thanks for publishing this!
Here are just a few notes for the casual reader as some parts are unclear:

Don't be fooled by the:

1 bit: whether the chunk is serialized for Runtime or for Persistence: always 1 when over the network.
7 bits: the internal format, eg. the type of palette or compression that the SubChunk is using.

It is the other way around: first the 7 bits of palette type and then the flag for Runtime / Persistence.

Also, please note that

This format was briefly used in the beta after palettization came in.

includes the 1.2.13 release! (So that is version 1 with one blockstorage)

Furthermore if you are thinking: "what the heck does:

4096 / blocksPerWord words + 1 optional padding word: The actual blocks. Each block takes a fixed amount of bits, plus eventual padding. The padding byte is only present if the words have padding

mean?"
Don't worry much, this is simply saying that you have to round up when dividing to calculate the number of words.

Also repeating the usual stuff (I always trip over these while figuring things out 😄): ints etc. are almost always little-endian, and this includes the words. Also, varints are signed unless they are specified as unsigned (yes, even though the size and RuntimeIDs are never below zero).
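As a quick illustration of the endianness point, this sketch reads a run of 32-bit words as little-endian using `java.nio.ByteBuffer`, whose default byte order is big-endian. The `LittleEndianWords` class is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper that reads block storage words as little-endian ints. */
final class LittleEndianWords {
    private LittleEndianWords() {}

    /** Reads count 4-byte words and restores the buffer's previous byte order. */
    static int[] readWords(ByteBuffer in, int count) {
        ByteOrder previous = in.order();
        in.order(ByteOrder.LITTLE_ENDIAN);
        try {
            int[] words = new int[count];
            for (int i = 0; i < count; i++) {
                words[i] = in.getInt();
            }
            return words;
        } finally {
            in.order(previous);
        }
    }
}
```

With this, the bytes 78 56 34 12 come out as 0x12345678 rather than 0x78563412.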

Special thanks to @MCMrARM for helping me out a ton by quickly providing packetdata and suggestions.

Here's some pseudocode to read a full chunk packet with all listed versions. I didn't try very hard to make it tidy, but it should work in theory (if not, comment please and I will change it 👍). I'll let you figure out writing yourself! 😉

int chunkX = VarNumberSerializer.readSVarInt(data);
int chunkZ = VarNumberSerializer.readSVarInt(data);
ByteBuf bytes = ArraySerializer.readByteArray(data); // the chunk payload
Section[] sections = new Section[bytes.readByte()];
for (int i = 0; i < sections.length; i++) {
    Section section = sections[i] = new Section();
    int version = bytes.readByte();
    int storages = 1;
    switch (version) {
        case 8: {
            storages = bytes.readByte(); // the storage count is a single byte
            // No break: version 8 is version 1 with a storage count in front.
        }
        case 1: {
            for (int storage = 0; storage < storages; storage++) {
                int paletteAndFlag = bytes.readByte();
                boolean isRuntime = (paletteAndFlag & 1) != 0;
                int bitsPerBlock = paletteAndFlag >> 1;
                int blocksPerWord = 32 / bitsPerBlock; // integer division rounds down
                int wordCount = (4096 + blocksPerWord - 1) / blocksPerWord; // rounds up
                int blockIndex = bytes.readerIndex();
                bytes.skipBytes(wordCount * 4); // 4 bytes per word
                Palette localPalette; // to get the 'real' data
                if (isRuntime) {
                    localPalette = new RuntimePalette(VarNumberSerializer.readSVarInt(bytes));
                    for (int paletteId = 0; paletteId < localPalette.size(); paletteId++) {
                        localPalette.put(paletteId, VarNumberSerializer.readSVarInt(bytes));
                    }
                } else {
                    // Says varint, but I don't think so on flatfile :wink:
                    localPalette = new PersistencePalette(bytes.readIntLE());
                    for (int paletteId = 0; paletteId < localPalette.size(); paletteId++) {
                        localPalette.put(paletteId, NBTTagSerializer.readTag(bytes));
                    }
                }
                int afterPaletteIndex = bytes.readerIndex();
                bytes.readerIndex(blockIndex);
                int position = 0;
                for (int wordi = 0; wordi < wordCount; wordi++) {
                    int word = bytes.readIntLE();
                    // The last word of a padded format holds fewer than blocksPerWord blocks.
                    for (int block = 0; block < blocksPerWord && position < 4096; block++) {
                        int state = (word >>> (block * bitsPerBlock)) & ((1 << bitsPerBlock) - 1);
                        int x = (position >> 8) & 0xF;
                        int y = position & 0xF;
                        int z = (position >> 4) & 0xF;
                        section.setBlockId(x, y, z, localPalette.getBlockId(state));
                        section.setBlockData(x, y, z, localPalette.getBlockData(state));
                        position++;
                    }
                }
                bytes.readerIndex(afterPaletteIndex);
            }
            break;
        }
        default: {
            // Old format: 4096 block IDs followed by 2048 bytes of data nibbles.
            for (int x = 0; x < 16; x++) {
                for (int z = 0; z < 16; z++) {
                    for (int y = 0; y < 16; y++) {
                        section.setBlockId(x, y, z, bytes.readUnsignedByte());
                    }
                }
            }
            for (int x = 0; x < 16; x++) {
                for (int z = 0; z < 16; z++) {
                    for (int y = 0; y < 16; y += 2) {
                        int states = bytes.readByte();
                        section.setBlockData(x, y + 1, z, (states >> 4) & 0xF);
                        section.setBlockData(x, y, z, states & 0xF);
                    }
                }
            }
            break;
        }
    }
}
for (int x = 0; x < 16; x++) {
    for (int z = 0; z < 16; z++) {
        chunk.setHeight(x, z, bytes.readShortLE());
    }
}
for (int x = 0; x < 16; x++) {
    for (int z = 0; z < 16; z++) {
        chunk.setBiome(x, z, bytes.readByte());
    }
}
readExtraData(); // I haven't actually figured out what this even means. Skip it? :P (The amount is sent via signed varint)
while (bytes.readableBytes() > 0) {
    chunk.addTile(NBTTagSerializer.readTag(bytes));
}

Edit: Fixed wrong operator in reading chunk data.
Figuring out the data: 😄 (photo attachment: 20180409_195805)

@geNAZt
geNAZt commented Apr 16, 2018

version 8 subchunk sending seems to be a bit wrong. when you send the amount of storages with a signed varint the client will crash, sending a byte is ok, also reading non network with a byte is ok. so if you run into any issues writing/reading the amount of storages, try it with a byte

@dktapps
dktapps commented Apr 18, 2018

@geNAZt is correct, the blockstorage count is a byte not varint.

@MattPryze
MattPryze commented Apr 23, 2018

I'm confused about version 8... From your post I would have assumed that version 8 would pertain to the chunk version (since you have version 0 - 7) however you put version 8 alongside subchunk versions 0-1. Why? I would expect it to just have subchunk version 2. But I can work with it.

Why though are multiple block storages within a subchunk a thing? Can someone explain the usefulness?

To clarify... is each additional block storage a copy of some of the first storage's blocks? Or should we expect all storages to combine together, with blocks in later storages overwriting earlier storages?

@Tomcc
Author

Tomcc commented Apr 25, 2018

@geNAZt oops you're correct, I shouldn't write these docs from memory :)
@MRG95 it's written just above, versions 2 to 7 have been blocked out because tools in the wild used them wrong. So it should have been 2, but instead it became 8. Multiple block storages are for now just to store water, which in Update Aquatic can coexist with blocks in the first layer. So yeah, layer 1 is different, and contains fluids.

@ocecaco
ocecaco commented Mar 28, 2019

Is it a bug that some of the blocks refer to an out-of-range index in the palette of the BlockStorage? I'm getting that the array of block data sometimes contains a 24 even when there are only 20 NBT tags following the block data, for instance. This is happening with the persistent (i.e. not network) format. When I look at those locations in the game itself, there is nothing unusual about them. There is usually just air or stone there. I also verified that I could correctly determine the type of block at almost every world position in a world, so there is nothing obviously wrong with my decoding logic. Furthermore, I also check that I have consumed all of the data in a subchunk after deserializing, so I am not missing any of the NBT tags.

EDIT: Never mind, I have found my bug already. My parser was working incorrectly for chunks where there were padding bits in each word.

@maple-shaft

@ocecaco I have the same problem right now... Can you elaborate a little more to what you were doing wrong? I am pulling my hair out trying to figure out why I am getting indices that exceed palette size.

@ocecaco
ocecaco commented Jul 15, 2020

@maple-shaft The problem in my case had to do with the fact that I messed up the direction in which I was reading the bits from each word. I don't remember anymore what the right ordering was, but this commit might provide some insight: ocecaco/mcworld@989bea5#diff-25d902c24283ab8cfbac54dfa101ad31 . That's the commit where I finally fixed the issue (after thinking I had fixed it before). You might notice that the shift direction has changed from <<= to >>=, so clearly it had something to do with the direction in which the blocks are packed into a word. Another thing that can cause bugs is if you "drop" the padding bits from the wrong side of the word. If you are packing 6 5-bit blocks, you end up with 30 bits and 2 bits of padding. I'm not sure where the 2 bits of padding are in the block encoding anymore, however.
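For anyone hitting the same out-of-range indices, here is a sketch of packing and unpacking under the layout implied by the pseudocode earlier in this thread: the first block of each word sits in the lowest bits, and any padding ends up in the highest bits. That layout is an assumption rather than something confirmed here, and `PalettedWords` is a hypothetical name.

```java
/** Hypothetical pack/unpack of 4096 palette indices into 32-bit words. */
final class PalettedWords {
    private static final int BLOCKS = 4096;

    private PalettedWords() {}

    /** Extracts palette indices, lowest bits first, ignoring padding bits. */
    static int[] unpack(int[] words, int bitsPerBlock) {
        int blocksPerWord = 32 / bitsPerBlock;
        int mask = (1 << bitsPerBlock) - 1;
        int[] indices = new int[BLOCKS];
        for (int pos = 0; pos < BLOCKS; pos++) {
            int shift = (pos % blocksPerWord) * bitsPerBlock;
            indices[pos] = (words[pos / blocksPerWord] >>> shift) & mask;
        }
        return indices;
    }

    /** Inverse of unpack; padding bits (if any) are left as zero. */
    static int[] pack(int[] indices, int bitsPerBlock) {
        int blocksPerWord = 32 / bitsPerBlock;
        int mask = (1 << bitsPerBlock) - 1;
        int[] words = new int[(BLOCKS + blocksPerWord - 1) / blocksPerWord];
        for (int pos = 0; pos < BLOCKS; pos++) {
            int shift = (pos % blocksPerWord) * bitsPerBlock;
            words[pos / blocksPerWord] |= (indices[pos] & mask) << shift;
        }
        return words;
    }
}
```

A round trip through `pack` and `unpack` with a padded size such as 5 bits is a cheap way to check that a decoder never yields an index beyond the palette.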

@maple-shaft

@ocecaco Thanks! Your code gave me a needed sanity check, I was doing this right but I did have a bug in my code where I was casting a signed byte to a signed int and it was changing the underlying bits on me. Sometimes I hate Java. Am trying to make a library that makes it easier to program ROM modules in Bedrock Redstone computers.

@gentlegiantJGC

It looks like 1.17.30 added sub-chunk version 9 which adds an extra byte storing the sub-chunk index after the number of storages.
Would be nice to have official confirmation.
[version:byte][num_storages:byte][sub_chunk_index:byte][block storage1]...[blockStorageN]

@AncientHello

It looks like 1.17.30 added sub-chunk version 9 which adds an extra byte storing the sub-chunk index after the number of storages. Would be nice to have official confirmation. [version:byte][num_storages:byte][sub_chunk_index:byte][block storage1]...[blockStorageN]

Thanks for the heads up! There is definitely an extra byte in version 9
