You take a stream/buffer of binary data. The start of your data is a "Page", which has a header followed by data.
You need to read the following fields from the header (see https://xiph.org/ogg/doc/framing.html for more detail):
capture_pattern - bytes 0 to 3, must read OggS
stream_structure_version - byte 4, must be 0
header_type_flag - byte 5, a bitflag that tells you metadata about the page (is it a new packet? is a packet continued here? is it the first/last page of the stream?)
absolute granule position - bytes 6 to 13, not needed unless you want seeking
stream serial number - bytes 14 to 17, a serial number given to each stream contained in the Ogg file. This is important for playing Ogg files that contain other streams, such as video or cover art. You need to identify and only bother parsing the Opus stream.
page sequence no - bytes 18 to 21, the page number
checksum - bytes 22 to 25, a checksum (you can ignore this)
page_segments - byte 26, the number of segments in the lacing table
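The field layout above can be sketched in Python as a single struct.unpack call. This is only an illustration: the names PageHeader, parse_page_header and describe_flags are made up for this example, and the flag bit values (0x01, 0x02, 0x04) are taken from the framing document linked above.

```python
import struct
from typing import NamedTuple

# Fixed part of the page header: 27 bytes, all integers little-endian.
# 4s = capture_pattern, B = version, B = header_type_flag, q = granule position,
# I = serial number, I = page sequence no, I = checksum, B = page_segments
_HEADER = struct.Struct("<4sBBqIIIB")


class PageHeader(NamedTuple):  # hypothetical container for this sketch
    capture_pattern: bytes
    version: int
    header_type: int
    granule_position: int
    serial_number: int
    page_sequence: int
    checksum: int
    page_segments: int


def parse_page_header(data: bytes, offset: int = 0) -> PageHeader:
    """Read the 27 fixed header bytes that start at `offset`."""
    header = PageHeader(*_HEADER.unpack_from(data, offset))
    if header.capture_pattern != b"OggS":
        raise ValueError("capture_pattern is not OggS")
    if header.version != 0:
        raise ValueError("stream_structure_version is not 0")
    return header


def describe_flags(header_type: int) -> dict:
    """Decode the three bits of header_type_flag."""
    return {
        "continued": bool(header_type & 0x01),   # first packet continues from the previous page
        "first_page": bool(header_type & 0x02),  # first page of this stream
        "last_page": bool(header_type & 0x04),   # last page of this stream
    }
```

The checksum is read but not verified here, in line with the note above that it can be ignored.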
Now you have to calculate the size of each packet. Each packet is made up of one or more segments of up to 255 bytes each, and can be an Opus packet or Opus metadata.
To do this, we use the lacing_table. The lacing_table starts at byte 27 and is page_segments bytes long, i.e. a page_segments value of 5 means the lacing_table is 5 bytes long (bytes 27 to 31). The page body starts right after it.
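As a rough Python sketch of that layout (read_lacing_table is a hypothetical helper name, and page is assumed to be a bytes object that starts at the beginning of a page):

```python
from typing import Tuple


def read_lacing_table(page: bytes) -> Tuple[bytes, int]:
    """Return the lacing table and the offset at which the page body starts."""
    page_segments = page[26]          # byte 26: number of entries in the table
    body_offset = 27 + page_segments  # the body follows the table directly
    if len(page) < body_offset:
        raise ValueError("page is truncated inside the lacing table")
    return page[27:body_offset], body_offset
```

Each entry of the returned table is one byte, so indexing it gives integers from 0 to 255.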
To calculate the size of each packet (and therefore the number of packets), we use this algorithm:
    i = 0  // the position in the lacing table
    packetLengths = []
    while i < len(table):
        packetLength = table[i]
        // a value of 255 means the packet continues in the next segment
        while table[i] == 255 and i + 1 < len(table):
            i++
            packetLength += table[i]
        packetLengths.push(packetLength)
        i++  // move on to the first segment of the next packet
    totalBodyLength = sum(packetLengths)
    numberOfPackets = len(packetLengths)
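The same calculation can be written in Python as a single pass over the table. This is a sketch rather than a reference implementation: packet_lengths is a made-up name, and the second return value is an addition for this example that reports how many bytes at the end of the page belong to a packet that is not finished on this page (the table ended on a 255).

```python
from typing import Iterable, List, Tuple


def packet_lengths(table: Iterable[int]) -> Tuple[List[int], int]:
    """Return (lengths of the packets that end on this page, unfinished trailing bytes)."""
    lengths: List[int] = []
    current = 0
    for value in table:
        current += value
        if value < 255:
            # Any value below 255 (including 0) terminates the current packet.
            lengths.append(current)
            current = 0
    # If the table ended on 255, `current` holds the size of the partial packet.
    return lengths, current


lengths, unfinished = packet_lengths([255, 255, 10, 3, 255, 0])
# lengths == [520, 3, 255], unfinished == 0
```

A packet of exactly 255 bytes is encoded as 255 followed by 0, which is why the third packet in the example has length 255.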
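To read the first packet, we read the first packetLengths[0] bytes after the lacing table; the second packet is the next packetLengths[1] bytes after the first packet, and so on. If the last value in the lacing table is 255, the last packet is not complete and continues on the next page.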
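Slicing the body this way might look like the following in Python. The function name split_body is hypothetical, body is assumed to be the bytes that follow the lacing table, and lengths is assumed to be the list of packet lengths computed from that table.

```python
from typing import List


def split_body(body: bytes, lengths: List[int]) -> List[bytes]:
    """Cut the page body into consecutive packets of the given lengths."""
    packets: List[bytes] = []
    pos = 0
    for length in lengths:
        packets.append(body[pos:pos + length])
        pos += length  # the next packet starts where this one ended
    return packets
```

Bytes left over after the last length (an unfinished packet) are simply not returned by this sketch.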
Now we need to start looking through the data to see if there are Opus packets to pick out.
We take the first 8 bytes of each packet as header. If header == 'OpusHead' then this bitstream (identified by the stream serial number in the page header above) is an Opus stream, and we should only bother deciphering future pages if their serial numbers are the same as this one. If header == 'OpusTags', we're looking at metadata for the file, such as artist and title. We can skip this. The last case is where the header is equal to neither of these; in that case it's an actual Opus packet we can use!
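One possible way to express this filtering in Python is shown below. The names classify_packet and OpusStreamFilter are invented for this sketch, and it assumes the caller passes in the stream serial number of the page each packet came from.

```python
from typing import Optional


def classify_packet(packet: bytes) -> str:
    """Label a packet by its first 8 bytes."""
    magic = packet[:8]
    if magic == b"OpusHead":
        return "head"   # identifies the stream as Opus
    if magic == b"OpusTags":
        return "tags"   # metadata such as artist and title
    return "audio"      # anything else is treated as an Opus packet


class OpusStreamFilter:
    """Remember which serial number carries Opus and ignore every other stream."""

    def __init__(self) -> None:
        self.opus_serial: Optional[int] = None

    def accept(self, serial: int, packet: bytes) -> Optional[str]:
        kind = classify_packet(packet)
        if kind == "head" and self.opus_serial is None:
            self.opus_serial = serial  # the first OpusHead decides the stream
        if serial != self.opus_serial:
            return None                # a different stream, or no OpusHead seen yet
        return kind
```

The caller would skip packets for which accept returns None or "tags" and hand the "audio" ones to the decoder.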
Great explanation, but if I'm not mistaken there is a missing i++ after packetLengths.push(packetLength).