A small note on bits back coding
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bits back coding"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Assume we wish to use a VAE for compression. Our generative model is\n",
"\n",
"$$\n",
"p(x,z) = p(x|z)p(z)\n",
"$$\n",
"\n",
"And we approximate the true posterior $p(z|x)$ with $q(z|x)$\n",
"\n",
"Assume we have trained our model and sent all the parameters to the receiver.\n",
"\n",
"We now need to send $\\hat{x}$ to the receiver.\n",
"\n",
"We'll use two arithmetic coders to sample z from $q(z|\\hat{x})$.\n",
"\n",
"The first arithmetic coder uses $q(z|\\hat{x})$ to some precision (e.g. 32 bits floats) and runs in decode mode: given enough random bits it outputs a sample from $q(z|\\hat{x})$.\n",
"\n",
"The second arithmetic coder uses a probability model of something else we also wish to send, say text. It uses a $p(c_t|c_{t-1})$ conditional distribution of the frequencies of characters depending on the previous character. We run it in the encoder mode and input some text, say \"bits back coding is super easy to understand\", and it spits out a string of \"random\" bits, e.g. \"10100010101010010010\". \n",
"\n",
"We input text in the second encoder and feed the resulting bits into the first arithmetic coder. Once the first coder has enough bits it spits out a sample $\\hat{z} \\sim q(z|\\hat{x})$ and we stop.\n",
"\n",
"We transmit this $\\hat{z}$ to the receiver (using an arithmetic coder again), incurring $-log_2p(\\hat{z})$ bits cost. This cost will most likely be high since we chose a high precision $\\hat{z}$, e.g. 32 bit.\n",
"\n",
"We calculate $p(x|\\hat{z})$ and transmit $\\hat{x}$ using this distribution and arithmetic coding, incurring $-log_2 p(\\hat{x}|\\hat{z})$ cost. We've now transmitted $\\hat{x}$ losslessly at a cost of $-log_2 p(\\hat{z}) -log_2 p(\\hat{x}|\\hat{z})$.\n",
"\n",
"Now we calculate the posterior $p(z|\\hat{x})$ on the receiver side.\n",
"\n",
"Now we can run the second arithmetic coder in decode mode, using the posterior $p(z|\\hat{x})$ and inputting $\\hat{z}$. This will recover the message: \"bits back coding is super easy to understand\".\n",
"\n",
"We have now \"gotten some bits back\", the information content of the message. On average this will be $H(q(z|x))$ bits."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Unanswered questions\n",
"\n",
"Not all $\\hat{z}$ sampled from the posterior will be equally good. You might sample a highly unlikely $\\hat{z}$, which would make $\\hat{x}$ very unlikely under $p(x|\\hat{z})$, thus incurring an extra cost on the $-log_2 p(\\hat{x}|\\hat{z})$ term.\n",
"\n",
"However, an unlikely $z$ means you can send a bigger message.\n",
"\n",
"How much and when these two terms cancel out I don't know. Intuitively it feels like a zero sum game, and that sending the mode of the posterior instead of sampling from it should be as good as sending a sample + an extra message encoded in the sample."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 0
}