Skip to content

Instantly share code, notes, and snippets.

@wiso
Created October 9, 2017 10:40
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save wiso/19c6772114cbc91bff7a919686756afe to your computer and use it in GitHub Desktop.
Save wiso/19c6772114cbc91bff7a919686756afe to your computer and use it in GitHub Desktop.
Example of rounding error issues
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import struct\n",
"from numpy import float32, float64\n",
"\n",
"def binary(num):\n",
" if type(num) in (float, float64):\n",
" t = 'd'\n",
" elif type(num) is float32:\n",
" t = 'f'\n",
" return ''.join(bin(ord(c)).replace('0b', '').rjust(8, '0') for c in struct.pack('!%s' % t, num))\n",
"\n",
"def split_float(num):\n",
" b = binary(num)\n",
" if type(num) in (float, float64):\n",
" return b[0], b[1:12], b[12:] \n",
" elif type(num) is float32: \n",
" return b[0], b[1:32-23], b[32-23:]\n",
"\n",
"def float32todec(sign, mantissa, exp):\n",
" return (-1) ** sign * (1 + int(mantissa, base=2) / 2. ** 23) * 2**(int(exp, base=2)-127)\n",
"\n",
"def show_float(num):\n",
" sign, exp, mantissa = split_float(num)\n",
" c = '1' if num != 0 else '0'\n",
" return \"{} {} ({}.){}\".format(sign, exp, c, mantissa)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Il numero decimale 6.5 è rappresentato in binario da 6.5<sub>10</sub>=110.1<sub>2</sub>, infatti\n",
"$$\n",
"1\\times 2^2 + 1 \\times 2^1 + 0 \\times 2^0 + 1 \\times 2^{-1} = 4 + 2 + 0 + 0.5 = 6.5\n",
"$$\n",
"La conversione da decimale può essere fatta come:\n",
" * $6.5 / 2^2 = 1$ con resto $2.5$\n",
" * $2.5 / 2^1 = 1$ con resto $0.5$\n",
" * $0.5 / 2^0 = 0$ con resto $0.5$\n",
" * $0.5 / 2^{-1} = 1$ con resto $0$\n",
" \n",
"Scritto in forma normalizzata risulta: $1.101_{2} \\times 2^{2}$. All'esponente si aggiunge un bias, nel caso di float32 127: 2 + 127 = 129 = 10000001<sub>2</sub> e poiché la rappresentazione è normalizzata la cifra prima del punto non viene memorizzata. Quindi la rappresentazione sarà:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 10000001 (1.)10100000000000000000000\n"
]
}
],
"source": [
"print show_float(float32(6.5))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.10 0 01111011 (1.)10011001100110011001101\n",
"0.20 0 01111100 (1.)10011001100110011001101\n",
"0.25 0 01111101 (1.)00000000000000000000000\n",
"0.30 0 01111101 (1.)00110011001100110011010\n",
"0.40 0 01111101 (1.)10011001100110011001101\n",
"0.50 0 01111110 (1.)00000000000000000000000\n"
]
}
],
"source": [
"for num in 0.1, 0.2, 0.25, 0.3, 0.4, 0.5:\n",
" print \"{:.2f} {}\".format(num, show_float(float32(num)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0.3 - 0.2"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.10000001"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32(0.3) - float32(0.2)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 01111011 (1.)10011001100110011001110\n"
]
}
],
"source": [
"print show_float(float32(0.3) - float32(0.2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"La prima operazione è 0.3 - 0.2. Questi due non hanno lo stesso esponente, quindi il più piccolo va riscritto\n",
"\n",
" 0.30 0 01111101 (1.)00110011001100110011010\n",
" 0.20 0 01111100 (1.)10011001100110011001101\n",
" 0.20 0 01111101 (0.)1100110011001100110011010\n",
" \n",
"Nello standard IEEE 745 si usano due bit in più durante i calcoli: il guard bit G il round bit R:\n",
"\n",
" 0.30 0 01111101 (1.)0011001100110011001101000\n",
" 0.20 0 01111101 (0.)1100110011001100110011010\n",
" 0.30-0.20 0 01111101 (0.)0110011001100110011001110 \n",
" \n",
"Normalizzandolo (moltiplico per 2<sup>2</sup> la mantissa (shift di due a sinistra) e sottraggo 2 all'esponente): \n",
"\n",
" 0.30-0.20 0 01111011 (1.)10011001100110011001110\n",
" \n",
"Questo numero è diverso dalla rappresentazione di 0.1 e vale: "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.10000000894069672"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32todec(0, \"10011001100110011001110\", \"01111011\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0.3 - 0.2 - 0.1"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7.4505806e-09"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32(0.3) - float32(0.2) - float32(0.1)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 01100100 (1.)00000000000000000000000\n"
]
}
],
"source": [
"print show_float(float32(0.3) - float32(0.2) - float32(0.1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Partiamo dal risultato precedente. Possiamo sottrarre le due mantissa perché gli esponenti sono già uguali:\n",
" \n",
" 0.30-0.20 0 01111011 (1.)10011001100110011001110\n",
" 0.10 0 01111011 (1.)10011001100110011001101\n",
" 0.3-0.2-01 0 01111011 (0.)00000000000000000000001\n",
" 0.3-0.2-01 0 01100100 (1.)00000000000000000000000\n",
" \n",
"In questo caso G e R non avrebbero fatta nessuna differenza. Questo numero vale:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7.450580596923828e-09"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32todec(0, \"00000000000000000000000\", \"01100100\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Usando 64 bit (precisione doppia, double) si otterrebbe:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-2.7755575615628914e-17"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float64(0.3) - float64(0.2) - float64(0.1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0.3 + 0.1"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.40000001"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32(0.3) + float32(0.1)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 01111101 (1.)10011001100110011001101\n"
]
}
],
"source": [
"print show_float(float32(0.3) + float32(0.1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" 0.30 0 01111101 (1.)00110011001100110011010\n",
" 0.10 0 01111011 (1.)10011001100110011001101\n",
"\n",
"Bisogna rinormalizzare\n",
"\n",
" 0.30 0 01111101 (1.)00110011001100110011010\n",
" 0.10 0 01111101 (0.)0110011001100110011001101\n",
" 0.30+0.10 0 01111101 (1.)1001100110011001100110101\n",
" \n",
"E arrotondare\n",
"\n",
" 0.30+0.10 0 01111101 (1.)10011001100110011001101\n",
" \n",
"Questo numero vale: "
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.4000000059604645"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32todec(0, \"10011001100110011001101\", \"01111101\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0.4 - (0.3 + 0.1)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.0"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float32(0.4) - (float32(0.3) + float32(0.1))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 00000000 (0.)00000000000000000000000\n"
]
}
],
"source": [
"print show_float(float32(0.4) - (float32(0.3) + float32(0.1)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" 0.4 0 01111101 (1.)10011001100110011001101\n",
" 0.3+0.1 0 01111101 (1.)10011001100110011001101\n",
" 0.4-(0.3+0.1) 0 00000000 (0.)00000000000000000000000"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment