@jeremyfix
Created May 21, 2019 11:51
digits loader
# Data from: http://www.metz.supelec.fr//metz/personnel/frezza/ApprentissageNumerique/TP-MachineLearning/dig_app_text.cb
# The data are said to come from the UCI Machine Learning Repository,
# but I could not find the dataset there.
import numpy as np

digit_idx = 0
header = 5
width = 28
height = 28
num_digits = 4000
filename = 'dig_app_text.cb'

# One record per digit: the flattened pixels and a scalar class label.
# ('label', int) replaces ('label', int, 1), which NumPy flags as deprecated.
samples = np.zeros(num_digits, dtype=[('input', float, width*height), ('label', int)])

with open(filename) as digits:
    # Skip the header lines; afterwards `line` holds the first row of the first digit
    line = digits.readline()
    for i in range(header):
        line = digits.readline()
    while line:
        # Parse the digit: `height` rows of pixel values
        digit = []
        for i in range(height):
            digit += map(int, line.split())
            line = digits.readline()
        samples['input'][digit_idx, :] = np.array(digit) / 255.
        # Parse the label: the position of the 1 on the label line is the class
        label = list(map(int, line.split())).index(1)
        samples['label'][digit_idx] = label
        digit_idx += 1
        line = digits.readline()

np.save('dig_app_text.cb.npy', samples)

import matplotlib.pyplot as plt

# Show the first ten digits with their labels
fig, axes = plt.subplots(1, 10, figsize=(10, 2))
for i in range(10):
    # Rows were read one line at a time, so the image shape is (height, width)
    axes[i].imshow(samples['input'][i].reshape((height, width)), cmap='gray_r')
    axes[i].set_xticks([])
    axes[i].set_yticks([])
    axes[i].set_title('I\'m a {}'.format(samples['label'][i]))
plt.savefig('digits.png', bbox_inches='tight')
plt.show()
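
The layout the script appears to expect (a few header lines, then for each digit `height` rows of pixel values followed by one one-hot label line) can be tried out without the real file. The sketch below is only an illustration under that assumption: `parse_digits` and the tiny 3x2 sample are hypothetical and not part of the gist, and the real `dig_app_text.cb` may differ in details.

```python
import io
import numpy as np

def parse_digits(stream, header, width, height):
    """Parse digits from a text stream in the layout assumed above (hypothetical helper)."""
    # Skip the header lines
    for _ in range(header):
        stream.readline()
    inputs, labels = [], []
    while True:
        # Read the `height` pixel rows of one digit
        rows = [stream.readline() for _ in range(height)]
        if not rows[-1].strip():
            break  # end of file (or an incomplete trailing digit)
        pixels = [int(v) for row in rows for v in row.split()]
        if len(pixels) != width * height:
            raise ValueError('expected {} pixels, got {}'.format(width * height, len(pixels)))
        # The label line is one-hot: the position of the 1 is the class
        one_hot = [int(v) for v in stream.readline().split()]
        inputs.append(np.array(pixels) / 255.)
        labels.append(one_hot.index(1))
    return np.array(inputs), np.array(labels)

# Made-up sample: 2 header lines, two 3x2 "digits", 3 classes
text = """header line 1
header line 2
0 255 0
255 0 255
0 1 0
255 255 255
0 0 0
0 0 1
"""
inputs, labels = parse_digits(io.StringIO(text), header=2, width=3, height=2)
print(inputs.shape, labels)
```

Unlike the script, this version checks the pixel count of each digit, so a file that does not match the assumed layout fails early instead of silently filling the array with misaligned rows.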
@jeremyfix
Author

Download the data file dig_app_text.cb (its URL is given in the first comment of the script) and place it in the same directory as the script, then run:

python3 load_digits.py
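
Once the script has run, the saved structured array can be reloaded with `np.load` and split into its `input` and `label` fields. The sketch below uses a small stand-in array and a temporary file (`demo_digits.npy`, a hypothetical name) so that it runs without the data; with the real output you would load `dig_app_text.cb.npy` instead.

```python
import os
import tempfile
import numpy as np

width, height = 28, 28
dtype = [('input', float, (width * height,)), ('label', int)]

# Stand-in for the real data so the example runs without dig_app_text.cb
demo = np.zeros(3, dtype=dtype)
demo['label'] = [7, 2, 7]
demo['input'][0, :] = 0.5

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_digits.npy')  # hypothetical file name
    np.save(path, demo)
    samples = np.load(path)

X = samples['input']                       # shape (n_samples, width * height)
# ravel() leaves a 1-D label field unchanged and flattens an (n_samples, 1) one
y = samples['label'].ravel()
images = X.reshape(-1, height, width)      # one 2-D image per sample
counts = np.bincount(y, minlength=10)      # number of samples per class
print(X.shape, images.shape, counts)
```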

[Image: digits, the figure produced by the script]
