I'm reading "Grokking Deep Learning" - I've enjoyed chapters 1-4, but Chapter 5's speed, mistakes and omitted material have led to a bit of a roadblock for me.
I believe I've managed to create a simple neural net that uses gradient descent to learn. But when I use the raw MNIST dataset, without transforming the pixel values from large numbers (numbers above 1) to binary (0, 1), I can't train my model: I get "double scalar" errors and all sorts of weird things (see below for examples).
I was wondering if you could look over my code and tell me where I've gone wrong (tweet me: @bunsen or email: auston.bunsen@gmail.com). If you'd like, you can use this code for free, under the MIT license.
@iamtrask thanks! One question, why do I have to do this?
Why doesn't the value for a pixel's darkness (?) from MNIST work out of the box?
# val = ord(i_f.read(1)) Why does this break it? Do I need to scale this to something between 0 and 1?
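
A minimal sketch of one likely reason, assuming a single linear neuron trained with squared error (this is not the original code; `scale_pixel`, `train`, the sample pixel values and the learning rate of 0.01 are all hypothetical). Each weight update is proportional to the input, so with raw byte values up to 255 the effective step size is multiplied by the sum of squared inputs, and gradient descent overshoots further on every step. That runaway growth is the sort of thing NumPy typically reports as overflow warnings mentioning double scalars. Dividing by 255 keeps the same learning rate stable.

```python
def scale_pixel(raw_byte):
    """Map a raw pixel byte (0-255) to a float in [0, 1]."""
    return raw_byte / 255.0


def train(inputs, target, alpha=0.01, steps=20):
    """Train one linear neuron with gradient descent.

    Returns the squared error measured on the final step.
    """
    weights = [0.0] * len(inputs)
    error = 0.0
    for _ in range(steps):
        # Forward pass: weighted sum of the inputs.
        pred = sum(w * x for w, x in zip(weights, inputs))
        delta = pred - target
        error = delta ** 2
        # Update: each weight moves by alpha * delta * its input,
        # so large inputs produce large steps.
        weights = [w - alpha * delta * x for w, x in zip(weights, inputs)]
    return error


# Hypothetical pixel bytes, as ord(i_f.read(1)) would yield integers in 0-255.
raw = [0, 128, 255, 64]
scaled = [scale_pixel(p) for p in raw]

raw_error = train(raw, 1.0)        # grows on every step (diverges)
scaled_error = train(scaled, 1.0)  # shrinks on every step

print("raw error:   ", raw_error)
print("scaled error:", scaled_error)
```

With the same learning rate, only the scaled run converges. A much smaller learning rate could also tame the raw inputs in this toy case, but scaling to the 0-1 range is the more common choice because it keeps the step size independent of the pixel encoding.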