
@rushilgupta
Last active May 28, 2019 06:01
Recurrent Neural Nets and TensorFlow

Intro

There is a lot of buzz around neural nets and how good they are at image recognition and text recognition. There are plenty of resources out there for learning TensorFlow and applying it to these sets of problems.

I'd recommend a course offered by Google on Udacity which teaches all about this stuff: https://classroom.udacity.com/courses/ud730.

This course doesn't cover the details of the mathematics behind neural nets, and neither does this gist. However, the instructors do go into the intuition, breaking down a complex problem like visual recognition into abstract concepts and writing a program to predict some tensors and, ultimately, those concepts.

Through this gist I've tried to cover a very basic introduction to the TensorFlow framework, the data structures it uses, and its application in a predictive-text model.

What is a tensor?

A tensor is an N-dimensional array. You can declare one as:

import tensorflow as tf

scalar = tf.constant(10000) # 0-Dimension
vector = tf.constant([1, 2, 3, 4, 5]) # 1-Dimension
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # 2-Dimension
cube = tf.constant([[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]) # 3-Dimension

print(cube)
print(matrix)
print(vector)
print(scalar)

When you print them, you get:

Tensor("Const_3:0", shape=(3, 3, 1), dtype=int32)
Tensor("Const_2:0", shape=(2, 3), dtype=int32)
Tensor("Const_1:0", shape=(5,), dtype=int32)
Tensor("Const:0", shape=(), dtype=int32)

It didn't print the tensors' values; what it printed instead are their properties.

e.g. cube is a 3-dimensional tensor with shape 3x3x1 of type int32.
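That's because in TensorFlow 1.x a tensor is a symbolic node in a computation graph, and you have to evaluate it in a session to see the actual numbers. A minimal sketch, using the `compat.v1` session API so it also runs under TensorFlow 2.x (where eager execution would otherwise print the values directly):

```python
import tensorflow as tf

# Use the TF 1.x graph-and-session style; tf.compat.v1 makes this
# sketch work under TF 2.x as well.
tf.compat.v1.disable_eager_execution()

matrix = tf.constant([[1, 2, 3], [4, 5, 6]])

with tf.compat.v1.Session() as sess:
    value = sess.run(matrix)  # evaluates the graph node to a plain numpy array

print(value)
```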

Why do we want it to flow?

Now, moving on to the 'flow' in TensorFlow.

Neural nets work by performing a mathematical compute step (say, a logistic regression) on these tensors.

  • Training: this compute step is performed over and over on the N-dimensional tensors until we get a desired output; this iterative process is called 'training'.
  • Testing: we keep training in iterations until the model predicts our 'test' data well enough (read: 'until we get the desired output').
  • Training data: input data converted to tensors, used for training.
  • Test data: input data converted to tensors, used for testing.

TensorFlow is all about converting data into tensors of the desired shape and training a model by 'flowing' those tensors through compute steps. Let's see if I can conjure a small example of a compute step:
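Here is what a single compute step, and the training loop that repeats it, look like, sketched in plain NumPy so the arithmetic is explicit (the toy AND dataset, the logistic-regression model, and names like `w`, `b`, and `learning_rate` are my own illustration, not from the course):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 samples with 2 features each, labelled with logical AND,
# converted to arrays (our 'tensors').
x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

w = np.zeros(2)       # weights to be learned
b = 0.0               # bias to be learned
learning_rate = 0.5

# 'Training': repeat the compute step until the model fits the data.
for _ in range(5000):
    pred = sigmoid(x @ w + b)                   # the compute step (logistic regression)
    grad = pred - y                             # gradient of the cross-entropy loss
    w -= learning_rate * (x.T @ grad) / len(y)  # nudge the weights downhill
    b -= learning_rate * grad.mean()

print((sigmoid(x @ w + b) > 0.5).astype(int))
```

TensorFlow does the same thing, except the forward pass is built as a graph of tensor operations and the gradients are derived automatically rather than written out by hand.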

The problem we are trying to solve

A memory cell

Word2Vec

Bringing it all together

The hackernews comment replier

Do I need neural nets?

This is actually a great question! In industry, a product sometimes just needs a bunch of hardcoded values, a weighted graph, or a statistical model involving something, eh, more superficial.

I am no ML ninja and my mathematics is pretty much at the undergraduate level, but you have to understand the requirements and then build a solution on them.

The Udacity course starts with a naive and relatively superficial visual recognition of characters (A-J) using logistic regression, where we get an accuracy of 87%. Then the course builds it into a deeper network for an accuracy of 95%, and with more tricks of the trade you get to 97%.

What I am getting at is: start with an easier solution (the KISS approach) and improve accuracy by adopting more extensive techniques.

Who knows, maybe all your product needs is text-similarity: http://multithreaded.stitchfix.com/blog/2017/10/18/stop-using-word2vec/

Footnote
