
@rushilgupta
Last active May 28, 2019 06:01
Recurrent Neural Nets and TensorFlow

Intro

There is a lot of buzz around neural nets and how good they are at image recognition and text recognition. There are plenty of resources out there for learning TensorFlow and applying it to these sets of problems.

I'd recommend a course offered by Google on Udacity which teaches all about this stuff: https://classroom.udacity.com/courses/ud730.

This course doesn't cover the details of the mathematics behind neural nets, and neither does this gist. However, the instructors do go into the intuition, breaking down a complex problem like visual recognition into abstract concepts and writing a program to predict some tensors and, ultimately, those concepts.

Through this gist I've tried to cover a very basic introduction to the TensorFlow framework, the data structures it uses, and its application in a predictive-text model.

What is a tensor?

A tensor is an N-dimensional array. You can declare one as:

import tensorflow as tf

scalar = tf.constant(10000) # 0-Dimension
vector = tf.constant([1, 2, 3, 4, 5]) # 1-Dimension
matrix = tf.constant([[1, 2, 3], [4, 5, 6]]) # 2-Dimension
cube = tf.constant([[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]) # 3-Dimension

print(cube)
print(matrix)
print(vector)
print(scalar)

When you print them, you get:

Tensor("Const_3:0", shape=(3, 3, 1), dtype=int32)
Tensor("Const_2:0", shape=(2, 3), dtype=int32)
Tensor("Const_1:0", shape=(5,), dtype=int32)
Tensor("Const:0", shape=(), dtype=int32)

It didn't print the tensors' values; what it printed instead are their properties.

e.g. cube is a 3-dimensional tensor with shape 3x3x1 of type int32.
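That's because in TensorFlow 1.x a tensor is a symbolic node in a computation graph, and you have to evaluate it in a session to see the actual numbers. A minimal sketch, using the `compat.v1` session API so it also runs under TensorFlow 2.x (where eager execution would otherwise print the values directly):

```python
import tensorflow as tf

# Use the TF 1.x graph-and-session style; tf.compat.v1 makes this
# sketch work under TF 2.x as well.
tf.compat.v1.disable_eager_execution()

matrix = tf.constant([[1, 2, 3], [4, 5, 6]])

with tf.compat.v1.Session() as sess:
    value = sess.run(matrix)  # evaluates the graph node to a plain numpy array

print(value)
```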

Why do we want it to flow?

Now, moving on to the 'flow' in TensorFlow.

Neural nets work by performing a mathematical compute step (say, a logistic regression) on these tensors.

  • Training: this compute step is performed over and over on the N-dimensional tensors until we get a desired output; this iterative process is called 'training'.
  • Testing: we keep training in iterations until the model predicts our 'test' data well enough (read: 'until we get the desired output').
  • Training data: input data converted to tensors, used for training.
  • Test data: input data converted to tensors, used for testing.

TensorFlow is all about converting data into tensors of the desired shape and training a model by 'flowing' those tensors through compute steps. Let's see if I can conjure a small example of a compute step:
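Here is what a single compute step, and the training loop that repeats it, look like, sketched in plain NumPy so the arithmetic is explicit (the toy AND dataset, the logistic-regression model, and names like `w`, `b`, and `learning_rate` are my own illustration, not from the course):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 samples with 2 features each, labelled with logical AND,
# converted to arrays (our 'tensors').
x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

w = np.zeros(2)       # weights to be learned
b = 0.0               # bias to be learned
learning_rate = 0.5

# 'Training': repeat the compute step until the model fits the data.
for _ in range(5000):
    pred = sigmoid(x @ w + b)                   # the compute step (logistic regression)
    grad = pred - y                             # gradient of the cross-entropy loss
    w -= learning_rate * (x.T @ grad) / len(y)  # nudge the weights downhill
    b -= learning_rate * grad.mean()

print((sigmoid(x @ w + b) > 0.5).astype(int))
```

TensorFlow does the same thing, except the forward pass is built as a graph of tensor operations and the gradients are derived automatically rather than written out by hand.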

The problem we are trying to solve

A memory cell

Word2Vec

Bringing it all together

The hackernews comment replier

Do I need neural nets?

This is actually a great question! In industry, a product sometimes just needs a bunch of hardcoded values, a weighted graph, or a statistical model involving something, eh, more superficial.

I am no ML ninja and my mathematics is pretty much at the undergraduate level, but you have to understand the requirements and then build a solution on them.

The Udacity course starts with a naive and relatively superficial visual recognition of characters (A-J) using logistic regression, where we get an accuracy of 87%. Then the course builds it into a deeper network for an accuracy of 95%, and with more tricks of the trade you get to 97%.

What I am getting at is: start with an easier solution (the KISS approach) and improve accuracy by adopting more extensive techniques.

Who knows, maybe all your product needs is text-similarity: http://multithreaded.stitchfix.com/blog/2017/10/18/stop-using-word2vec/

Footnote
