abishekk92/learn_fizzbuzz.txt

## learn_fizzbuzz.txt
I trained a 2-layer GRU encoder-decoder recurrent neural network to solve the well-known FizzBuzz problem.
With a maximum input length of 5 digits and 5K toy samples, the network reached 98% validation accuracy in 30 epochs.
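
For reference, here is a minimal sketch of how toy samples like these could be generated. It is not the original data pipeline: the names `fizzbuzz` and `make_samples`, the space padding, and the sampling range are all assumptions made for illustration. The only values taken from this note are the input length of 5 and the output length of 8 (the decoder length in the model summary, which is also the length of "fizzbuzz").

```python
import random

MAX_IN_LEN = 5   # max digits in the input number (from the note)
MAX_OUT_LEN = 8  # decoder steps in the model summary; also len("fizzbuzz")


def fizzbuzz(n: int) -> str:
    """Reference FizzBuzz label for a single integer."""
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


def make_samples(count: int, seed: int = 0):
    """Build (source, target) string pairs padded to fixed lengths.

    Assumption: inputs and targets are right-padded with spaces; the
    original padding scheme is not stated in the note.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        n = rng.randint(1, 10 ** MAX_IN_LEN - 1)
        source = str(n).ljust(MAX_IN_LEN)
        target = fizzbuzz(n).ljust(MAX_OUT_LEN)
        samples.append((source, target))
    return samples


if __name__ == "__main__":
    for source, target in make_samples(5):
        print(repr(source), "->", repr(target))
```

Each pair would still need to be one-hot encoded per character before being fed to a network of the shape summarized in this note.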

Model Summary
============

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
gru_1 (GRU)                      (None, 5, 128)        63744       gru_input_1[0][0]
____________________________________________________________________________________________________
gru_2 (GRU)                      (None, 128)           98688       gru_1[0][0]
____________________________________________________________________________________________________
repeatvector_1 (RepeatVector)    (None, 8, 128)        0           gru_2[0][0]
____________________________________________________________________________________________________
gru_3 (GRU)                      (None, 8, 128)        98688       repeatvector_1[0][0]
____________________________________________________________________________________________________
gru_4 (GRU)                      (None, 8, 128)        98688       gru_3[0][0]
____________________________________________________________________________________________________
timedistributed_1 (TimeDistribute(None, 8, 37)         4773        gru_4[0][0]
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 8, 37)         0           timedistributed_1[0][0]
====================================================================================================
Total params: 364581
____________________________________________________________________________________________________
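
The parameter counts in the summary can be checked by hand. The sketch below assumes the classic GRU formulation (three gates, each with an input kernel, a recurrent kernel, and one bias vector) and a per-timestep dense output layer. The helper names `gru_params` and `dense_params` are hypothetical, and the input feature size of 37 is inferred from the counts rather than stated in the note.

```python
def gru_params(input_dim: int, units: int) -> int:
    """Weights in a classic GRU layer: 3 gates x (input kernel + recurrent kernel + bias)."""
    return 3 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, output_dim: int) -> int:
    """Weights in a fully connected layer: kernel + bias."""
    return input_dim * output_dim + output_dim


VOCAB = 37   # assumption: inferred from the 37-wide output and the gru_1 count
UNITS = 128  # hidden size shown in the summary

layers = {
    "gru_1": gru_params(VOCAB, UNITS),            # encoder, reads one-hot characters
    "gru_2": gru_params(UNITS, UNITS),            # encoder, returns the final state only
    "gru_3": gru_params(UNITS, UNITS),            # decoder, fed the repeated state
    "gru_4": gru_params(UNITS, UNITS),            # decoder
    "timedistributed_1": dense_params(UNITS, VOCAB),  # per-step projection to characters
}
total = sum(layers.values())

if __name__ == "__main__":
    for name, count in layers.items():
        print(name, count)
    print("total", total)
```

Under these assumptions the per-layer counts come out to 63744, 98688, 98688, 98688, and 4773, for a total of 364581, which matches the summary. RepeatVector and Activation have no trainable weights.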

('Iteration', 6)
Train on 4500 samples, validate on 500 samples
Epoch 1/5
4500/4500 [==============================] - 5s - loss: 0.0672 - acc: 0.9824 - val_loss: 0.0950 - val_acc: 0.9737
Epoch 2/5
4500/4500 [==============================] - 5s - loss: 0.0569 - acc: 0.9848 - val_loss: 0.0834 - val_acc: 0.9767
Epoch 3/5
4500/4500 [==============================] - 5s - loss: 0.0442 - acc: 0.9903 - val_loss: 0.0806 - val_acc: 0.9768
Epoch 4/5
4500/4500 [==============================] - 5s - loss: 0.0363 - acc: 0.9922 - val_loss: 0.0774 - val_acc: 0.9763
Epoch 5/5
4500/4500 [==============================] - 5s - loss: 0.0322 - acc: 0.9937 - val_loss: 0.0665 - val_acc: 0.9817

Sample Results
=============

8024->8024
8024 is Correct


10370->buzz
buzz is Correct


5287->5287
5287 is Correct


96520->buzz
buzz is Correct


301->301
301 is Correct


914->914
914 is Correct


7054->7054
7054 is Correct


82650->fizzbuzz
fizzbuzz is Correct


9951->fizz
fizz is Correct


7314->fizz
fizz is Correct