{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Some feedforward neural networks using Keras\n",
"\n",
"In [a previous article](http://www.orbifold.net/default/2016/11/18/bare-bones-of-neural-networks/) I explained the way I see neural networks and gave some basic examples. Personally I believe in 'simple examples' as a way to comprehend crucial principles and this article continues in this fashion. By looking at a single cell the activation functions are highlighted and it's shown that picking the most appropriate one can be done using grid-search. Along the way you can see that simple feedforward networks are a way to dissipate noise and that neural networks are really just functions.\n",
"Like the previous article, the examples are on top of the Keras framework but you can recreate all of this in TensorFlow, Caffe or any other neural framework. \n",
"\n",
"Feedforward networks are fairly easy but can nevertheless produce great results. One would sometimes forget, considering what the internet is buzzing about, that not everything needs to be casted in convolutional and/or recurrent topologies.\n",
"\n",
"None of the examples require GPU or datacenters, the synthetic or artificial data is designed to highlight a particular aspect and not a real-world case.\n",
"\n",
"\n",
"## Counting from 0 to 9\n",
"\n",
"Let's start with learning a network to count from 0 to 9. The code predicts the next number for a given sequence of the previous numbers. The number 9 is followed by 0 in cycles."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from keras.datasets import imdb\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"\n",
"from keras.preprocessing import sequence\n",
"from keras.utils import np_utils\n",
"\n",
"base_series = [0,1,2,3,4,5,6,7,8,9]\n",
"series = base_series*10\n",
"seq_length = len(base_series)\n",
"X = []\n",
"Y = []\n",
"def unit(index): return [1.0 if i == index else 0.0 for i in range(seq_length)]\n",
"# make buckets\n",
"for i in range(0, len(series) - seq_length, 1):\n",
" X.append(series[i:i + seq_length])\n",
" Y.append(unit(np.mod(i, seq_length)))\n",
"X = np.array(X)\n",
"Y = np.array(Y)\n",
"\n",
"\n",
"model = Sequential()\n",
"\n",
"model.add(Dense(seq_length, input_dim=X.shape[1], init='normal', activation='softmax'))\n",
"# try alternatives if you wish\n",
"#model.add(Dense(30,input_dim=X.shape[1], activation=\"relu\", init='normal'))\n",
"#model.add(Dense(seq_length, init='normal', activation='softmax'))\n",
"\n",
"model.compile(loss='mean_absolute_error', optimizer='rmsprop', metrics=['accuracy'])\n",
"model.fit(X, Y, nb_epoch=350, verbose=0)\n",
"scores = model.evaluate(X, Y, verbose=0)\n",
"print(\"Model Accuracy: %.2f%%\" % (scores[1]*100))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the data is partitioned in buckets so the to-be-predicted number is not based on a single digit but on a bucket of digits. When data has some time-like ordering one typically uses networks with memories aka recurrence but this bucket approach works just as well in simple situations (i.e. no variations and few features). You should try to make the same prediction network with only a single number. The bucket approach is useful as a preparation for recurrent networks where one typically has this step-backward situation.\n",
"\n",
"## Bit shift operator\n",
"\n",
"Like the counting example we take some binary buckets and shift the 1-bits to the right."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from keras.datasets import imdb\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"from keras.layers import LSTM\n",
"from keras.layers.embeddings import Embedding\n",
"from keras.preprocessing import sequence\n",
"from keras.utils import np_utils\n",
"\n",
"\n",
"X = []\n",
"Y = []\n",
"train_size = 50\n",
"seq_length = 5\n",
"def unit(index): return [1.0 if i == index else 0.0 for i in range(seq_length)]\n",
"for i in range(train_size):\n",
" X.append(unit(np.mod(i, seq_length)) )\n",
" Y.append(unit(np.mod(i+1, seq_length)))\n",
"X = np.array(X)\n",
"Y = np.array(Y)\n",
"#print(X.shape, Y.shape)\n",
"\n",
"model = Sequential()\n",
"model.add(Dense(20,input_dim=X.shape[1], activation=\"relu\", init='normal'))\n",
"model.add(Dense(20, activation=\"relu\", init='normal'))\n",
"model.add(Dense(seq_length, init='normal', activation='softmax'))\n",
"\n",
"model.compile(loss='mean_absolute_error', optimizer='rmsprop', metrics=['accuracy'])\n",
"model.fit(X, Y, nb_epoch=350, verbose=0)\n",
"scores = model.evaluate(X, Y, verbose=0)\n",
"print(\"Model accuracy: %.2f%%\" % (scores[1]*100))\n",
"print(\"Model loss: %.2f%%\" % (scores[0]*100))\n",
"# you can see what the network does to the whole training data by means of\n",
"# model.predict(X)\n",
"# to see the output of a single vector you can use\n",
"model.predict(np.array([[0,1,0,0,0]]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output is not precise but you can truncate it and plot it to see it more clearly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"%matplotlib inline\n",
"s = np.array([[0,1,0,0,0]])\n",
"plt.imshow(np.concatenate( (s, model.predict(s)) ), interpolation='nearest', cmap=plt.cm.Greys)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the score you can see that all solutions give accuracy 100% but the loss differs:\n",
"single: 20.62%\n",
"one extra: 4.05%\n",
"two extra: 5.82%\n",
"one (20): 0.24%\n",
"two (20): 0.00%\n",
"\n",
"So, you don't need to increase complexity in order to achieve accuracy but the signal will be more sharp if you do.\n",
"\n",
"The truncation to integers can also be achieved by means of custom layers. Below you can find the `Round` layer which does precisely this.\n",
"\n",
"\n",
"# Neurons as functions\n",
"\n",
"A single neuron, node or cell is just a function and if you play a bit with the API you can also visualize the various activation functions.\n",
"Let's explicitly assign weights to a single cell thus preventing this to affect the output:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"we = [np.array([[0.8]]), np.array([0.])]\n",
"model = Sequential()\n",
"model.add(Dense(1, input_dim=1, weights=we))\n",
"model.summary()\n",
"model.layers[0].get_weights()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that if no activation is specified it will default to linear and that compiling a network will typically assign random weights to the nodes. Details of how the activations are effectively implemeted (really straightforward) can be found [here](https://github.com/fchollet/keras/blob/016d85c9e6d8a36fe7107e32752f6a9cd8d77c86/keras/activations.py) but you can also easily plot the activation functions by using the single cell as a functions: "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"we = [np.array([[2.]]), np.array([0.])]\n",
"def pred(name):\n",
" model = Sequential()\n",
" model.add(Dense(1, input_dim=1, weights=we, activation=name))\n",
" return model.predict(np.array([[i] for i in np.arange(-2,2,.1)]))\n",
"\n",
"f, ar = plt.subplots(2, 2, sharey=True)\n",
"plt.ylim(-.1,1.1)\n",
"ar[0,0].plot(pred(\"hard_sigmoid\"))\n",
"ar[0,0].set_title('hard_sigmoid')\n",
"ar[0,1].plot(pred(\"relu\"))\n",
"ar[0,1].set_title('relu')\n",
"ar[1,0].plot(pred(\"sigmoid\"))\n",
"ar[1,0].set_title('sigmoid')\n",
"plt.subplots_adjust(top=1.5)\n",
"ar[1,1].plot(pred(\"tanh\"))\n",
"ar[1,1].set_title('tanh')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want a custom activation function you can simply plug in your own function instead of a name (string). Whether something like the sine below makes sense is of course another matter, but you can indeed use anything you like:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def custom(x): \n",
" return np.sin(x)**4\n",
"we = [np.array([[2.]]), np.array([0.])]\n",
"model = Sequential()\n",
"model.add(Dense(1, input_dim=1, weights=we, activation=custom))\n",
"pred = model.predict(np.array([[i] for i in np.arange(-2,2,.1)]))\n",
"plt.ylim(-.1,1.1)\n",
"plt.plot(pred)"
]
},
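{
"cell_type": "markdown",
"metadata": {},
"source": [
"Depending on the backend, applying a numpy function directly to a symbolic tensor may not work; a safer variant (a sketch added here for illustration, not part of the original gist) writes the custom activation with the Keras backend API instead. The name `custom_backend` is made up for this example."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# hedged sketch: the same custom activation expressed with backend operations,\n",
"# so it works on symbolic tensors regardless of the backend\n",
"import keras.backend as K\n",
"def custom_backend(x):\n",
"    return K.sin(x)**4\n",
"model = Sequential()\n",
"model.add(Dense(1, input_dim=1, weights=we, activation=custom_backend))\n",
"pred_backend = model.predict(np.array([[i] for i in np.arange(-2,2,.1)]))\n",
"plt.ylim(-.1,1.1)\n",
"plt.plot(pred_backend)"
]
},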
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that activation can be added separately to the model like so"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from keras.layers.core import Activation\n",
"model = Sequential()\n",
"model.add(Dense(1, input_dim=1, weights=we))\n",
"model.add(Activation(custom))\n",
"pred = model.predict(np.array([[i] for i in np.arange(-2,2,.1)]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, there are also **advanced activation functions** in Keras which are there for specific tasks. Though you can use them like any other activation function, they work well for image-oriented learning. For instance, the parametric rectified linear unit or [PReLu](https://keras.io/layers/advanced_activations/) function was invented to [surpass human-level performance on ImageNet classification](https://arxiv.org/pdf/1502.01852v1.pdf).\n",
"\n",
"# Picking the most appropriate activation functions\n",
"The activation functions seem to be only slightly different but they actually do make a big difference. In the example below we have some artificial data consisting of a line with a bump in the middle together with some noise and try to make the neural network learn the shape of the curve. You can see from the plot below that relu does a much lesser job than the hyperbolic tangent activation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"%matplotlib inline\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"from keras.preprocessing import sequence\n",
"from keras.utils import np_utils\n",
"from keras.optimizers import *\n",
"def shakenGauss(x): return x + 5*np.exp(-(x+2)**2)+0.1*np.random.randn()\n",
"shakenGauss = np.vectorize(shakenGauss)\n",
"\n",
"X = np.arange(-7, 7, 0.005)\n",
"Y = shakenGauss(X)\n",
"plt.plot(X,Y)\n",
"# try to use other optimizers to see what it gives\n",
"# here the stochastic gradient descent\n",
"#https://github.com/fchollet/keras/blob/f127b2f81d5d71fa9ab938ba6f42866d31864259/keras/optimizers.py#L114\n",
"# lr: learning rate or how fast the minimum is reached\n",
"opt = SGD(lr=0.001)\n",
"\n",
"def fit(activationName):\n",
" model = Sequential()\n",
" model.add(Dense(10,input_dim=1)) \n",
" model.add(Dense(10, activation=activationName))\n",
" model.add(Dense(1))\n",
" model.compile(loss='mean_absolute_error', optimizer=opt, metrics=['accuracy'])\n",
" model.fit(X, Y, nb_epoch=800, verbose=0)\n",
" return model\n",
"\n",
"model1 = fit(\"tanh\")\n",
"pred1 = model1.predict(X)\n",
"# metrics from the evaluate process can be fetched from model1.metrics_names\n",
"print(\"\\ntanh loss: %s \"%model1.evaluate(X,Y)[0])\n",
"plt.plot(X, pred1, color=\"orange\")\n",
"\n",
"model2 = fit(\"relu\")\n",
"pred2 = model2.predict(X)\n",
"print(\"\\nrelu loss: %s \"%model2.evaluate(X,Y)[0])\n",
"plt.plot(X, pred2, color=\"red\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How can one optimize this and pick up the most appropriate activation? You can loop over the various activations or use the sklearn wrapper for Keras which allows you to use Keras networks as machine learning models in sklearn."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [],
"source": [
"import numpy\n",
"from sklearn.grid_search import GridSearchCV\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"from keras.wrappers.scikit_learn import KerasClassifier\n",
"from sklearn.metrics import make_scorer\n",
"def shakenGauss(x): return x + 5*np.exp(-(x+2)**2)+0.1*np.random.randn()\n",
"shakenGauss = np.vectorize(shakenGauss)\n",
"\n",
"X = np.arange(-5, 5, 0.05)\n",
"Y = shakenGauss(X)\n",
"def create_model(activationName):\n",
" model = Sequential()\n",
" model.add(Dense(10,input_dim=1)) \n",
" model.add(Dense(10, activation=activationName))\n",
" model.add(Dense(1))\n",
" model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])\n",
" model.fit(X, Y, nb_epoch=100, verbose=0)\n",
" return model\n",
"def overall_average_score(actual,prediction): \n",
" return np.average(np.abs(actual - prediction))\n",
"\n",
"model = KerasClassifier(build_fn=create_model, nb_epoch=100, batch_size=10, verbose=0)\n",
"activationNames = ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear']\n",
"param_grid = dict(activationName = activationNames)\n",
"grid_scorer = make_scorer(overall_average_score, greater_is_better=False)\n",
"grid = GridSearchCV(estimator = model, param_grid = param_grid, n_jobs=1, scoring=grid_scorer)\n",
"grid_result = grid.fit(X, Y) \n",
"\n",
"print(\"Best activation is '%s'.\" % grid_result.best_params_[\"activationName\"])\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scores can be seen from `grid_scores_`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [],
"source": [
"grid.grid_scores_"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can further refine the network with grid-searching the appropriate optimizer, loss function and pretty much every parameter (including the weights). Ain't it wonderful you can combine Keras and Scikit-learn? [Jason Brownlee](http://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/) has a great blog post on how to do all of this.\n",
"\n",
"The neural network (especially the low-loss ones) approximates the syntehtic function quite well and can see through the super-imposed noise. One could of course filter out the noise in other ways (chi-square or moving averages) but the fact that the network does this without explicitly encoding it is a nice feature on its own."
]
},
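{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, here is a minimal sketch (added for illustration, not part of the original gist) of grid-searching the optimizer with the same wrapper; the names `create_model_opt` and `optimizerName` are made up for this example, and the cell reuses `X`, `Y` and `overall_average_score` from the activation search above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# hedged sketch: grid-search the optimizer instead of the activation\n",
"def create_model_opt(optimizerName='adam'):\n",
"    model = Sequential()\n",
"    model.add(Dense(10, input_dim=1))\n",
"    model.add(Dense(10, activation='tanh'))\n",
"    model.add(Dense(1))\n",
"    model.compile(loss='mean_absolute_error', optimizer=optimizerName, metrics=['accuracy'])\n",
"    return model\n",
"\n",
"opt_model = KerasClassifier(build_fn=create_model_opt, nb_epoch=100, batch_size=10, verbose=0)\n",
"opt_grid = GridSearchCV(estimator=opt_model,\n",
"                        param_grid=dict(optimizerName=['sgd', 'rmsprop', 'adagrad', 'adam']),\n",
"                        n_jobs=1,\n",
"                        scoring=make_scorer(overall_average_score, greater_is_better=False))\n",
"opt_result = opt_grid.fit(X, Y)\n",
"print(\"Best optimizer is '%s'.\" % opt_result.best_params_[\"optimizerName\"])"
]
},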
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cellular automata\n",
"\n",
"[Cellular automata](https://en.wikipedia.org/wiki/Cellular_automaton) are in a way primitive neural networks in the sense that they encapsulate state machines which can be found inside e.g. LSTM nodes. From another angle, a cellular automata is just a (discrete) function and like any other function can be mimiced or approximated by neural nets. The rule 30 used below is [a world on its own](https://en.wikipedia.org/wiki/A_New_Kind_of_Science) and one could probably find interesting morphisms (same category?) between the world of automata and the world or neural networks.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# this outputs a piece of automata\n",
"def ca_data(rulenum:int = 30, height:int = 50, width:int = 20, dorandom:bool = True ): \n",
" if dorandom:\n",
" first_row = [np.random.randint(2) for i in range(width)]\n",
" else:\n",
" first_row = [0]*width\n",
" first_row[int(width/2)] = 1\n",
" results = [first_row] \n",
" rule = [int((30/pow(2,i)) % 2) for i in range(8)]\n",
"\n",
" for i in range(height-1):\n",
" data = results[-1] \n",
" new = [rule[4*data[(j-1)%width]+2*data[j]+data[(j+1)%width]] for j in range(width)]\n",
" results.append(new)\n",
" return results\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"%matplotlib inline\n",
"\n",
"plt.imshow(ca_data(), interpolation='nearest', cmap=plt.cm.Greys)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's try to use the data to train a dense network. Note that we define a custom layer to output bits and that it's really easy to add your own modules or layers. Like above, there are other ways to truncate data but this shows how you can plug into the API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from keras.datasets import imdb\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense\n",
"from keras.layers import LSTM\n",
"from keras.layers.embeddings import Embedding\n",
"from keras.preprocessing import sequence\n",
"from keras.utils import np_utils\n",
"from keras.callbacks import EarlyStopping\n",
"from keras.layers.core import Dense, Activation\n",
"from keras.models import Sequential\n",
"from keras.optimizers import SGD\n",
"from keras.layers import Layer\n",
"import keras.backend as K\n",
"\n",
"world_size = 20\n",
"data = ca_data(30, 1000, world_size, True)\n",
"X_train = np.array(data[:-1])\n",
"y_train = np.array(data[1:])\n",
"test_data = ca_data(30, 100, world_size, True)\n",
"X_test = np.array(test_data[:-1])\n",
"y_test = np.array(test_data[1:])\n",
"\n",
"# custom Keras layer to truncate floats to bits\n",
"class Round(Layer):\n",
" def get_output_shape_for(self, input_shape): \n",
" return input_shape\n",
" def call(self, x, mask=None): \n",
" return K.round(x)\n",
" \n",
"def build_and_train_mlp_network(X_train, y_train, X_test, y_test):\n",
"\n",
" nb_epoch = 600\n",
" batch_size = 10\n",
"\n",
" model = Sequential()\n",
" model.add(Dense(15, input_dim=X_train.shape[1], activation='sigmoid')) \n",
" model.add(Dense(20, activation='linear')) \n",
" model.add(Dense(20, activation='sigmoid')) \n",
" model.add(Dense(world_size, activation='sigmoid')) \n",
" model.add(Round()) \n",
" model.compile(loss='binary_crossentropy', optimizer=\"adam\") \n",
"\n",
" model.fit(X_train,\n",
" y_train,\n",
" batch_size=batch_size,\n",
" nb_epoch=nb_epoch,\n",
" verbose=0)\n",
" return model\n",
"\n",
"model = build_and_train_mlp_network(X_train, y_train, X_test, y_test)\n",
"#np.sum(np.abs(model.predict(X_test) - y_test))\n",
"plt.imshow(np.abs(model.predict(X_test) - y_test), interpolation='nearest', cmap=plt.cm.prism)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's clear that this approach is not successful. One way to proceed would be to engage recurrent or convolutional networks. The other is to model the actual rule and not the instances produced by the rule.\n",
"The following is a straighforward function prediction model with 100% accuracy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"np.random.seed(233)\n",
"ruleNumber = 30\n",
"rule = [int((ruleNumber/pow(2,i)) % 2) for i in range(8)]\n",
"X = []\n",
"Y = []\n",
"train_size = 400\n",
"X = np.random.randint(0,8, train_size) \n",
"Y = [rule[i] for i in X]\n",
"model = Sequential()\n",
"model.add(Dense(20, input_dim = 1, activation='hard_sigmoid'))\n",
"model.add(Dense(1, activation='tanh'))\n",
"#model.add(Round()) \n",
"model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])\n",
"model.fit(X, Y, nb_epoch=500, verbose=0)\n",
"scores = model.evaluate(X, Y, verbose=0)\n",
"print(\"Model accuracy: %.2f%%\" % (scores[1]*100))\n",
"print(\"Model loss: %.2f%%\" % (scores[0]*100))\n",
"#model.predict(X)\n"
]
},
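{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick check (a sketch added here for illustration, not part of the original gist), you can use the learned rule to evolve one row of the automaton and compare it with the rule applied directly; the rounding plays the role of the `Round` layer commented out above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# hedged sketch: evolve one step of the automaton with the learned rule\n",
"width = 20\n",
"row = [np.random.randint(2) for j in range(width)]\n",
"# neighbourhood code 4*left + 2*center + right, exactly as in ca_data above\n",
"codes = np.array([[4*row[(j-1)%width] + 2*row[j] + row[(j+1)%width]] for j in range(width)])\n",
"predicted = np.round(model.predict(codes)).astype(int).flatten()\n",
"expected = np.array([rule[c[0]] for c in codes])\n",
"print(\"predicted:\", predicted)\n",
"print(\"expected: \", expected)\n",
"print(\"match:\", np.array_equal(predicted, expected))"
]
},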
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reuters classification\n",
"\n",
"In this last example we pick up Reuters data which has been preprocessed (as part of the Keras framework). The original data consists of paragraphs but the words have already been embedded (mapped to vectors) and you can extract immediately training and test data.\n",
"\n",
"A word about dropout. This is a way to regularize networks and to suppress overfitting. It effectively switches off some of the neurons (in a random fashion) so that feedback does not affect all of the neurons all the time. Typically you will see that [a dropout of half of the nodes is a common approach](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf). Like everything else, some experimentation reveals what works best with the data and what you aim for."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from __future__ import print_function\n",
"import numpy as np\n",
"np.random.seed(1337) \n",
"\n",
"from keras.datasets import reuters\n",
"from keras.models import Sequential\n",
"from keras.layers import Dense, Dropout, Activation\n",
"from keras.utils import np_utils\n",
"from keras.preprocessing.text import Tokenizer\n",
"\n",
"max_words = 1000\n",
"batch_size = 100\n",
"nb_epoch = 200\n",
"\n",
"(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)\n",
"\n",
"\n",
"nb_classes = np.max(y_train)+1\n",
"\n",
"tokenizer = Tokenizer(nb_words=max_words)\n",
"X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')\n",
"X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')\n",
"\n",
"Y_train = np_utils.to_categorical(y_train, nb_classes)\n",
"Y_test = np_utils.to_categorical(y_test, nb_classes)\n",
"\n",
"model = Sequential()\n",
"model.add(Dense(512, input_shape=(max_words,), activation=\"relu\"))\n",
"model.add(Dropout(0.5))\n",
"model.add(Dense(nb_classes, activation=\"softmax\"))\n",
"model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\n",
"\n",
"history = model.fit(X_train, Y_train, nb_epoch=nb_epoch, batch_size=batch_size, verbose=0, validation_split=0.1)\n",
"score = model.evaluate(X_test, Y_test, batch_size=batch_size, verbose=1)\n",
"print(\"\\n\\nModel accuracy: %.2f%%\" % (score[1]*100))\n",
"print(\"Model loss: %.2f%%\" % (score[0]*100))\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This gives around 80% accuracy in very little time (couple of minutes). If you try the same dataset with XGBoost it will take quite a long time (around 10 minutes) for the same 80% accuracy:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy\n",
"import xgboost # might require 'pip install xgboost'\n",
"from sklearn import cross_validation\n",
"from sklearn.metrics import accuracy_score\n",
"(X_train, y_train), (X_test, y_test) = reuters.load_data(nb_words=max_words, test_split=0.2)\n",
"nb_classes = np.max(y_train)+1\n",
"\n",
"tokenizer = Tokenizer(nb_words=max_words)\n",
"X_train = tokenizer.sequences_to_matrix(X_train, mode='binary')\n",
"X_test = tokenizer.sequences_to_matrix(X_test, mode='binary')\n",
"\n",
"Y_train = np_utils.to_categorical(y_train, nb_classes)\n",
"Y_test = np_utils.to_categorical(y_test, nb_classes)\n",
"\n",
"model = xgboost.XGBClassifier()\n",
"model.fit(X_train, y_train)\n",
"\n",
"y_pred = model.predict(X_test)\n",
"predictions = [round(value) for value in y_pred]\n",
"\n",
"accuracy = accuracy_score(y_test, predictions)\n",
"print(\"Accuracy: %.2f%%\" % (accuracy * 100.0))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For sure one can tune both approaches but it shows that neural nets are not necessarily data and processing hungry in all cases and that neural networks are easy to play with. At least, if you use a framework like Keras. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [Root]",
"language": "python",
"name": "Python [Root]"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}