@dodijk
Last active August 29, 2015 14:00
xTAS Tutorial
{
"metadata": {
"name": "",
"signature": "sha256:66069ca39917016656275aafa77f1d949a5a7f91d78e2f081425429bf249ed41"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# xtas, the eXtensible Text Analysis Suite\n",
"\n",
"This a tutorial for xtas, a distributed text analysis package based on Celery and Elasticsearch. Assuming you\u2019ve [properly configured and started xtas as described in the setup](http://xtas.net/setup.html), here\u2019s how to do interesting work with it.\n",
"\n",
"If you want to get a real head start, use this:\n",
"\n",
" curl https://www.rabbitmq.com/releases/rabbitmq-server/v3.3.0/rabbitmq-server-mac-standalone-3.3.0.tar.gz | tar xzf -\n",
"\n",
" curl https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.zip > elasticsearch-1.1.1.zip\n",
" unzip elasticsearch-1.1.1.zip\n",
" rm elasticsearch-1.1.1.zip\n",
"\n",
" easy_install -U xtas ipython\n",
"\n",
" curl http://qwone.com/~jason/20Newsgroups/20news-bydate.tar.gz | tar xzf -\n",
"\n",
" screen -S xTAS -t iPython bash -c \"\\\n",
" screen -t ElasticSearch elasticsearch-1.1.1/bin/elasticsearch; \\\n",
" screen -t RabbitMQ rabbitmq_server-3.3.0/sbin/rabbitmq-server; \\\n",
" sleep 5s; \\\n",
" screen -t Celery celery -A xtas.tasks worker --loglevel=info; \\\n",
" ipython notebook\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting started with xTAS\n",
"\n",
"First, you need a document collection. If you don\u2019t have one already, download the 20newsgroups dataset:\n",
"\n",
" $ curl http://qwone.com/~jason/20Newsgroups/20news-bydate.tar.gz | tar xzf -\n",
" \n",
"Store the documents in Elasticsearch:"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"from elasticsearch import Elasticsearch\n",
"import os\n",
"es = Elasticsearch()\n",
"files = (os.path.join(d, f) for d, _, fnames in os.walk('20news-bydate-train') for f in fnames)\n",
"\n",
"for i, f in enumerate(files):\n",
" body = {'text': open(f).read().decode('utf-8', errors='ignore')}\n",
" es.create(index='20news', doc_type='post', body=body, id=i)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can run named-entity recognition on the documents. Let\u2019s try it on one document:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from xtas.tasks import es_document, stanford_ner_tag\n",
"doc = es_document('20news', 'post', 1, 'text')\n",
"tagged = stanford_ner_tag(doc)\n",
"persons = [token for token, tag in tagged[:-1] if tag == 'PERSON']\n",
"print persons"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"['Huxley']\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We just fetched the document from ES to run Stanford NER locally. That\u2019s not the best we can do, so let\u2019s run it remotely. We can do so by running the stanford_ner_tag tasks asynchronously. First, observe that doc isn\u2019t really the document: it\u2019s only a handle on the ES index:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"doc"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"{'field': 'text', 'id': 1, 'index': '20news', 'type': 'post'}"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This handle can be sent over the wire to make Stanford NER run in the worker:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"result = stanford_ner_tag.apply_async([doc])\n",
"print result\n",
"persons = [token for token, tag in result.get()[:-1] if tag == 'PERSON']\n",
"print persons"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"3949d141-65fe-4bf7-9c9b-1734e7b55f46\n",
"[u'Huxley']"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have the same result, but now from a worker process. The result object is an AsyncResult returned by Celery; see its documentation for full details."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Batch tasks\n",
"\n",
"Some tasks require a batch of documents to work; an example is topic modeling. Such tasks are available in the xtas.tasks.cluster package, so named because most of the tasks can be considered a form of clustering. Batches of documents are addressed using Elasticsearch queries, which can be performed using xtas. For example, to search for the word \u201chello\u201d in the 20news collection:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from xtas.tasks.es import fetch_query_batch\n",
"hello = fetch_query_batch('20news', 'post', {'term': {'text': 'hello'}}, 'text')\n",
"len(hello)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 8,
"text": [
"10"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This fetches the 'text' field of the documents that match the query. ('text' appears twice since you might want to match on the title, but retrieve the body text, etc.)\n",
"\n",
"Now we can fit a topic model to these document. (You need the gensim package for this, pip install gensim.) Try:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from xtas.tasks.cluster import lda\n",
"from pprint import pprint\n",
"pprint(lda(hello, 2))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stderr",
"text": [
"WARNING:gensim.models.ldamodel:no word id mapping provided; initializing from corpus, assuming identity\n"
]
},
{
"output_type": "stream",
"stream": "stderr",
"text": [
"WARNING:gensim.models.ldamodel:too few updates, training might not converge; consider increasing the number of passes or iterations to improve accuracy\n"
]
},
{
"output_type": "stream",
"stream": "stderr",
"text": [
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[[(u'appreciated', 0.12328608249216905),\n",
" (u'any', 0.11360959019717877),\n",
" (u'am', 0.11346935940774378),\n",
" (u'an', 0.11059267457307038),\n",
" (u'anyone', 0.1069785285068582),\n",
" (u'are', 0.10287441970083552),\n",
" (u'and', 0.096730589102969625),\n",
" (u'advance', 0.085980256227530305),\n",
" (u'all', 0.07987723291538068),\n",
" (u'about', 0.066601266876263734)],\n",
" [(u'am', 0.1299002385226545),\n",
" (u'all', 0.12021676411725173),\n",
" (u'an', 0.10584355046110787),\n",
" (u'are', 0.10407610204924428),\n",
" (u'appreciated', 0.10026179932754227),\n",
" (u'anyone', 0.099202462699055069),\n",
" (u'any', 0.093195320941297441),\n",
" (u'and', 0.086258048978009677),\n",
" (u'advance', 0.085870139835319298),\n",
" (u'about', 0.075175573068518006)]]\n"
]
},
{
"output_type": "stream",
"stream": "stderr",
"text": [
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n",
"/Users/dodijk/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/gensim/models/ldamodel.py:636: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future\n",
" score += numpy.sum(cnt * logsumexp(Elogthetad + Elogbeta[:, id]) for id, cnt in doc)\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, the lda task returns (term, weight) pairs for two topics. Admittedly, the topics aren\u2019t very pretty on this small set.\n",
"\n",
"Of course, fetching the documents and running the topic model locally isn\u2019t optimal use of xtas. Instead, let\u2019s set up a chain of tasks that runs the query and fetches the results on a worker node, then runs the topic model remotely as well. We\u2019ll use Celery syntax to accomplish this:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from celery import chain\n",
"fetch = fetch_query_batch.s('20news', 'post', {'term': {'text': 'hello'}}, 'text')\n",
"fetch_lda = chain(fetch, lda.s(k=2)) # make a chain\n",
"result = fetch_lda() # run the chain\n",
"pprint(result.get()) # get results and display them"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[[[u'am', 0.12763838164061253],\n",
" [u'an', 0.11986846195572962],\n",
" [u'appreciated', 0.11742021029668612],\n",
" [u'any', 0.11334178036225759],\n",
" [u'anyone', 0.09739236374728127],\n",
" [u'all', 0.09161215723781733],\n",
" [u'are', 0.09091426948247729],\n",
" [u'and', 0.08764095117066052],\n",
" [u'about', 0.0779921338065763],\n",
" [u'advance', 0.07617929029990156]],\n",
" [[u'am', 0.1161120040945202],\n",
" [u'are', 0.11568383051385608],\n",
" [u'all', 0.10874223377922959],\n",
" [u'anyone', 0.10852413229369665],\n",
" [u'appreciated', 0.10600339193322801],\n",
" [u'an', 0.09684893723119241],\n",
" [u'advance', 0.09538455904030751],\n",
" [u'and', 0.0951032657983954],\n",
" [u'any', 0.09349728073053906],\n",
" [u'about', 0.06410036458503529]]]\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"More details on creating chains can be found in the Celery userguide.\n",
"\n",
"Storing results\n",
"We just saw how to run jobs remotely, fetching documents from an Elasticsearch index. What is even more interesting is that we can also store results back to ES, so we can use xtas as preprocessing for a semantic search engine.\n",
"\n",
"We can use the store_single task to run NER on a document from the index and store the result back, if we append it to our chain:\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from xtas.tasks.es import store_single\n",
"doc = es_document('20news', 'post', 3430, 'text')\n",
"ch = chain(stanford_ner_tag.s(doc, output=\"names\"),\n",
" store_single.s('ner', doc['index'], doc['type'], doc['id']))\n",
"result = ch()\n",
"pprint(result.get())"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[[u'Ohio State University Lines', u'ORGANIZATION'],\n",
" [u'Network and System Administration User', u'ORGANIZATION'],\n",
" [u'Bruce Webster', u'PERSON'],\n",
" [u'US', u'LOCATION']]\n"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"result.get() will now report the output from the NER tagger, but getting it locally is not what we\u2019re after. The store_single task has also stored the result back into the document, as you can verify with:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pprint(es.get('20news', 3430)['_source']['xtas_results'])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"{u'ner': {u'data': [[u'Ohio State University Lines', u'ORGANIZATION'],\n",
" [u'Network and System Administration User',\n",
" u'ORGANIZATION'],\n",
" [u'Bruce Webster', u'PERSON'],\n",
" [u'US', u'LOCATION']],\n",
" u'timestamp': u'2014-04-29T18:55:29.409601'}}\n"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This result can now be used in ES queries."
]
}
],
"metadata": {}
}
]
}
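
The chains built in the notebook above rely on one rule: the return value of each task is passed as the first argument of the next task, while arguments bound in advance (as with `lda.s(k=2)`) stay fixed. The following is a minimal plain-Python sketch of that rule, not Celery or xtas code. The names `run_chain`, `fetch_docs` and `count_terms` are hypothetical stand-ins for illustration, the corpus is made up, and everything runs locally in one process rather than on worker nodes.

```python
from functools import partial


def run_chain(first, *rest):
    """Call `first`, then feed each result to the next step as its first argument."""
    value = first()
    for step in rest:
        value = step(value)
    return value


# Hypothetical stand-in for a fetch task: return the texts that contain the query term.
def fetch_docs(index, doc_type, query, field):
    corpus = {"20news": ["hello world", "hello again", "goodbye"]}  # made-up data
    term = query["term"][field]
    return [text for text in corpus[index] if term in text.split()]


# Hypothetical stand-in for an analysis task: the k most frequent terms.
def count_terms(docs, k):
    counts = {}
    for doc in docs:
        for word in doc.split():
            counts[word] = counts.get(word, 0) + 1
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Bind the arguments up front, much like creating a signature with .s(...).
fetch = partial(fetch_docs, "20news", "post", {"term": {"text": "hello"}}, "text")
result = run_chain(fetch, partial(count_terms, k=2))
print(result)
```

In the real setup, Celery serializes each intermediate result and hands it to a worker, so the documents never need to travel back to the notebook between steps.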