Skip to content

Instantly share code, notes, and snippets.

@hrp
hrp / twitter.json
Created April 4, 2011 00:20
Example JSON response from Twitter streaming API
{
"text": "RT @PostGradProblem: In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during ...",
"truncated": true,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for iPhone</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54691802283900928",
@karmi
karmi / movie-titles.rb
Created January 13, 2013 20:42
Multiple analyzers and query fields in Elasticsearch for auto-completion
require 'tire'
# Tire.configure { logger STDERR, level: 'debug' }
Tire.index('movie-titles') do
delete
create \
settings: {
index: {
analysis: {
@nlothian
nlothian / Penn Treebank II Tags.md
Last active June 11, 2024 19:06
Penn Treebank II Tags
@konradkonrad
konradkonrad / es_features.py
Last active January 25, 2022 23:52
tfidf from elasticsearch
import elasticsearch
from math import log
def tfidf_matrix(es, index, doc_type, fields, size=10, bulk=500, query=dict(match_all=[])):
"""Generate tfidf for `size` documents of `index`/`doc_type`.
All `fields` need to have the mapping "term_vector": "yes".
This is the consuming version (i.e. get everything at once).
:param es: elasticsearch client
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.