Duy Do duydo

{
"title": "Tweets Search",
"rows": [
{
"title": "Options",
"height": "50px",
"editable": true,
"collapse": false,
"collapsable": true,
"panels": [
#!/bin/sh
# Variables
USER="admin"
PASS="password"
# Assert Root User
SCRIPTUSER=`whoami`
if [ "$SCRIPTUSER" != "root" ]
then
@duydo
duydo / twitter_mapping.sh
Created October 17, 2013 09:52
Preserving special characters when tokenizing Twitter messages with Elasticsearch
curl -XPUT 'http://localhost:9200/twitter' -d '{
"settings" : {
"index" : {
"number_of_shards" : 1,
"number_of_replicas" : 1
},
"analysis" : {
"filter" : {
"tweet_filter" : {
"type" : "word_delimiter",
# ========================================
# Testing n-gram analysis in ElasticSearch
# ========================================
curl -X DELETE localhost:9200/ngram_test
curl -X PUT localhost:9200/ngram_test -d '
{
"settings" : {
"index" : {
"analysis" : {
#!/usr/bin/env python
"""
Example of fetching followers of multiple Twitter accounts recursively, using twitterspawn.
"""
import atexit
import json
import twitterspawn
@duydo
duydo / s3delete.py
Created October 26, 2013 16:15 — forked from jerem/s3delete.py
#!/usr/bin/env python
import gevent.monkey
gevent.monkey.patch_all()
import sys
import optparse
import gevent
from boto.s3.connection import S3Connection

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
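
To put the figures above in proportion, here is a minimal Python sketch that expresses each latency as a multiple of a chosen baseline. The values are copied from the table; the names `LATENCY_NS`, `relative_to`, and `to_microseconds` are hypothetical helpers introduced only for this illustration.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
}


def relative_to(baseline, table=LATENCY_NS):
    """Return each latency as a multiple of the baseline entry."""
    base = table[baseline]
    return {name: ns / base for name, ns in table.items()}


def to_microseconds(ns):
    """Convert nanoseconds to microseconds."""
    return ns / 1_000


if __name__ == "__main__":
    for name, factor in relative_to("L1 cache reference").items():
        print(f"{name}: {factor:,.0f}x an L1 cache reference")
```

Dividing by the L1 cache entry makes the gaps easier to compare than the raw numbers: a main memory reference is 200 times slower, and an SSD random read is 300,000 times slower.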

@duydo
duydo / simplefulltextsearch.py
Created November 13, 2013 08:11
Simple full-text search in memory
# -*- coding: utf-8 -*-
import re
import shlex
def search(query):
    pieces = shlex.split(query.encode('utf-8'))
    include, or_include, exclude = [], [], []
    for piece in pieces:
        p = piece.decode('utf-8')
        if p.startswith('-'):
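
The preview above is cut off, so the rest of the function is unknown. As a rough Python 3 sketch of the same idea, the code below splits a query with `shlex` and filters an in-memory list of strings. It assumes that a leading `-` marks a term to exclude, which the truncated snippet suggests but does not confirm. The `or_include` list is left out because its syntax is not visible, and `parse_query` and the `documents` argument are hypothetical names.

```python
import shlex


def parse_query(query):
    """Split a query into required and excluded terms.

    Assumption: a leading '-' marks a term to exclude; every other term is
    treated as required. Quoted phrases stay together thanks to shlex.
    """
    include, exclude = [], []
    for piece in shlex.split(query):
        if piece.startswith('-') and len(piece) > 1:
            exclude.append(piece[1:].lower())
        else:
            include.append(piece.lower())
    return include, exclude


def search(query, documents):
    """Return documents containing every required term and no excluded term.

    Matching is a case-insensitive substring test, not real tokenization.
    """
    include, exclude = parse_query(query)
    results = []
    for doc in documents:
        text = doc.lower()
        has_all = all(term in text for term in include)
        has_excluded = any(term in text for term in exclude)
        if has_all and not has_excluded:
            results.append(doc)
    return results


if __name__ == "__main__":
    docs = ["Simple full text search", "full text search with redis"]
    print(search('"full text" -redis', docs))
```

Substring matching keeps the sketch short. A real index would tokenize the documents once and look terms up instead of scanning every string per query.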
#!/bin/bash
# from here: http://www.codingsteps.com/install-redis-2-6-on-amazon-ec2-linux-ami-or-centos/
# and here: https://raw.github.com/gist/257849/9f1e627e0b7dbe68882fa2b7bdb1b2b263522004/redis-server
###############################################
# To use:
# wget https://raw.github.com/gist/2776679/04ca3bbb9f085b192f6aca945120fe12d59f15f9/install-redis.sh
# chmod 777 install-redis.sh
# ./install-redis.sh
###############################################
echo "*****************************************"
@duydo
duydo / gist:7701304
Created November 29, 2013 03:45 — forked from simonw/gist:104413
def extract_form_fields(self, soup):
    "Turn a BeautifulSoup form into a dict of fields and default values"
    fields = {}
    for input in soup.findAll('input'):
        # ignore submit/image with no name attribute
        if input['type'] in ('submit', 'image') and not input.has_key('name'):
            continue
        # single element name/value fields
        if input['type'] in ('text', 'hidden', 'password', 'submit', 'image'):
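
The preview ends before the fields are stored. The following is a minimal, self-contained sketch of the same idea. It swaps BeautifulSoup for the standard library's `html.parser.HTMLParser`, so it is not the gist's implementation. It handles only `<input>` elements of the simple types listed above, and the class name `FormFieldParser` is hypothetical. Two behaviors are assumptions of this sketch: unnamed inputs are skipped, and a missing `type` is treated as `text`.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name -> default value for simple <input> elements."""

    SIMPLE_TYPES = ('text', 'hidden', 'password', 'submit', 'image')

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        name = attributes.get('name')
        if name is None:
            # Assumption: inputs without a name carry no form data.
            return
        # Assumption: a missing type attribute means a text input.
        input_type = (attributes.get('type') or 'text').lower()
        if input_type in self.SIMPLE_TYPES:
            self.fields[name] = attributes.get('value') or ''


def extract_form_fields(html):
    """Return a dict of field names and default values found in `html`."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    page = '<form><input type="text" name="q" value="redis"><input type="submit"></form>'
    print(extract_form_fields(page))
```

Checkboxes, radio buttons, `<select>`, and `<textarea>` are out of scope here and would each need their own handling.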