Duy Do duydo

@duydo
duydo / aggs_top_hits.json
Created December 3, 2014 10:52
elasticsearch aggs top_hits
curl -XGET "http://ext-es2.ssh.sentifi.com:9200/analytic/relevant_document/_search" -d'
{
  "size": 0,
  "query": {
    "range": {
      "date": {
        "from": "now-1h"
      }
    }
  },
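The gist is truncated after the range query, so the `aggs` section it is named for is missing. Below is a sketch of what the full request body likely looks like, built as a Python dict; the bucket name `by_source` and the grouping field `source` are assumptions for illustration, not the original gist's content.

```python
import json

# Hypothetical reconstruction of the truncated request body: group documents
# from the last hour into buckets and return the newest hit per bucket.
body = {
    "size": 0,
    "query": {
        "range": {"date": {"from": "now-1h"}}
    },
    "aggs": {
        "by_source": {                        # hypothetical bucket name
            "terms": {"field": "source"},     # hypothetical grouping field
            "aggs": {
                "latest": {
                    "top_hits": {
                        "size": 1,
                        "sort": [{"date": {"order": "desc"}}]
                    }
                }
            }
        }
    }
}

payload = json.dumps(body)
```

The `top_hits` metric aggregation nests inside a bucket aggregation such as `terms`; `size: 1` with a descending date sort keeps only the most recent document per bucket.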
@duydo
duydo / merge_tweet.py
Created November 26, 2014 08:28
Merge tweet
__author__ = 'duydo'
import re
from dateutil.parser import parse
RT_PATTERN = r'(RT|MT|retweet|from|via)((?:\b\W*@\w+)+)(:*)'
RT_REGEX = re.compile(RT_PATTERN, re.UNICODE | re.IGNORECASE)
@duydo
duydo / gfeed
Created October 23, 2014 03:46
Find RSS Feeds
#! /bin/sh
set -euf
export query="$1"
export data="$(
curl -sS "https://www.google.com/uds/GfindFeeds?v=1.0&q=$query"
)"
nodejs <<EOF
query = process.env.query
data = JSON.parse(process.env.data)
data.responseData.entries.forEach(function (entry, index) {
  // The gist is truncated here; printing each feed's title and URL
  // is a guess at the intended output.
  console.log((index + 1) + '. ' + entry.title + ' - ' + entry.url)
})
EOF
import sys
# gevent's monkey-patching should run before importing modules it affects
from gevent.monkey import patch_all; patch_all()
from gevent import server
from gevent.baseserver import _tcp_listener
from multiprocessing import Process, current_process, cpu_count

def note(format, *args):
    sys.stderr.write('[%s]\t%s\n' % (current_process().name, format % args))
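The snippet above combines gevent with `multiprocessing` in the pre-fork pattern: the parent binds the listening socket once, then worker processes accept on it. Since gevent may not be installed, here is a minimal stdlib sketch of the same pattern with plain sockets; it assumes a Unix fork-based `multiprocessing` start method, which lets the child inherit the listener.

```python
import socket
from multiprocessing import Process

def serve_one(listener):
    # Each worker accepts on the shared listener inherited from the parent.
    conn, _ = listener.accept()
    conn.sendall(b'echo: ' + conn.recv(1024))
    conn.close()

# Parent binds once; port 0 lets the OS pick a free port.
listener = socket.socket()
listener.bind(('127.0.0.1', 0))
listener.listen(5)
listener.settimeout(5)   # don't block forever if no client connects
port = listener.getsockname()[1]

worker = Process(target=serve_one, args=(listener,))
worker.start()

client = socket.create_connection(('127.0.0.1', port))
client.sendall(b'hi')
reply = client.recv(1024)
client.close()
worker.join()
```

A real pre-fork server would start `cpu_count()` workers, each looping on `accept()`; one worker and one connection are enough to show the socket sharing.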

I've been hacking away recently at a JVM framework for doing asynchronous, non-blocking applications using a variation of the venerable Reactor pattern. The core of the framework is currently in Java. I started with Scala then went with Java and am now considering Scala again for the core. What can I say: I'm a grass-is-greener waffler! :) But it understands how to invoke Groovy Closures, Scala anonymous functions, and Clojure functions, so you can use the framework directly without needing wrappers.

I've been continually micro-benchmarking this framework because I feel the JVM is a better foundation on which to build highly-concurrent, highly-scalable C100K applications than V8 or Ruby. The problem is that, so far, no good tools have existed for JVM developers to leverage the excellent performance and manageability of the JVM. This yet-to-be-publicly-released framework is an effort to give Java, Groovy, Scala, [X JVM language] developers access to an easy-to-use programming model that removes the necessity
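The Reactor pattern the paragraphs above describe can be sketched in a few lines (in Python here rather than on the JVM): handlers are registered per event type, and a dispatch loop demultiplexes queued events to them. The names below are illustrative, not the framework's actual API.

```python
from collections import defaultdict, deque

class Reactor:
    def __init__(self):
        self.handlers = defaultdict(list)   # event type -> callbacks
        self.queue = deque()                # pending (event type, payload)

    def on(self, event, callback):
        self.handlers[event].append(callback)

    def emit(self, event, payload):
        self.queue.append((event, payload))

    def run(self):
        # Dispatch until the queue drains; handlers may emit new events,
        # which is what makes the model composable.
        while self.queue:
            event, payload = self.queue.popleft()
            for callback in self.handlers[event]:
                callback(payload)

reactor = Reactor()
seen = []
reactor.on('data', seen.append)
reactor.emit('data', 'hello')
reactor.emit('data', 'world')
reactor.run()
```

Because callbacks are just callables, the same dispatch core can accept Groovy closures, Scala anonymous functions, or Clojure functions on the JVM, which is the interoperability point the author makes above.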

@duydo
duydo / timeout.py
Last active August 29, 2015 14:06 — forked from felipecruz/timeout.py
import signal

def signal_handler(signum, frame):
    raise Exception("Timed out!")

signal.signal(signal.SIGALRM, signal_handler)
signal.alarm(10)  # Ten seconds
try:
    long_function_call()
finally:
    signal.alarm(0)  # cancel the alarm if the call returned in time
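The same idea packaged as a context manager, which also restores the previous handler. SIGALRM is Unix-only, so this sketch will not work on Windows.

```python
import signal
import time
from contextlib import contextmanager

@contextmanager
def time_limit(seconds):
    def handler(signum, frame):
        raise TimeoutError("Timed out!")
    old = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                 # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)

# Demo: a 2-second sleep under a 1-second limit raises TimeoutError.
try:
    with time_limit(1):
        time.sleep(2)
    timed_out = False
except TimeoutError:
    timed_out = True
```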
# Don't use this.
import zmq
import os
class Worker:
def __init__(self):
print "parent: %d, pid: %d" % (os.getppid(), os.getpid())
self.pid = os.getppid()
self.context = zmq.Context()
self.sub = self.context.socket(zmq.SUB)
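The `Worker` above records its parent's pid on startup. Since pyzmq may not be installed, here is a small stdlib demonstration of that parent/child pid relationship using `multiprocessing`; it assumes a Unix-like platform.

```python
import os
from multiprocessing import Process, Queue

def report(q):
    # In the child, getppid() is the pid of the process that spawned us.
    q.put((os.getppid(), os.getpid()))

q = Queue()
p = Process(target=report, args=(q,))
p.start()
parent_pid, child_pid = q.get()
p.join()
```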
@duydo
duydo / pagination.md
Last active August 29, 2015 14:06 — forked from mislav/pagination.md

Pagination 101

Article by Faruk Ateş, [originally on KuraFire.net][original] which is currently down

One of the most commonly overlooked and under-refined elements of a website is its pagination controls. In many cases, these are treated as an afterthought. I rarely come across a website that has decent pagination, and it always makes me wonder why so few manage to get it right. After all, I'd say that pagination is pretty easy to get right. Alas, that doesn't seem to be the case, so after encouragement from Chris Messina on Flickr I decided to write my Pagination 101; hopefully it'll give you some clues as to what makes good pagination.

Before going into analyzing good and bad pagination, I want to explain just what I consider to be pagination: Pagination is any kind of control system that lets the user browse through pages of search results, archives, or any other kind of continued content. Search results are the o

@duydo
duydo / tree.md
Last active August 29, 2015 14:06 — forked from hrldcpr/tree.md

One-line Tree in Python

Using Python's built-in defaultdict we can easily define a tree data structure:

def tree(): return defaultdict(tree)

That's it!
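The one-liner above, with its import and a quick demo: every missing key silently becomes a new subtree, so nested paths spring into existence on assignment.

```python
import json
from collections import defaultdict

def tree():
    return defaultdict(tree)

# Missing keys auto-create subtrees, so deep paths need no setup.
users = tree()
users['harold']['username'] = 'hrldcpr'
users['handler']['username'] = 'other'

# defaultdict is a dict subclass, so it serializes like a plain dict.
serialized = json.dumps(users, sort_keys=True)
```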

@duydo
duydo / crawler.py
Last active August 29, 2015 14:06 — forked from jmoiron/crawler.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Simple async crawler/callback queue based on gevent."""
import traceback
import logging
import httplib2
import gevent
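The crawler gist is cut off after its imports. Below is a sketch of the same idea, an async crawler that invokes a callback per fetched URL, using the stdlib's `ThreadPoolExecutor` instead of gevent/httplib2 (which may not be installed). `fetch()` is a hypothetical stand-in for a real HTTP GET.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Hypothetical fetch: a real crawler would issue an HTTP request here.
    return 'body of %s' % url

def crawl(urls, callback, workers=4):
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Fetches run concurrently; callbacks fire as results are collected.
        futures = {pool.submit(fetch, u): u for u in urls}
        for future, url in futures.items():
            body = future.result()   # re-raises any fetch error
            callback(url, body)
            results[url] = body
    return results

pages = crawl(['http://a.example', 'http://b.example'],
              lambda url, body: None)
```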