import io
import avro.schema
import avro.io
import lipsum
import random
from kafka.client import KafkaClient
from kafka.producer import SimpleProducer, KeyedProducer
g = lipsum.Generator()
worker_processes 2;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
use epoll;
}
[unix_http_server]
file=/tmp/supervisor.sock ; path to your socket file
[supervisord]
logfile=/var/log/supervisord/supervisord.log ; supervisord log file
logfile_maxbytes=50MB ; maximum size of logfile before rotation
logfile_backups=10 ; number of backed up logfiles
loglevel=error ; info, debug, warn, trace
pidfile=/var/run/supervisord.pid ; pidfile location
nodaemon=false ; run supervisord as a daemon
@qxj
qxj / lda_gibbs.py
Last active August 29, 2015 14:19 — forked from mblondel/lda_gibbs.py
"""
(C) Mathieu Blondel - 2010
License: BSD 3 clause
Implementation of the collapsed Gibbs sampler for
Latent Dirichlet Allocation, as described in
Finding scientific topics (Griffiths and Steyvers)
"""
@qxj
qxj / kafka.md
Last active August 29, 2015 14:20 — forked from ashrithr/kafka.md

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read those messages and apply the changes to their own stores at a pace appropriate to each system.

Terminology:

  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
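
To make the terminology concrete, here is a minimal in-memory sketch of the log model described above. It is not the Kafka API: `ToyBroker`, `ToyConsumer`, and the topic name `clicks` are hypothetical names invented for illustration, and a real broker persists its log to disk rather than to a Python list. The sketch only shows that a topic behaves like an append-only log and that each consumer tracks its own read position, so different consumers can read at different speeds.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical stand-in for a broker: one append-only list per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, message):
        """Producer side: append a message and return its offset in the log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset):
        """Consumer side: return every message at or after `offset`."""
        return self._logs[topic][offset:]


class ToyConsumer:
    """Hypothetical consumer that remembers how far it has read."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Fetch unread messages and advance this consumer's own offset."""
        messages = self.broker.read(self.topic, self.offset)
        self.offset += len(messages)
        return messages


broker = ToyBroker()
broker.append("clicks", "a")
broker.append("clicks", "b")

fast = ToyConsumer(broker, "clicks")
slow = ToyConsumer(broker, "clicks")

first_batch = fast.poll()    # fast reads what is there so far
broker.append("clicks", "c")
second_batch = fast.poll()   # fast sees only the new message
slow_batch = slow.poll()     # slow catches up on the whole log at once
```

Because the broker never removes a message when it is read, the slow consumer loses nothing by polling late; it simply starts from its own offset.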
@qxj
qxj / crontab.sh
Created May 16, 2015 02:04
Collect *.cron in the directory, and then APPEND to the original crontab
#!/usr/bin/env bash
# @(#) crontab.sh Time-stamp: <Julian Qian 2015-05-15 18:14:28>
# Copyright 2015 Julian Qian
# Author: Julian Qian <junist@gmail.com>
# Version: $Id: crontab.sh,v 0.1 2015-05-14 10:53:03 jqian Exp $
#
# Collect *.cron in the directory, and then APPEND to the original crontab
# TODO fix potential conflicts when more than one crontab.sh instance is running concurrently.
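
The rest of the script is not shown in this preview, so the following is only a sketch of the collect-and-append idea from the description, written in Python rather than bash. The function names `collect_cron_fragments` and `append_to_crontab` are hypothetical and do not come from the original script. The sketch does not install anything and does not address the concurrency TODO; it only builds the merged crontab text.

```python
from pathlib import Path


def collect_cron_fragments(directory):
    """Return the contents of every *.cron file in `directory`, sorted by file name."""
    return [p.read_text() for p in sorted(Path(directory).glob("*.cron"))]


def append_to_crontab(current, fragments):
    """Append each non-empty fragment after the existing crontab text.

    `current` is the existing crontab as a string (possibly empty).
    The result always ends with a single newline unless it is empty.
    """
    parts = [current.rstrip("\n")] if current.strip() else []
    parts.extend(f.strip("\n") for f in fragments if f.strip())
    return ("\n".join(parts) + "\n") if parts else ""
```

Keeping the merge as a pure function of strings makes it easy to check before the result is handed to the `crontab` command.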
@qxj
qxj / hadoop_avro_job.sh
Created June 9, 2015 07:46
If input files are serialized with Avro, deserialize them with org.apache.avro.mapred.AvroAsTextInputFormat in Hadoop streaming.
#!/usr/bin/env bash
# @(#) norm.sh Time-stamp: <Julian Qian 2015-06-09 15:35:35>
# Copyright 2015 Julian Qian
# Author: Julian Qian <junist@gmail.com>
# Version: $Id: norm.sh,v 0.1 2015-06-08 18:03:30 jqian Exp $
#
day=$(date +%Y%m%d -d yesterday)
input=/user/hive/warehouse/query_log/ds=$day/hr=00
@qxj
qxj / MultiLayerExperiment.java
Created August 17, 2015 08:25
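Sample code for layered (overlapping) experiments. The original post is here: http://blog.sina.com.cn/s/blog_e59371cc0102vopg.html , but the code formatting there was garbled, so I cleaned it up.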
// Overlapping Experiment Demo
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.LinkedList;
import java.util.List;
public class MultiLayerExperiment {
private static String byteArrayToHex(byte[] byteArray) {
char[] hexDigits = {'0', '1', '2', '3', '4', '5', '6', '7',
@qxj
qxj / .screenrc
Created December 14, 2011 15:56
My screen settings
# skip the startup message
startup_message off
# encoding
defutf8 on
# go to home dir
# chdir
# Automatically detach on hangup
@qxj
qxj / tumblr-style.css
Created December 14, 2011 16:05
My Tumblr style settings
#Posts h1 {
font-size: 180%;
font-weight: bold;
color: #000;
/* border-bottom: 5px solid #000; */
margin-bottom: 10px;
text-align: center;
}
#Posts h2 {