@t-ob
t-ob / atom.scm
Created January 28, 2012 14:19
Atom?
(define atom?
  (lambda (x)
    (and (not (pair? x)) (not (null? x)))))
t-ob / gcd.clj
Created May 18, 2012 15:47
Recursively compute the gcd of two numbers
(fn gcd [a b]
  (if (zero? b)
    a
    (recur b (rem a b))))
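For comparison, the same Euclidean recursion can be sketched in Python. This is an illustration only and not part of the gist: the name `gcd_iter` is hypothetical, inputs are assumed to be non-negative integers, and the loop plays the role that `recur` plays in the Clojure version (rebinding `a` and `b` without growing the stack).

```python
def gcd_iter(a: int, b: int) -> int:
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero."""
    # Assumption: a and b are non-negative, so Python's % agrees with Clojure's rem.
    while b != 0:
        a, b = b, a % b
    return a
```

Calling `gcd_iter(48, 18)` walks through the pairs (48, 18), (18, 12), (12, 6), (6, 0) and returns 6.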
t-ob / factors.clj
Created May 21, 2012 18:33
Number of factors of (n!)^2
;; Exponent of the prime p in n! (Legendre's formula):
;; the sum of floor(n / p^i) over all i >= 1 with p^i <= n.
(defn val-p-fact [n p]
  (letfn [(prime-powers [n p i]
            (lazy-seq
              (when (<= (Math/pow p i) n)
                (cons (Math/floor (/ n (Math/pow p i)))
                      (prime-powers n p (inc i))))))]
    (reduce + (prime-powers n p 1))))
(defn interview-street-equations [n]
  (int (reduce #(mod (* %1 %2) 1000007)
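The preview of `interview-street-equations` is cut off, so what follows is only a guess at the overall computation, sketched in Python. It assumes the standard identity that if n! = ∏ p^e_p then (n!)^2 has ∏ (2·e_p + 1) divisors, with each e_p given by the same sum that `val-p-fact` computes. The helper names `legendre_exponent`, `primes_up_to`, and `divisors_of_factorial_squared` are hypothetical; the modulus 1000007 is taken from the truncated snippet.

```python
MODULUS = 1000007  # taken from the truncated Clojure snippet


def legendre_exponent(n, p):
    """Exponent of the prime p in n!: sum of n // p**i for i >= 1."""
    total = 0
    power = p
    while power <= n:
        total += n // power  # integer division avoids floating-point error
        power *= p
    return total


def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if sieve[i]:
            primes.append(i)
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return primes


def divisors_of_factorial_squared(n):
    """Number of divisors of (n!)^2, reduced modulo MODULUS (assumed intent)."""
    result = 1
    for p in primes_up_to(n):
        result = (result * (2 * legendre_exponent(n, p) + 1)) % MODULUS
    return result
```

For n = 4, 4! = 24 = 2^3 · 3, so (4!)^2 = 2^6 · 3^2 has (6 + 1)(2 + 1) = 21 divisors, which is what `divisors_of_factorial_squared(4)` returns. Using integer division instead of `Math/pow` and `Math/floor` is a deliberate choice in this sketch to keep the arithmetic exact for large n.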
t-ob / tour.clj
Created July 11, 2012 23:47
Graph tour
(defn graph-tour [edges]
  (let [vertices (reduce #(into %1 %2) #{} edges)
        degree (fn [vertex] (count (filter #(some #{vertex} %) edges)))]
    (<= (count (filter odd? (map degree vertices))) 2)))
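`graph-tour` tests only the degree condition: at most two vertices of odd degree. A trail that uses every edge exactly once also requires all edges to lie in one connected component, which the snippet does not check. A minimal Python sketch that adds that check is shown here. The name `has_eulerian_trail` is hypothetical, edges are assumed to be undirected pairs, and returning `False` for an empty edge list is an assumption rather than anything the gist specifies.

```python
from collections import defaultdict


def has_eulerian_trail(edges):
    """True if the undirected multigraph has a trail using every edge exactly once."""
    if not edges:
        return False  # assumption: no edges means no tour

    adjacency = defaultdict(list)
    degree = defaultdict(int)
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
        degree[u] += 1
        degree[v] += 1  # a self-loop therefore contributes 2 to its vertex

    # Depth-first search from any endpoint to test connectivity.
    start = edges[0][0]
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    if len(seen) != len(adjacency):
        return False  # some edge is unreachable from the start vertex

    # Same degree condition as graph-tour: at most two odd-degree vertices.
    odd_count = sum(1 for d in degree.values() if d % 2 == 1)
    return odd_count <= 2
```

The two disjoint edges `[(1, 2), (3, 4)]` show the difference: all four vertices have odd degree, so both checks reject that input, but two disjoint self-loops `[(1, 1), (2, 2)]` have no odd-degree vertices under this degree count and are rejected here only because of the connectivity test.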
t-ob / joins.clj
Created July 27, 2012 15:46
Cascalog joins
(def short-urls
  [["http://t.co/yERArQn0"]
   ["http://t.co/gI8TjreI"]
   ["http://t.co/CBsucpNm"]
   ["http://t.co/F74GG1oN"]
   ["http://t.co/hyoXObbU"]])
(def longified-urls
  [["http://t.co/yERArQn0" "http://news.sky.com/story/964624/olympic-lanes-open-traffic-delays-in-london"]
   ["http://t.co/CBsucpNm" "http://media.tumblr.com/tumblr_m65uu5UsrQ1r7h1lt.gif"]
t-ob / multiple-columns.clj
Created July 27, 2012 15:53
Multiple column families in a tap
(defn tweets
  []
  (let [scheme (HBaseScheme. (Fields. (into-array String ["id"]))
                             (into-array String ["base" "raw"])
                             (into-array [(Fields. (into-array String ["tweet_id"
                                                                       "screen_name"
                                                                       "content"
                                                                       "created_at"
                                                                       "urls"]))
                                          (Fields. (into-array String ["topsy_url"]))]))
t-ob / first-n.clj
Created July 30, 2012 15:24
first-n
(defn limited-output
  [keywords-str limit]
  (let [tweets-tap (tweets-new)
        urls-tap (short-urls)
        jdbc-sink db/jdbc-tap]
    (?- (stdout)
        (c/first-n
         (<- [?args]
             ;; query
             )
         limit))))
t-ob / csv.sh
Created July 31, 2012 14:45
csv script
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%T")
USERS10="users-10.$NOW.csv"
EVENTS10="events-10.$NOW.csv"
USERS20="users-20.$NOW.csv"
EVENTS20="events-20.$NOW.csv"
echo "$USERS10"
t-ob / hbase.txt
Created July 31, 2012 17:15
hbase
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes
scan 'tweets', { COLUMNS => 'base:content', FILTER => SingleColumnValueFilter.new(Bytes.toBytes('base'),Bytes.toBytes('content'), CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new('burn')) }
t-ob / bad.sql
Created August 1, 2012 17:04
bad tweets
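-- tweet_ids that occur exactly twice, with the extremes of screen_name and created_at;
-- content is wrapped in max() because it is not part of the group by clause
select count(*), tweet_id, max(content), max(screen_name), max(created_at), min(screen_name), min(created_at)
from tweets_with_id
group by tweet_id
having count(*) = 2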