@wgpshashank
Created September 16, 2011 23:26
Snippets of code I've written recently for various programming challenges
'''Problem: A 7-digit number consists of 7 distinct digits (from 0..9, of course). The product of the first 3 digits = the product of the central 3 digits = the product of the last 3 digits. Find the middle digit.'''
'''Used constraint satisfaction to solve the problem - requires http://labix.org/download/python-constraint/python-constraint-1.1.tar.bz2'''
from constraint import *

def main():
    """
    My explanation: first, name the digits a through g, so we need abc = cde = efg.
    First, rule out 0: any triplet containing it would have product 0, and 0 can be shared
    by at most two of the three overlapping triplets (only c and e are shared), so the
    remaining triplet could never match. So it's out.
    Next, rule out 5 and 7: they are prime, and no other single-digit number is a multiple
    of either, so a triplet containing one could never be balanced by a triplet without it.
    The remaining digits are 1, 2, 3, 4, 6, 8, 9.
    To determine the sequence, list the prime factors of those digits:
    2 = 2
    3 = 3
    4 = 2 * 2
    6 = 2 * 3
    8 = 2 * 2 * 2
    9 = 3 * 3
    So we have seven 2's and four 3's in total. Let's go through the candidates for the middle digit:
    1: if 1 is the middle digit, the seven 2's must be split between abc and efg. Since seven is odd,
    we can't have 1 as the middle digit.
    3, 6, 9: with 3 or 6 in the middle, three 3's remain in the other digits, which can't be split
    evenly between the left and right groups. With 9 in the middle, only two 3's remain, which can
    balance only one outer group against the middle group. So 3, 6 and 9 are not suitable.
    4, 8: with 4 in the middle, five 2's remain in the other digits, which can't be split evenly.
    With 8 in the middle, only four 2's remain, not enough to balance both outer groups against
    the middle group. So 4 and 8 are not suitable.
    Equivalently: since abc = efg, (abcdefg)/d must be a perfect square. 1*2*3*4*6*8*9 has seven 2's
    and four 3's, so d can only be 2 or 8.
    If d = 8, the digits 1, 2, 3, 4, 6, 9 would need 1*4*9 = 2*3*6 = 36, but 36 is not divisible by 8.
    If d = 2, the digits 1, 3, 4, 6, 8, 9 split as 1*8*9 = 3*4*6 = 72, and with c = 9, e = 4 the
    middle product is 9*2*4 = 72.
    With 2 as the middle digit we are left with six 2's and four 3's, which can be distributed evenly
    between the left and right groups: three 2's and two 3's each, giving 2 * 2 * 2 * 3 * 3 = 72 per
    group, as found above.
    We already have 2 as the middle digit, so for the middle group to make 72 its other two digits
    must multiply to 36, and the only possibility is 4 * 9.
    So the other two digits of the middle group are 4 and 9.
    Say 4 is shared with the left group and 9 with the right group, so the middle group reads "4, 2, 9".
    Now for the left group to make 72, its other two digits must multiply to 18, and with 1, 3, 6 and 8
    left, the only possible answer is 3 and 6.
    The digits 8 and 1 then join the right group with 9, making 72 as the product. And thus the sequence must be
    3 6 4 2 9 8 1
    Ultimately, it's a matter of constraint satisfaction: abc = cde implies a*b = d*e, and
    cde = efg implies c*d = f*g. I'll use a constraint solver to find our solution :-)
    """
    problem = Problem()
    problem.addVariables(["a", "b", "c", "d", "e", "f", "g"], [1, 2, 3, 4, 5, 6, 7, 8, 9])
    problem.addConstraint(AllDifferentConstraint())
    problem.addConstraint(lambda c, d, f, g: c * d == f * g, ("c", "d", "f", "g"))
    problem.addConstraint(lambda a, b, d, e: a * b == d * e, ("a", "b", "d", "e"))
    solutions = problem.getSolutions()
    print solutions

if __name__ == "__main__":
    main()
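As a quick sanity check on the derivation above (my addition, not part of the original gist), the search space is small enough to brute-force: enumerate all 7-permutations of the digits 0-9 and keep those whose three overlapping triplet products agree. This confirms the middle digit is always 2.

```python
from itertools import permutations

# Brute-force check: positions 1-3, 3-5 and 5-7 of the 7-digit number
# share digits c (index 2) and e (index 4), as in the derivation.
solutions = [p for p in permutations(range(10), 7)
             if p[0] * p[1] * p[2] == p[2] * p[3] * p[4] == p[4] * p[5] * p[6]]
middle_digits = set(p[3] for p in solutions)
```

Every surviving permutation avoids 0, 5 and 7 and has 2 in the middle, matching the argument; 3 6 4 2 9 8 1 is one of them.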
# Problem: Given N, arrange all numbers from 1 to N such that the average of any
# two numbers in the list does not lie between them.
# If this doesn't run: "gem install memoize" (add sudo in front if you don't use RVM to manage your gems)
require 'rubygems'
require 'memoize'
require 'benchmark'
include Memoize

def good_list(n)
  list = (1..n).to_a
  list_even = []; list_odd = []
  list.each do |i| # sift them into two lists
    (i % 2 == 0) ? list_even << i : list_odd << i
  end
  Benchmark.bm(15) do |b|
    #b.report("Regular processor:") { processor(list_even, list_odd) }
    p "***************************************************"
    b.report("Memoized processor:") { memoize(:processor); processor(list_even, list_odd) }
  end
  #memoize(:processor)
  #r = processor(list_even, list_odd)
  #p r
end

def processor(even, odd)
  result = []
  if even.length <= 3 && odd.length <= 3 # base case: both lists have length <= 3
    if even.length == 3
      result << even[0] << even[2] << even[1]
    else
      result += even
    end
    if odd.length == 3
      result << odd[0] << odd[2] << odd[1]
    else
      result += odd
    end
    return result
  end
  e = even.collect { |i| i / 2 } # divide each even element by 2
  ne, no = list_splitter(e) # split the halved list into even and odd again; ne - new even; no - new odd sublist
  even_results = processor(ne, no) # recursively split the problem until we hit the base case, then get the result and * 2
  result += even_results.collect { |i| i * 2 }
  o = odd.collect { |i| (i - 1) / 2 + 1 } # convert odd elements into k s.t. 2*k - 1 gives the original element back
  ne, no = list_splitter(o)
  result += processor(ne, no).collect { |i| i * 2 - 1 }
  return result
end

def list_splitter(list)
  list_even_temp = []; list_odd_temp = []
  (0..list.length - 1).each do |i|
    (list[i] % 2 == 0) ? list_even_temp << list[i] : list_odd_temp << list[i]
  end
  return list_even_temp, list_odd_temp
end

good_list(10000000)
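The same divide-and-conquer idea can be sketched compactly in Python (my addition; the names `good_order` and `no_average_between` are hypothetical, not from the gist). The key observation is that an even/odd pair can never have an integer average, so placing all evens before all odds eliminates cross-pair conflicts; halving the evens and shifting the odds reduces each half to the same problem on 1..N/2.

```python
def good_order(nums):
    # Order 1..N so that no pair's average sits between the pair in the list.
    # Evens map to a half-size subproblem via x/2; odds via (x+1)/2.
    if len(nums) <= 1:
        return nums
    evens = [x for x in nums if x % 2 == 0]
    odds = [x for x in nums if x % 2 == 1]
    return ([2 * k for k in good_order([x // 2 for x in evens])] +
            [2 * k - 1 for k in good_order([(x + 1) // 2 for x in odds])])

def no_average_between(order):
    # Brute-force validity check: for every pair with an integer average,
    # that average must not lie strictly between the pair's positions.
    pos = {v: i for i, v in enumerate(order)}
    for a in order:
        for b in order:
            if (a + b) % 2 == 0:
                m = (a + b) // 2
                if m in pos and m != a and m != b:
                    lo, hi = sorted((pos[a], pos[b]))
                    if lo < pos[m] < hi:
                        return False
    return True
```

For example, `good_order(list(range(1, 5)))` yields `[4, 2, 3, 1]`, which the checker accepts.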
import os, sys, random

'''Problem: Given a list of N line segments on the X axis, (x1[i], x2[i]) where i = 1 to N, and then a list of M queries. Each query inputs one x-coordinate and you need to output the set of all {i} that overlap this point.
example:
N = 2
3 6
1 5
M = 2
query: 3
answer: 1,2
query: 6
answer: 1
My Solution:
I used an AVL tree for cheap insertion, self-balancing and search in O(log n).
This AVL tree is augmented to keep track of the maximum right endpoint found in a node's subtree.
Each node contains an array of details - [x1, x2, maximum right endpoint found in the subtree (max_right)] - plus a parent pointer.
Each node is inserted by comparing x1. After a node is inserted as a leaf, its max_right is set initially to x2 and is then propagated up to the root in O(height of tree).
Querying is pretty easy and cheap. We first check whether q is contained in the root r - if yes, count it.
If not, we check r.left_bound >= q; if so, we go left and recursively do the same trick until we find q. If r.left_bound <= q, we have two cases:
case 1: if r.left.max >= q, recursively go left.
case 2: if r.left.max <= q, recursively go right.
Querying should take O(log n) as it's simply a matter of binary search :-)
'''
found = 0

class Node:
    def __init__(self, data):
        self.data = data
        self.parent = None
        self.set_childs(None, None)

    def set_childs(self, left, right):
        self.left = left
        if self.left:
            self.left.parent = self
        self.right = right
        if self.right:
            self.right.parent = self

    def findMax(self, i):  # if i, return the max of the children's max_right values
        #if self.parent:
        #    print self.data, "||", self.parent.data
        if self.left == None and self.right == None:
            self.data[2] = self.data[1]
            if i:
                return self.data[1]
        elif self.left and self.right:
            self.data[2] = max(self.data[2], max(self.left.data[2], self.right.data[2]))
            if i:
                return max(self.left.data[2], self.right.data[2])
        elif self.left and self.right == None:
            self.data[2] = max(self.data[2], self.left.data[2])
            if i:
                return self.left.data[2]
        elif self.right and self.left == None:
            self.data[2] = max(self.data[2], self.right.data[2])
            if i:
                return self.right.data[2]
        '''if self.parent != None:
            self.parent.findMax()
        if self.parent == None:
            if self.left: self.left.findMax()
            if self.right: self.right.findMax()'''

    def balance(self):
        lheight = 0
        if self.left:
            lheight = self.left.height()
        rheight = 0
        if self.right:
            rheight = self.right.height()
        return lheight - rheight

    def height(self):
        lheight = 0
        if self.left:
            lheight = self.left.height()
        rheight = 0
        if self.right:
            rheight = self.right.height()
        return 1 + max(lheight, rheight)

    def rotate_left(self):
        self.data, self.right.data = self.right.data, self.data
        old_left = self.left
        self.set_childs(self.right, self.right.right)
        self.left.set_childs(old_left, self.left.left)
        self.left.data[2] = self.left.findMax(1)

    def rotate_right(self):
        self.data, self.left.data = self.left.data, self.data
        old_right = self.right
        self.set_childs(self.left.left, self.left)
        self.right.set_childs(self.right.right, old_right)
        if self.parent != None:
            self.right.data[2] = self.right.findMax(1)

    def rotate_left_right(self):
        self.left.rotate_left()
        self.rotate_right()

    def rotate_right_left(self):
        self.right.rotate_right()
        self.rotate_left()

    def do_balance(self):
        #print "balancing", self.data
        bal = self.balance()
        if bal > 1:
            if self.left.balance() > 0:
                self.rotate_right()
            else:
                self.rotate_left_right()
        elif bal < -1:
            if self.right.balance() < 0:
                self.rotate_left()
            else:
                self.rotate_right_left()

    def insert(self, data):
        if data <= self.data:
            if not self.left:
                self.left = Node(data)
                self.left.parent = self
            else:
                self.left.insert(data)
        else:
            if not self.right:
                self.right = Node(data)
                self.right.parent = self
            else:
                self.right.insert(data)
        self.do_balance()
        self.findMax(None)

    def print_tree(self, indent=0):
        print " " * indent + str(self.data)
        #if self.parent: print self.parent
        if self.left:
            self.left.print_tree(indent + 2)
        if self.right:
            self.right.print_tree(indent + 2)

    def query(self, q, counter):
        global found
        if q > self.data[2]:  # q is beyond every right endpoint in this subtree
            return
        if self.left != None:
            self.left.query(q, counter)
        if q >= self.data[0] and q <= self.data[1]:
            found += 1
        if q <= self.data[0]:  # every segment to the right starts after q
            return
        if self.right != None:
            self.right.query(q, counter)

if __name__ == "__main__":
    input_file = os.getcwd() + '/input'
    file = open(input_file)
    n = file.readline().split()[2]  # get n
    triplets = file.readline().split()
    triplets.append(triplets[1])  # 3rd element in triplets is the maximum right endpoint
    tree = Node(map(lambda x: int(x), triplets))  # initialize the root
    for line in range(1, int(n)):
        triplets = file.readline().split()
        triplets.append(triplets[1])
        tree.insert(map(lambda x: int(x), triplets))
    a = file.readline()  # for the blank line
    m = file.readline().split()[2]
    q = []  # queries
    for line in range(0, int(m)):
        q.append(int(file.readline()))
    tree.print_tree()
    for i in q:
        tree.query(i, 0)
        print found
        found = 0
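Augmented-tree bookkeeping is easy to get wrong, so a trivial O(N)-per-query reference (my addition, not from the gist) is handy for cross-checking the tree's answers against small inputs like the example above.

```python
def stab(segments, q):
    # Return the 1-based indices of all segments (x1, x2) that contain q.
    # Linear scan: the brute-force answer the augmented tree should reproduce.
    return [i + 1 for i, (x1, x2) in enumerate(segments) if x1 <= q <= x2]
```

On the sample input, `stab([(3, 6), (1, 5)], 3)` returns `[1, 2]` and `stab([(3, 6), (1, 5)], 6)` returns `[1]`, matching the expected answers.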
'''similar.py by Samrat Jeyaprakash - better to read from __main__ first
Takes a list of Twitter users and finds similarities amongst them. Just a fun little script using a library from the book "Collective Intelligence".
I haven't had the time to work on it - but this is a good start if you are looking for a fun way to do similarity vectors.
This implementation uses Tweepy, Redis & the Pearson correlation similarity library from the Collective Intelligence book.
To run, start redis on the default port.
Iteration 1 - uses two filters (Pearson similarity coefficient & "following" users intersection) to find similarity between two users
TODO for Iteration 2 - add clustering and change feed_data to .json for Redis caching. Also implement finding screen_name and add redis.ZADD
Iteration 3 - find independent features a la central themes using non-negative matrix factorization and some Bayesian classification
Iteration 4 - implement term frequency/inverse document frequency for PageRank, and threads - actually use Twisted for asynchronous calls
Iteration 5 - use neural networks for better link tracking
'''
from recommendations import sim_pearson, sim_distance, topMatches
import feedparser, re, urllib, threading, redis, tweepy, json, time

def getUserMessages(user):
    #get the RSS feed of a user to extract keywords from their tweets
    url = "http://twitter.com/statuses/user_timeline/" + user + ".rss?count=200"  #change to .json in Iter#2
    feed_data = feedparser.parse(url)
    #print feed_data
    return feed_data.get("entries", [])

def getKeywordScores(user, messages):
    #this method counts the words from the user's tweets and keeps their frequency if > 1
    keywords = {}
    stopwords = ["a", "an", "by", "on", "that", "the", "these", "this", "those", "to"]
    # and many more words... for iteration 2, consider using Zipf's law's blacklisted words
    #consider using http://www.webconfs.com/stop-words.php - also need to consider bad language
    stopwords.append(user)
    for message in messages:
        tweet = message["summary"]
        words = re.split(" ", tweet)
        for word in words:  #stopword filtering loop
            word = re.sub("^\W*", "", word)
            word = re.sub("\W*$", "", word)
            if word.startswith("http://"):
                continue
            word = word.lower()
            if word in stopwords:
                continue
            if not word:
                continue
            count = keywords.get(word, 0)
            keywords[word] = count + 1
    final_keywords = {}  #keep only keywords with > 1 appearance
    for k in keywords:
        if keywords[k] > 1:
            final_keywords[k] = keywords[k]
    return final_keywords

def processor(first_user, user):  #thread helper method... don't use until iteration #4; threads for managing RSS calls
    messages = getUserMessages(user=user)
    user_keywords[user] = getKeywordScores(user, messages)
    print sim_pearson(user_keywords, first_user, user)
    return

def twitter_helper(screen_name=0):
    auth = tweepy.BasicAuthHandler("YOUR_USERNAME", "YOUR_PASSWORD")
    api = tweepy.API(auth)
    print '---------------------------------------------'
    for user in users:
        print "Getting", user, "'s friends now"
        friends = api.friends_ids(user)
        if screen_name == 1:  #this is expensive to do using the current calls
            following = []
            for friend in friends:
                url = "http://twitter.com/statuses/user_timeline/" + str(friend) + ".json"
                #print url, friend  #debug helper - really need to use the python debug logger
                result = json.load(urllib.urlopen(url))
                #print result
                following.append(result[0]['user']['screen_name'])
                time.sleep(1)  #did this to fool twitter, but still think there might be some sort of exponential backoff on twitter's side
            map(lambda m: r.sadd(user, m), following)  #add screen_names to Redis key => user
        else:
            map(lambda m: r.sadd(user, m), friends)  #add user IDs to Redis key => user

def user_intersection(users):
    for i in range(len(users) - 1):
        print users[i], "'s common following with "
        for j in range(i + 1, len(users)):
            print users[j],
            print r.sinter([users[i], users[j]]), "\n"
        print "___________________________________"

def pearsonScore(users):
    #print users
    user_keywords = {}
    #threads = []  #maybe use it for thread tracking later...
    s = []
    for user in users:
        print "processing data for:", user
        messages = getUserMessages(user=user)
        '''t = threading.Thread(target=processor, args=(user[0], user))  #could use some threads for speeding up http calls tremendously
        threads.append(t)
        t.start()
        print s'''
        user_keywords[user] = getKeywordScores(user=user, messages=messages)
    print "==========================================================================="
    for user in users:
        print "Top matches for ", user, ":"
        print topMatches(user_keywords, user, n=len(users), similarity=sim_pearson)

if __name__ == "__main__":
    users = ['aplusk', 'justinbieber', 'billgates', 'ladygaga', 'samratjp', 'ev']
    r = redis.Redis(host='localhost', port=6379, db=0)
    pearsonScore(users)
    '''twitter_helper() #this calls twitter for a user's "following" - i.e. the IDs of the people they are following;
    this kinda sucks, need to find screen_names, but haven't found a call for that yet. So, the expensive solution is to fetch twitter.com/../user_id.json and then find screen_name -
    that sucks too, because twitter will blacklist you so fast! But the intersection can be done with IDs anyways. In any case, will implement user_name for Iter#2,
    just need to find other ways. Even the RESTful methods didn't have any.'''
    twitter_helper(screen_name=0)  #because of the blacklisting possibility, use 0 for now; 1 can get up to 24 results before twitter gets clever
    user_intersection(users)
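The script leans on `sim_pearson` from the Collective Intelligence book's `recommendations` module. For readers without that code, here is a rough standalone equivalent over keyword-count dicts (my sketch of the technique, with an assumed interface; it is not a drop-in replacement for the book's function).

```python
from math import sqrt

def pearson(v1, v2):
    # Pearson correlation restricted to the keywords both users share.
    shared = [k for k in v1 if k in v2]
    n = len(shared)
    if n == 0:
        return 0.0
    s1 = sum(v1[k] for k in shared)            # sums
    s2 = sum(v2[k] for k in shared)
    ss1 = sum(v1[k] ** 2 for k in shared)      # sums of squares
    ss2 = sum(v2[k] ** 2 for k in shared)
    ps = sum(v1[k] * v2[k] for k in shared)    # sum of products
    num = ps - s1 * s2 / float(n)
    den = sqrt((ss1 - s1 ** 2 / float(n)) * (ss2 - s2 ** 2 / float(n)))
    return num / den if den else 0.0
```

Two users whose shared keyword counts scale linearly score 1.0; users with no shared keywords score 0.0.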