@jorendorff
Created July 22, 2016 03:18
$ python3 search.py
> ice cream
Saturday Night Live
San Miguel Beermen
Android (operating system)
Durham, North Carolina
Pharrell Williams
New England
Guanajuato
Lafayette, Louisiana
Ithaca, New York
East Lansing, Michigan
Dutch language
Duluth, Minnesota
> Darth Vader
Star Wars
John Williams
Dick Cheney
Suzuki
PlayStation Portable
New York Stock Exchange
Monday Night Football
Minnesota Twins
Garth Brooks
Film score
Disneyland
Bugs Bunny
> wicked stepmother
Bette Davis
> search engine
Google
Android (operating system)
Yahoo!
Firefox
Wikipedia
Windows Phone
MSN
Jimmy Wales
Internet Explorer
Fair use
Talmud
Irish Examiner
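#!/usr/bin/env python3
# search.py - the dumbest search engine, lol
# To get the data you'd have to use with this:
#     wget https://www.dropbox.com/s/lv44vyl8ia46llx/sample.tar.bz2
#     tar xvjf sample.tar.bz2

import os

DIR = "./sample"
# Every file in DIR is one document; its first line is read as the title.
files = [os.path.join(DIR, filename) for filename in os.listdir(DIR)]


def handle_query(query):
    """Print the titles of up to 12 documents that best match the query."""
    query = query.lower()
    results = []
    for doc in files:
        with open(doc) as f:
            title = f.readline().strip()
            body = f.read().lower()
        # A title match is worth 2 points; each occurrence in the body is worth 1.
        score = 0
        if query in title.lower():
            score += 2
        score += body.count(query)
        if score > 0:
            results.append((score, title))
    # Highest score first; ties are broken by title, in descending order.
    results.sort(reverse=True)
    del results[12:]
    for score, title in results:
        print(title)


while True:
    try:
        query = input("> ").strip()
    except EOFError:
        break  # exit cleanly on Ctrl-D / end of input
    if query:
        handle_query(query)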
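
The scoring rule in search.py can be tried without downloading the sample data. The following is a minimal sketch, not part of the original script: `score_document`, `rank`, and the three documents are hypothetical names and made-up data introduced only for illustration. It assumes the same rule as above: 2 points for a case-insensitive title match, 1 point per occurrence in the body, and at most 12 results sorted by descending (score, title).

```python
def score_document(query, title, body):
    """Score one document: +2 for a title hit, +1 per occurrence in the body."""
    query = query.lower()
    score = 2 if query in title.lower() else 0
    score += body.lower().count(query)  # counts non-overlapping occurrences
    return score


def rank(query, docs, limit=12):
    """Return up to `limit` titles, best score first (ties: title, descending)."""
    scored = [(score_document(query, title, body), title) for title, body in docs]
    results = sorted((pair for pair in scored if pair[0] > 0), reverse=True)
    return [title for _, title in results[:limit]]


# Made-up documents for illustration only; not taken from the sample data.
docs = [
    ("Ice cream", "Ice cream is a frozen dessert. Ice cream is sold in cones."),
    ("Dessert", "Cake, pie, and ice cream are desserts."),
    ("Bicycle", "A bicycle has two wheels."),
]

print(rank("ice cream", docs))
```

Because the score is a plain substring count, long documents that mention a phrase often tend to outrank short, more relevant ones, which may be why broad articles dominate the sample session above.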