Created July 22, 2016 03:18
$ python3 search.py
> ice cream
Saturday Night Live
San Miguel Beermen
Android (operating system)
Durham, North Carolina
Pharrell Williams
New England
Guanajuato
Lafayette, Louisiana
Ithaca, New York
East Lansing, Michigan
Dutch language
Duluth, Minnesota
> Darth Vader
Star Wars
John Williams
Dick Cheney
Suzuki
PlayStation Portable
New York Stock Exchange
Monday Night Football
Minnesota Twins
Garth Brooks
Film score
Disneyland
Bugs Bunny
> wicked stepmother
Bette Davis
> search engine
Android (operating system)
Yahoo!
Firefox
Wikipedia
Windows Phone
MSN
Jimmy Wales
Internet Explorer
Fair use
Talmud
Irish Examiner

#!/usr/bin/env python3
# search.py - the dumbest search engine, lol
#
# To get the data to use with this:
#   wget https://www.dropbox.com/s/lv44vyl8ia46llx/sample.tar.bz2
#   tar xvjf sample.tar.bz2

import os

DIR = "./sample"
files = [os.path.join(DIR, filename) for filename in os.listdir(DIR)]


def handle_query(query):
    """Print the titles of the (at most) 12 best-scoring documents."""
    query = query.lower()
    results = []
    for doc in files:
        # The first line of each file is read as the title; the rest is the body.
        with open(doc) as f:
            title = f.readline().strip()
            body = f.read().lower()
        score = 0
        if query in title.lower():
            score += 2  # a title match is worth two body matches
        score += body.count(query)
        if score > 0:
            results.append((score, title))
    # Highest score first; ties come out in reverse title order.
    results.sort(reverse=True)
    del results[12:]
    for score, title in results:
        print(title)


while True:
    query = input("> ").strip()
    if query:
        handle_query(query)
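
The scoring rule in handle_query can be tried without downloading the sample data. The sketch below restates it as pure functions over in-memory (title, body) pairs. The names `score_document` and `rank`, the `limit` parameter, and the three tiny documents are made up for illustration and are not part of the gist.

```python
def score_document(query, title, body):
    """Score one document the way search.py does: 2 points for a
    case-insensitive title match, plus 1 per occurrence in the body."""
    query = query.lower()
    score = 0
    if query in title.lower():
        score += 2
    score += body.lower().count(query)
    return score


def rank(query, docs, limit=12):
    """Return up to `limit` titles, best score first.

    `docs` is an iterable of (title, body) pairs, a hypothetical stand-in
    for the files under ./sample. Ties sort in reverse title order, which
    is what results.sort(reverse=True) does with (score, title) tuples.
    """
    results = [(score_document(query, title, body), title) for title, body in docs]
    results = [r for r in results if r[0] > 0]
    results.sort(reverse=True)
    return [title for _, title in results[:limit]]


# Made-up documents, for illustration only.
DOCS = [
    ("Ice cream", "Ice cream is a frozen dessert. Ice cream is sweet."),
    ("Dessert", "Many desserts are served with ice cream."),
    ("Bread", "Bread is baked from flour and water."),
]

if __name__ == "__main__":
    print(rank("ice cream", DOCS))
```

Because `str.count` counts substrings, the query "dessert" also matches inside "desserts", so neither the sketch nor the original script has any notion of word boundaries.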