@ceteri
Created October 14, 2012 19:52
ACM DM - Python exercise
# download ceteri-mapred from GitHub (git clone works; the ZIP download is simplest)
# https://github.com/ceteri/ceteri-mapred
# then cd into your ceteri-mapred download
Pacos-MacBook-Pro:ceteri-mapred ceteri$ ls
README doc graph.gephi src thresh.R
bin graph.csv msgs.tsv stopwords thresh.tsv
Pacos-MacBook-Pro:ceteri-mapred ceteri$ ls src/
map_filter.py map_parse.py map_wc.py red_filter.py red_idf.py red_wc.py util_extract.py util_gephi.py util_walk.py
Pacos-MacBook-Pro:ceteri-mapred ceteri$ head README
## Getting Started on Hadoop
## Paco Nathan <ceteri@gmail.com>
##
## Silicon Valley Cloud Computing Meetup
## http://www.meetup.com/cloudcomputing/calendar/13911740/
## Mountain View, 2010-07-19
GitHub src repo:
http://github.com/ceteri/ceteri-mapred
Pacos-MacBook-Pro:ceteri-mapred ceteri$ head README | python src/map_wc.py
getting 1
started 1
on 1
hadoop 1
paco 1
nathan 1
ceteri 1
gmail 1
com 1
silicon 1
valley 1
cloud 1
computing 1
meetup 1
http 1
www 1
meetup 1
com 1
cloudcomputing 1
calendar 1
13911740 1
mountain 1
view 1
2010-07-19 1
github 1
src 1
repo 1
http 1
github 1
com 1
ceteri 1
ceteri-mapred 1
Pacos-MacBook-Pro:ceteri-mapred ceteri$ head README | python src/map_wc.py | sort
13911740 1
2010-07-19 1
calendar 1
ceteri 1
ceteri 1
ceteri-mapred 1
cloud 1
cloudcomputing 1
com 1
com 1
com 1
computing 1
getting 1
github 1
github 1
gmail 1
hadoop 1
http 1
http 1
meetup 1
meetup 1
mountain 1
nathan 1
on 1
paco 1
repo 1
silicon 1
src 1
started 1
valley 1
view 1
www 1
Pacos-MacBook-Pro:ceteri-mapred ceteri$ head README | python src/map_wc.py | sort | python src/red_wc.py
www 1
computing 1
ceteri 2
calendar 1
ceteri-mapred 1
cloud 1
gmail 1
mountain 1
hadoop 1
meetup 2
valley 1
http 2
on 1
started 1
repo 1
src 1
github 2
nathan 1
cloudcomputing 1
13911740 1
getting 1
silicon 1
2010-07-19 1
paco 1
com 3
view 1
Pacos-MacBook-Pro:ceteri-mapred ceteri$
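
The pipeline above (map, then sort, then reduce) can be sketched in one self-contained Python file. This is not the source of `src/map_wc.py` or `src/red_wc.py`, which the transcript does not show. It is a minimal sketch under stated assumptions: the token pattern is inferred from the mapper output above, the tab separator between word and count is assumed (the usual streaming convention), and the names `TOKEN`, `map_wc`, and `red_wc` are hypothetical. The reducer here relies on sorted input and groups adjacent keys, so it emits words in sorted order, unlike the unordered reducer output in the transcript.

```python
import re
from itertools import groupby

# Assumed token pattern: runs of letters/digits, optionally joined by hyphens
# (inferred from tokens such as "2010-07-19" and "ceteri-mapred" above).
TOKEN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def map_wc(lines):
    """Map step: emit one 'word<TAB>1' record per token in each line."""
    for line in lines:
        for word in TOKEN.findall(line.lower()):
            yield f"{word}\t1"


def red_wc(sorted_records):
    """Reduce step: sum the counts of adjacent records that share a word.

    Requires input sorted by word, which is what `sort` provides in the shell.
    """
    pairs = (record.split("\t") for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(int(count) for _, count in group)


# The README lines shown by `head README` in the transcript above.
readme_head = [
    "## Getting Started on Hadoop",
    "## Paco Nathan <ceteri@gmail.com>",
    "##",
    "## Silicon Valley Cloud Computing Meetup",
    "## http://www.meetup.com/cloudcomputing/calendar/13911740/",
    "## Mountain View, 2010-07-19",
    "GitHub src repo:",
    "http://github.com/ceteri/ceteri-mapred",
]

mapped = list(map_wc(readme_head))   # equivalent of: python src/map_wc.py
shuffled = sorted(mapped)            # equivalent of: sort
counts = dict(red_wc(shuffled))      # equivalent of: python src/red_wc.py

for word, total in counts.items():
    print(f"{word}\t{total}")
```

On this input the sketch produces the same totals as the transcript, for example 3 for `com` and 2 each for `ceteri`, `github`, `http`, and `meetup`. Grouping adjacent keys keeps the reducer's memory use constant per key, which is why the `sort` stage sits between the two scripts.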