cat /data/anal/ford/ford_tweets | cut -f 3 | egrep -i "((FORD|MERCURY).*(FIESTA|MUSTANG|FUSION|TAURUS|FLEX|EDGE|ESCAPE|EXPEDITION|SPORT.*TRAC|EXPLORER|MILAN|MARINER|MOUNTAINEER|GRANDE?.*MARQUIS|TRACER)|#MAZDA|#FORDDRIVE|(@|#)?WEDDINGROADTRIP|(@|#)?INVISIBLEPEOPLE|(@|#)?PLAIDNATION)" | wc -l
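
For readers who prefer to stay in Ruby, the pipeline above can be approximated as follows. This is only a sketch: the pattern is a shortened stand-in for the full expression, the sample lines are made up, and the helper names `third_field` and `count_matches` are hypothetical.

```ruby
# Mimic `cut -f 3`: return the third tab-separated field of a line.
# Like cut, a line with no tab at all is passed through unchanged.
def third_field(line)
  line = line.chomp
  return line unless line.include?("\t")
  line.split("\t", -1)[2].to_s
end

# Mimic `egrep -i PATTERN | wc -l`: count lines whose third field matches.
def count_matches(lines, pattern)
  lines.count { |line| third_field(line) =~ pattern }
end

# Shortened, illustrative pattern (not the full expression from the command).
SHORT_PATTERN = /(FORD|MERCURY).*(FIESTA|MUSTANG)|#MAZDA|#FORDDRIVE/i

# Made-up sample records: id, user, tweet text.
sample = [
  "1\tuser_a\tJust test drove the new Ford Fiesta\n",
  "2\tuser_b\tnothing to see here\n",
  "3\tuser_c\tloving my #mazda\n",
  "4\tford mustang\tthis line only mentions cars in field two\n",
]

puts count_matches(sample, SHORT_PATTERN)  # prints 2
```

Only the third field is tested, so the fourth sample line is not counted even though its second field names a model.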
  case
  when options[:map]
    mapper_klass.new(self.options).stream
+ when options[:reduce] && options[:reduce_command]
+   system options[:reduce_command]
  when options[:reduce]
    reducer_klass.new(self.options).stream
  when options[:run]
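
The placement of the added branch matters, because a subject-less `case` takes the first `when` whose condition is truthy. The sketch below illustrates that ordering in isolation; `choose_action` and the returned symbols are hypothetical and stand in for the real mapper, reducer and `system` calls.

```ruby
# Hypothetical helper: report which branch the dispatch would take.
# The order of the `when` clauses mirrors the patched code above.
def choose_action(options)
  case
  when options[:map]                                then :map
  when options[:reduce] && options[:reduce_command] then :reduce_command
  when options[:reduce]                             then :reduce
  when options[:run]                                then :run
  else :none
  end
end

p choose_action(reduce: true, reduce_command: "sort | uniq -c")  # prints :reduce_command
p choose_action(reduce: true)                                    # prints :reduce
```

If the plain `options[:reduce]` clause came first, it would also match when a reduce command is supplied, and the new branch would never run.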
#!/usr/bin/env ruby
require 'rubygems'
require 'wukong'
class LetterMapper < Wukong::Streamer::LineStreamer
  def map_text text
    h = { }
    idx = 0
<Keyspace Name="UserIDCache">
  <KeysCachedFraction>0.01</KeysCachedFraction>
  <ColumnFamily CompareWith="UTF8Type" Name="UserID"/>
  <ColumnFamily CompareWith="UTF8Type" Name="UserSearchID"/>
  <ColumnFamily CompareWith="UTF8Type" Name="UserScreenName"/>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
g++ -I XOPSupport -l Madlib.dll XOPStandardHeaders.h MCL_ScanningStage.h Madlib.h MCL_ScanningStage.c MCL_ScanningStageWinCustom.rc MCL_ScanningStage.rc
#!/usr/bin/env ruby
#
# Installing rsruby gem:
#
# sudo apt-get install r-base
# sudo gem install rsruby -- --with-R-dir=/usr/lib/R --with-R-include=/usr/share/R/include
# sudo ln -s /usr/lib/R/lib/libR.so /usr/lib/libR.so
# export R_HOME=/usr/lib/R
#
#
# Example azkaban job. Assumes you have two MR jobs to be run sequentially.
#
type=command
command=$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-*streaming*.jar -input /path/to/data -output /path/to/outputA -mapper mapperA.py -reducer reducerA.py
command.1=$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-*streaming*.jar -input /path/to/outputA -output /path/to/outputB -mapper mapperB.py -reducer reducerB.py
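
The job above chains its two stages by pointing the second command's `-input` at the first command's `-output`. A small Ruby sketch can generate the same file while keeping that chaining consistent. The helper names `streaming_command` and `azkaban_job` are hypothetical, and the paths and script names are the placeholders from the example, not real locations.

```ruby
# Hypothetical helper: build one Hadoop streaming command line.
def streaming_command(input:, output:, mapper:, reducer:)
  "$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-*streaming*.jar " \
    "-input #{input} -output #{output} -mapper #{mapper} -reducer #{reducer}"
end

# Chain stages so each one reads the previous stage's output directory.
# `stages` is a list of hashes with :output, :mapper and :reducer keys.
def azkaban_job(first_input, stages)
  input = first_input
  lines = ["type=command"]
  stages.each_with_index do |stage, i|
    # First command uses the bare key; later ones are numbered from 1.
    key = i.zero? ? "command" : "command.#{i}"
    lines << "#{key}=#{streaming_command(input: input, **stage)}"
    input = stage[:output]
  end
  lines.join("\n")
end

puts azkaban_job("/path/to/data", [
  { output: "/path/to/outputA", mapper: "mapperA.py", reducer: "reducerA.py" },
  { output: "/path/to/outputB", mapper: "mapperB.py", reducer: "reducerB.py" },
])
```

Generating the file this way means a renamed output directory only has to be changed in one place.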
4.2.145 [[0,{"latitude":"38.0000","country_code":"US","longitude":"-97.0000"}],[159,{"latitude":"38.0000","country_code":"US","longitude":"-97.0000"}],[160,{"household_income":"41506","percent_hispanic":"17.98","city":"Irving","percent_semi_permanent":"18.61","percent_asian":"14.45","percent_under_18":"20.06","latitude":"32.8791","per_capita_income":"28214","percent_bs_graduate":"16.34","area_code":"623","country_code":"US","zip_code":"75038","percent_below_poverty":"10.61","people_per_household":"2.0","housing_unit_value":"149500","percent_homeownership":"11.27","housing_units":"13305","work_travel_time":"3.1","percent_dual_race":"3.09","percent_pacific":"0.0","percent_white":"50.03","population":"25191","region_code":"TX","percent_hs_graduate":"16.34","percent_non_english":"36.09","percent_foreign":"28.9","percent_black":"23.75","percent_over_65":"1.87","metro_code":"972","longitude":"-96.9898","households":"12466","percent_under_5":"7.3","percent_native":"0.56","percent_female":"48.11"}],[191,{"household_i |
cat 200_twitspam_2.json | ruby -ne 'puts ["twitter_user_timeline_request", 15491144, 3, "0", "http://twitter.com/statuses/user_timeline/15491144.json?&page=1&count=200", 20100729, 200, "foobar", $_.strip].join("\t")' | ~/Programming/infochimps-data/social/network/twitter/base/parse/parse_twitter_api_requests.rb --map > 200_twitspam_parsed_2.tsv
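
The `ruby -ne` step in the one-liner above wraps each raw JSON line in a fixed set of leading fields so the parser receives a tab-separated request record. The same wrapping can be written as a small function, sketched here. The names `LEADING_FIELDS` and `wrap_line` are hypothetical, the field values are copied from the command, and no meaning is assumed for them beyond their position.

```ruby
# Leading fields copied verbatim from the one-liner; their meaning is
# defined by the downstream parser and is not assumed here.
LEADING_FIELDS = [
  "twitter_user_timeline_request",
  15491144,
  3,
  "0",
  "http://twitter.com/statuses/user_timeline/15491144.json?&page=1&count=200",
  20100729,
  200,
  "foobar",
].freeze

# Build one tab-separated record with the stripped input line as last field.
def wrap_line(line)
  (LEADING_FIELDS + [line.strip]).join("\t")
end

# Made-up input line, for illustration only.
record = wrap_line("  {\"id\": 1}\n")
puts record.split("\t").size  # prints 9
```

A response body that itself contains a tab character would split into extra fields, so this approach relies on the JSON lines being tab-free.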
bizmarketing4u | 49108829 | 0.094361630 | 1
BrowneBig570 | 50190727 | 0.144720420 | 1
Megan___Fox | 49509322 | 0.147560500 | 1
pen2netone | 47910899 | 0.064650595 | 1
dextradyoung | 47502802 | 0.146222870 | 2
mbainstitute | 41130608 | 0.864213050 | 2
mcspartan76 | 17034645 | 0.111582670 | 2
Opereur2u | 65572992 | 0.138023360 | 2
shaunaconway3 | 63460580 | 0.105175970 | 2
BrianaPitts | 69789022 | 0.163385990 | 3