Using t and csvkit to quickly collect and analyze #nicar16 tweets from the command line
The t command-line Twitter tool is a great way to get Twitter data into a form you can work with in a spreadsheet.
Its homepage, which has good installation instructions, is here: https://github.com/sferik/t
And I've written some related instructions about how to get an authentication token from Twitter:
http://www.compjour.org/tutorials/twitter-app-authentication-process/
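If you haven't installed t yet, the short version (assuming you have Ruby set up and have registered a Twitter app per the instructions above; the exact steps may differ on your system) looks roughly like this:
$ gem install t
$ t authorize
The authorize step will then walk you through connecting your app's keys to your Twitter account.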
Doing a basic query for a term
Once you have it installed and you're authenticated, you can do a basic search for Tweets like this:
$ t search all 'nicar16'
The default behavior is to present the tweets in a human-readable format:
@mailbackwards
Good morning Denver, I'm at #NICAR16. Find me and say hi (and then come to
our talk on Sunday)
@tbtprojx
RT @MarshallProj: And about building your own criminal justice data w
@ultracasual @gabrieldance, @kenandavis + more at 3:30
https://t.co/mTNK1a1Xox #NICAR16
@sickmund
RT @MarshallProj: And about building your own criminal justice data w
@ultracasual @gabrieldance, @kenandavis + more at 3:30
https://t.co/mTNK1a1Xox #NICAR16
@tbtprojx
RT @MarshallProj: #NICAR16: Learn how to keep those news apps skills sharp at
11:30, with @gabrieldance http://bit.ly/1nwH4Zd
@rdmurphy
RT @A_L: Want to learn how to work with satellite data? @esagara and I will
be sharing our secrets today at 11:30 #NICAR16
Getting data in CSV format
But you can get them in CSV format using the --csv flag:
$ t search all 'nicar16' --csv
ID | Posted at | Screen name | Text |
---|---|---|---|
707951040855982080 | 2016-03-10 15:27:35 +0000 | MaiAndy | RT @nkhensley: Saturday. #NICAR16 https://t.co/IBbqmP8KIo |
707950349508739072 | 2016-03-10 15:24:51 +0000 | ashlynstill | RT @Lindzcook: Join @ashlynstill and me in Denver 4 at 9am to learn programming concepts using fun games! Great place to start for newcomers #NICAR16 |
707950090355216384 | 2016-03-10 15:23:49 +0000 | karanormal | It's a beautiful day to live in Denver... Because #NICAR16. |
707949741179428864 | 2016-03-10 15:22:26 +0000 | HBCompass | Starting off #NICAR16 by tilting off a bench just in case everyone didn't know I'm awkward as hell. https://t.co/9HJ1Z6lvFT |
707949606831665153 | 2016-03-10 15:21:53 +0000 | nkhensley | Saturday. #NICAR16 https://t.co/IBbqmP8KIo |
707949340040548352 | 2016-03-10 15:20:50 +0000 | AlexSecanove | RT @biologypartners: Investigative journalists & data miners: welcome to Colorado. There are some exciting data analytics startups here for you to meet. #NICAR16 |
707949060238344193 | 2016-03-10 15:19:43 +0000 | natecarlisle | And @TonySemerad and I just landed at DEN. Next stop: #NICAR16 |
707949028881731585 | 2016-03-10 15:19:36 +0000 | michelleminkoff | Let #nicar16 officially begin -- my uniform is on! It's go time! https://t.co/K2Z2DIfu04 |
707948651151122433 | 2016-03-10 15:18:06 +0000 | ryanngro | My sixth NICAR conf and the first where I fell asleep before midnight on the first night. Losing my touch. #NICAR16 |
707948445131268096 | 2016-03-10 15:17:17 +0000 | 1GKh | RT @FerretScot: If you're interested in investigative journalism it's worth keeping an eye on #NICAR16 as it unfolds |
707948358275444736 | 2016-03-10 15:16:56 +0000 | cjsinner | SUPER excited for my first #NICAR16 |
Getting the max number of tweet results
By default, 20 of the most recent tweets are returned. You can change this with the -n flag; I believe the maximum number of results is capped at 3,200, or however many tweets have been posted with the queried term in the last 7 days, whichever is fewer.
And of course, you most likely want to redirect this output directly into a text file that you can open up in Excel or what have you:
$ t search all 'nicar16' --csv -n 3200 > nicar16tweets.csv
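A quick sanity check on what actually came down; these are just standard Unix tools, nothing t-specific (the line count is only a rough tweet count, give or take the header row and any multi-line tweet text):
$ wc -l nicar16tweets.csv
$ head -n 3 nicar16tweets.csv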
Searching more specific streams
The t search subcommand lets you narrow the query to just your own timeline (t search timeline 'nicar16') or even to a specific list. Run t search help to see the descriptions:
t search all QUERY # Returns the 20 most recent Tweets that match the specified query.
t search favorites [USER] QUERY # Returns Tweets you've favorited that match the specified query.
t search help [COMMAND] # Describe subcommands or one specific subcommand
t search list [USER/]LIST QUERY # Returns Tweets on a list that match the specified query.
t search mentions QUERY # Returns Tweets mentioning you that match the specified query.
t search retweets [USER] QUERY # Returns Tweets you've retweeted that match the specified query.
t search timeline [USER] QUERY # Returns Tweets in your timeline that match the specified query.
t search users QUERY # Returns users that match the specified query.
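As far as I can tell, these subcommands take the same flags as before, so you can mix and match. For example, to grab tweets from a Twitter list that mention 'panel' as CSV (the list name here, IRE_NICAR/nicar16, is just a made-up placeholder, and I'm assuming --csv behaves the same for list searches):
$ t search list IRE_NICAR/nicar16 'panel' --csv > list-tweets.csv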
Try csvkit
This is also a good time to try out csvkit, rather than using a spreadsheet.
Use csvcut with the -n flag to see the headers:
$ csvcut -n nicar16tweets.csv
1: ID
2: Posted at
3: Screen name
4: Text
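If you just want to eyeball the data without opening a spreadsheet at all, csvlook (also part of csvkit) renders the CSV as a table; piping through head keeps the preview short:
$ csvlook nicar16tweets.csv | head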
Here's how to get the most frequent users (by screen name) of the hashtag in the set of tweets you've downloaded:
$ csvcut -c 'Screen name' nicar16tweets.csv | sort | uniq -c | sort -rn
82 BizJournalism
20 MacDiva
19 ultracasual
18 Jeremy_CF_Lin
17 IRE_NICAR
15 tbtprojx
15 RajneeshB
14 palewire
13 brentajones
13 KateReports
13 DanielleAlberti
12 seecmb
12 benlkeith
12 KarrieKehoe
12 HacksHackersCO
11 livlab
11 dougfisher
10 wjchat
10 harrisj
9 onyxfish
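And since it's all plain text, you can keep chaining things together. For example, csvgrep (also part of csvkit) can pull out just the tweets from one of those prolific accounts; the screen name here is simply one picked from the list above:
$ csvgrep -c 'Screen name' -m 'palewire' nicar16tweets.csv | csvcut -c 'Text'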
A note about using Excel
If you need yet another example of why you should stay away from Excel (and any other spreadsheet, but mostly Excel on OS X) until you absolutely need a spreadsheet, here it is: if you're on OS X, opening up the CSV file produced by t will get you an inexplicable error.
The reason? When the first two characters in a file are ID, Excel decides it's a SYLK file and shits itself. It's hard to imagine the logic that went into the decision to hardcode ID as a magic word: https://support.microsoft.com/en-us/kb/215591
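If you absolutely have to open the file in Excel, one workaround (just a sketch; pick whatever filenames and header name you like) is to rename that leading ID header so Excel doesn't mistake the file for SYLK:
$ sed '1s/^ID/Tweet ID/' nicar16tweets.csv > nicar16tweets-excel.csv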