Andy Clarke andy-clarke-uofg

@andy-clarke-uofg
andy-clarke-uofg / twitter_query_language.md
Last active February 8, 2023 14:34
Twitter Query Language

🔍 Twitter Query Language

👀

These operators work on Web, Mobile, and TweetDeck.

Adapted from TweetDeck Help, the @lucahammer Guide, the @eevee Twitter Manual, @pushshift, and Twitter / TweetDeck itself. Contributions, tests, and examples welcome!

Class Operator Finds Tweets… Eg:
andy-clarke-uofg / Collecting-Tweets.ipynb
Last active September 21, 2022 06:02
Collecting Tweets from the Twitter API using tweepy.
lng,lat
-74.1179903701844,4.75055095371542
-74.1179903701844,4.75055095371542
-74.1179903701844,4.75055095371542
-74.1179903701844,4.75055095371542
-74.1179903701844,4.75055095371542
-46.632005,-23.519901
-46.725853,-23.548752
-46.615347,-23.650206
-46.508786,-23.482543
andy-clarke-uofg / random-brazil-coordinates.csv
Created April 14, 2022 12:50
Randomly Generated Coordinates in Brazil
lng,lat
-53.64057227283197,-15.150617321621077
-42.99421760836617,-11.145332942844876
-44.41791507114796,-8.157790364362803
-43.66038973901276,-21.148947704964765
-50.294873612911196,-7.517881971758156
-39.23700952309843,-9.933767804515153
-56.39064428015981,-4.383264074730363
-45.240887220743545,-13.103282452658972
-42.645435327462764,-21.757497469260148
andy-clarke-uofg / notebook.ipynb
Last active February 2, 2022 15:05
Reads a list of search queries as generated by Google Adwords Keyword Planner and performs a search for each query, returning the top 10 URLs for each query.
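The notebook itself is not shown here, so the following is only a minimal sketch of the workflow described above, not the author's code. It assumes the queries sit in a CSV column named `Keyword` (an assumed header) and takes the search backend as a caller-supplied function; `fake_search` is a hypothetical stand-in so that the example runs without network access.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def read_queries(lines: Iterable[str], column: str = "Keyword") -> List[str]:
    """Read non-empty queries from CSV text. `column` is an assumed header name."""
    reader = csv.DictReader(lines)
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def top_urls(
    queries: Iterable[str],
    search_fn: Callable[[str], Iterable[str]],
    limit: int = 10,
) -> Dict[str, List[str]]:
    """Run `search_fn` for each query and keep at most `limit` URLs per query."""
    return {query: list(search_fn(query))[:limit] for query in queries}


def fake_search(query: str) -> List[str]:
    """Hypothetical search backend: returns 15 placeholder URLs per query."""
    slug = query.replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(1, 16)]


# Two sample queries in the assumed CSV layout.
sample = io.StringIO("Keyword\nrunning shoes\ntrail shoes\n")
results = top_urls(read_queries(sample), fake_search)
```

Passing the search function in as a parameter keeps the loop independent of any particular search service; a real backend would replace `fake_search`.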
andy-clarke-uofg / trustpilot-scraping.ipynb
Last active January 11, 2022 10:20
Notebook for scraping reviews from Trustpilot
andy-clarke-uofg / customer-segmentation.csv
Created October 27, 2021 08:11
Customer Segmentation Example Dataset
Recency,MntWines,MntFruits,MntMeatProducts,MntFishProducts,MntSweetProducts,MntGoldProds,NumDealsPurchases,NumWebPurchases,NumCatalogPurchases,NumStorePurchases,NumWebVisitsMonth,Year_Birth,Education,Marital_Status,Income,Kidhome,Teenhome,Dt_Customer,AcceptedCmp3,AcceptedCmp4,AcceptedCmp5,AcceptedCmp1,AcceptedCmp2,Complain,Response,umap_cluster,month_name,weekday_name
58,635,88,546,172,88,88,3,8,10,4,7,1957-01-01T00:00:00Z,Graduation,Single,58138,0,0,2012-09-04T00:00:00Z,false,false,false,false,false,false,true,Cluster 1,January,Tuesday
38,11,1,6,2,1,6,2,1,1,2,5,1954-01-01T00:00:00Z,Graduation,Single,46344,1,1,2014-03-08T00:00:00Z,false,false,false,false,false,false,false,Cluster 12,January,Friday
26,426,49,127,111,21,42,1,8,2,10,4,1965-01-01T00:00:00Z,Graduation,Together,71613,0,0,2013-08-21T00:00:00Z,false,false,false,false,false,false,false,Cluster 2,January,Friday
26,11,4,20,10,3,5,2,2,0,4,6,1984-01-01T00:00:00Z,Graduation,Together,26646,1,0,2014-02-10T00:00:00Z,false,false,false,false,false,false,false,C
andy-clarke-uofg / 💉 VAERS 2021 | Cleaning & Joining.ipynb
Last active June 9, 2021 08:45
A notebook documenting the way a team at Graphext cleaned and joined data from the 2021 VAERS wave.
andy-clarke-uofg / graphext-vaccines-nlp-steps.txt
Created June 8, 2021 17:58
NLP Steps Added to VAERS Study Project
# Configure English as language of text
make_constant(ds["Symptom Description"], {
  "value": "en",
  "out_type": "category"
}) -> (ds.lang)
# Parse and extract ADJECTIVES from SYMPTOM DESCRIPTION column.
extract_keywords(ds["Symptom Description"], ds.lang, {
  "keywords": {
    "pos_tags": [
id,gender,age,hypertension,heart_disease,ever_married,work_type,Residence_type,avg_glucose_level,bmi,smoking_status,stroke
9046,Male,67,0,1,Yes,Private,Urban,228.69,36.6,formerly smoked,1
51676,Female,61,0,0,Yes,Self-employed,Rural,202.21,N/A,never smoked,1
31112,Male,80,0,1,Yes,Private,Rural,105.92,32.5,never smoked,1
60182,Female,49,0,0,Yes,Private,Urban,171.23,34.4,smokes,1
1665,Female,79,1,0,Yes,Self-employed,Rural,174.12,24,never smoked,1
56669,Male,81,0,0,Yes,Private,Urban,186.21,29,formerly smoked,1
53882,Male,74,1,1,Yes,Private,Rural,70.09,27.4,never smoked,1
10434,Female,69,0,0,No,Private,Urban,94.39,22.8,never smoked,1
27419,Female,59,0,0,Yes,Private,Rural,76.15,N/A,Unknown,1