Josh Strupp joshstrupp

  • ISL
  • Washington, DC
@joshstrupp
joshstrupp / Subreddit similarity score.py
Created July 22, 2020 18:45
Generate similarity scores between subreddits using LDA topic modeling
import gensim
from gensim.corpora import Dictionary
from gensim.models import ldamodel
from gensim.matutils import hellinger
from gensim.matutils import kullback_leibler
import pandas as pd
import praw
import nltk
from pprint import pprint
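The imports above suggest that each subreddit is reduced to an LDA topic distribution and that pairs are then compared with a distance such as Hellinger or Kullback-Leibler. As a minimal standalone sketch of that comparison step (not the gist's own code, and using plain Python rather than gensim), with the three topic mixtures below being hypothetical values chosen for illustration:

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)


def kl_divergence(p, q):
    """KL divergence D(p || q); assumes q is nonzero wherever p is nonzero."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


# Hypothetical topic mixtures for three subreddits (each sums to 1)
topics_a = [0.70, 0.20, 0.10]
topics_b = [0.65, 0.25, 0.10]
topics_c = [0.05, 0.15, 0.80]

similar = hellinger_distance(topics_a, topics_b)
different = hellinger_distance(topics_a, topics_c)
print(f"a vs b: {similar:.3f}, a vs c: {different:.3f}")
```

A smaller distance means the two subreddits put their weight on the same topics. Hellinger distance is symmetric and bounded, which makes it easier to read as a similarity score than KL divergence, which is asymmetric and unbounded.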
@joshstrupp
joshstrupp / Subreddit Sentiment & Keyword Analysis Script.py
Last active July 22, 2020 18:41
Generating keyword and sentiment insights for select Subreddit(s)
import pandas as pd
import praw
import nltk
import random
from pprint import pprint
# Enter your own client_id, client_secret, username and password, or follow this quick start guide: https://github.com/reddit-archive/reddit/wiki/OAuth2-Quick-Start-Example#first-steps
reddit = praw.Reddit(
    user_agent='Comment Extraction (by /u/USERNAME)',
    client_id='enter_here',
    client_secret='enter_here',
    username='enter_here',
    password='enter_here',
)
from textblob import TextBlob
@joshstrupp
joshstrupp / About-ISLX.md
Last active October 25, 2019 16:18
ISL Experiments ReadMe

About Experiments

What are Experiments?

Experiments defined: internally-produced proofs of concept, prototypes, and products.

Why do we do them?

  1. Learn new tricks: To highlight (and expand) ISL’s capabilities.
  2. Stay sharp: Exercise the collective idea muscle of ISL.
  3. Reinforce inventive spirit: Create morale around collaborative passion projects.
  4. Maintain talent appeal: Recruit the best of the best.
  5. Mega-boost biz dev: Create additional case studies and relevant work for pitches, leads, and general marketing.
from roku import Roku
import time
import random

# Read one word per line into a list
with open("/Users/josh/Desktop/1-1000.txt") as f:
    wordlist = []
    for line in f:
        wordlist.append(line.strip())
@joshstrupp
joshstrupp / NFL-win-loss-predictor-based-on-game-stats.py
Last active June 18, 2020 13:58
NFL-win-loss-predictor-based-on-game-stats
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn import metrics
from math import exp
import numpy as np
import matplotlib.pyplot as plt
nfl2000 = pd.read_csv('nfl2000stats.csv', sep=',') #13-3
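The `from math import exp` import hints that the script converts a logistic regression's log-odds into a win probability. A minimal sketch of that conversion follows; the intercept, coefficient, and rushing-yards value are hypothetical numbers for illustration, not fitted values from the gist:

```python
from math import exp


def win_probability(log_odds):
    """Logistic function: map log-odds onto a probability between 0 and 1."""
    return 1 / (1 + exp(-log_odds))


# Hypothetical model: one intercept and one coefficient (not fitted values)
intercept = -0.5
coef_rush_yds = 0.01

# Log-odds for a hypothetical game with 150 rushing yards
log_odds = intercept + coef_rush_yds * 150
p = win_probability(log_odds)
print(f"log-odds = {log_odds:.2f}, win probability = {p:.3f}")
```

Log-odds of 0 correspond to a 50% win probability; positive values push the prediction toward a win and negative values toward a loss.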
@joshstrupp
joshstrupp / gist:6c8b4fad0719d4877d56
Created May 18, 2015 02:46
Titans.py Question - Log. Regression Model May 17
nfl = pd.concat([nfl2000, nfl2001, nfl2002, nfl2003, nfl2004, nfl2005, nfl2006, nfl2007, nfl2008, nfl2009, nfl2010, nfl2011, nfl2012, nfl2013], axis=0)
nfl['WinLoss'] = np.where(nfl.ScoreOff > nfl.ScoreDef, 1, 0)
nfl.columns
feature_cols = ['Date', 'FirstDownDef', 'FirstDownOff', 'FumblesDef', 'FumblesOff', 'Line', 'Opponent', 'PassAttDef', 'PassAttOff', 'PassCompDef', 'PassCompOff', 'PassIntDef', 'PassIntOff', 'PassYdsDef', 'PassYdsOff', 'PenYdsDef', 'PenYdsOff', 'PuntAvgOff', 'RushAttDef', 'RushAttOff', 'RushYdsDef', 'RushYdsOff', 'SackNumDef', 'SackNumOff', 'SackYdsDef', 'SackYdsOff', 'ScoreDef', 'ScoreOff', 'Site', 'TeamName', 'ThirdDownPctDef', 'ThirdDownPctOff', 'TimePossDef', 'TimePossOff', 'TotalLine', 'Totalline', 'Totalline ', 'WinLoss']
X = nfl[feature_cols]
y = nfl['WinLoss']  # target: the win/loss indicator created above
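One thing worth checking in the snippet above: `WinLoss` is computed directly from `ScoreOff` and `ScoreDef`, so keeping any of those three columns in `feature_cols` lets the model read the answer straight from its inputs. A minimal sketch of filtering them out, using a hypothetical short subset of the column list to keep the example small:

```python
# Hypothetical subset of the columns listed above, kept short for illustration
all_cols = ['FirstDownOff', 'RushYdsOff', 'PassYdsOff', 'ScoreDef', 'ScoreOff', 'WinLoss']

target = 'WinLoss'
# Columns the target is derived from; using them as features leaks the outcome
leaky = {'ScoreOff', 'ScoreDef'}

feature_cols = [c for c in all_cols if c != target and c not in leaky]
print(feature_cols)
```

The same filter can be applied to the full column list. Any columns that hold text rather than numbers would also need to be dropped or encoded before fitting, since `LogisticRegression` expects numeric input.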
@joshstrupp
joshstrupp / Pandas Homework
Created March 30, 2015 21:28
Pandas Homework
'''
Part 1
Load the data (https://raw.githubusercontent.com/justmarkham/DAT5/master/data/auto_mpg.txt)
into a DataFrame. Try looking at the "head" of the file in the command line
to see how the file is delimited and how to load it.
Note: You do not need to turn in any command line code you may use.
'''
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
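Part 1 asks how the file is delimited. Besides looking at the head of the file on the command line, the delimiter can be guessed from the first line in code. A minimal sketch follows; the helper name `guess_delimiter` and the sample header line are hypothetical, and the real file should be inspected rather than assumed to match:

```python
def guess_delimiter(first_line, candidates=(",", "\t", "|", ";")):
    """Return the candidate delimiter that appears most often in the line."""
    counts = {d: first_line.count(d) for d in candidates}
    return max(counts, key=counts.get)


# Hypothetical header line, for illustration only
sample_header = "mpg|cylinders|displacement|horsepower"
delimiter = guess_delimiter(sample_header)
print(repr(delimiter))
```

The result can then be passed to `pd.read_csv` through its `sep` parameter when loading the data into a DataFrame.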