GitHub Gists for Ryan Wesslen (wesslen)
@wesslen
wesslen / min-char-rnn.py
Created January 28, 2016 22:28 — forked from karpathy/min-char-rnn.py
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
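The preview stops after computing the data and vocabulary sizes; the full gist continues by building character-to-index lookups. A minimal sketch of that next step (using an inline string as a stand-in for the contents of input.txt):

```python
# stand-in for data = open('input.txt', 'r').read()
data = "hello"
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)

# map each unique character to an integer index, and back again;
# these dictionaries are what the rest of the RNN uses to encode text
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}
```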
@wesslen
wesslen / TwitterConvertEST.R
Created July 11, 2016 14:48
Convert (Gnip) Twitter Date-Time GMT Character to Eastern Standard Time
# from http://stackoverflow.com/questions/4696842/convert-twitter-timestamp-in-r
# assume original field is postedTime
str <- strptime(postedTime, "%Y-%m-%dT%H:%M:%S", tz = "GMT")
dt.gmt <- as.POSIXct(str, tz = "GMT")
postedTime <- format(dt.gmt, tz = "EST", usetz = TRUE)
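For readers working outside R, a hypothetical Python equivalent of the same conversion, using only the standard library: parse a Gnip postedTime string as GMT, then render it in a fixed-offset EST (UTC-5), mirroring R's `tz = "EST"`:

```python
from datetime import datetime, timedelta, timezone

# fixed-offset EST, matching R's tz = "EST" (no daylight-saving logic)
EST = timezone(timedelta(hours=-5), "EST")

def posted_time_to_est(posted_time: str) -> str:
    # parse the Gnip timestamp and mark it as UTC/GMT
    dt_gmt = datetime.strptime(posted_time, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # shift to EST and format with the timezone abbreviation
    return dt_gmt.astimezone(EST).strftime("%Y-%m-%d %H:%M:%S %Z")
```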
@wesslen
wesslen / LDAvisJsonQuanteda.R
Created July 22, 2016 22:32
Convert LDA (topicmodels) output in R to json using R quanteda package for LDAvis
topicmodels_json_ldavis <- function(fitted, dfm, dtm){
  # Required packages
  library(topicmodels)
  library(dplyr)
  library(stringi)
  library(quanteda)
  library(LDAvis)
  # Find required quantities
from pyspark.sql import SQLContext
from pyspark.sql.functions import substring
sqlContext = SQLContext(sc)
jobDir = "/user/rwesslen/tweets/Pres_Tweets/"
jobName = "presTweet"
tweets = sqlContext.read.format('json').load([jobDir + "*.json"])
tweets.coalesce(1).toJSON().saveAsTextFile(jobDir + jobName)
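The Spark job above reads every *.json file in a directory and writes the records back out as a single JSON-lines output. A plain-Python sketch of the same idea, for reference (directory and job names are illustrative, not part of the original gist):

```python
import glob
import json
import os

def combine_json(job_dir: str, job_name: str) -> str:
    """Read all *.json files in job_dir and write one combined JSON-lines file."""
    out_path = os.path.join(job_dir, job_name + ".jsonl")
    with open(out_path, "w") as out:
        for path in sorted(glob.glob(os.path.join(job_dir, "*.json"))):
            with open(path) as f:
                for line in f:
                    if line.strip():
                        # re-serialize each record on its own line,
                        # as coalesce(1).toJSON().saveAsTextFile would
                        out.write(json.dumps(json.loads(line)) + "\n")
    return out_path
```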
@wesslen
wesslen / hashtag-filter.py
Created March 16, 2017 17:05
PySpark Gnip Filter by Hashtag
# filters all tweets that mention the hashtag #lovetrumpshate from the data frame tweets
from pyspark.sql.functions import array_contains
activities = tweets.filter(array_contains(tweets.twitter_entities.hashtags.text, "lovetrumpshate"))
activities.count()
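What `array_contains` does above can be mirrored on plain Python dicts shaped like Gnip activities; a small sketch (field names follow the Spark snippet, the sample tweets are made up):

```python
def has_hashtag(tweet: dict, tag: str) -> bool:
    # twitter_entities.hashtags is a list of {"text": ...} objects in Gnip JSON
    hashtags = tweet.get("twitter_entities", {}).get("hashtags", [])
    return any(h.get("text") == tag for h in hashtags)

tweets = [
    {"twitter_entities": {"hashtags": [{"text": "lovetrumpshate"}]}},
    {"twitter_entities": {"hashtags": [{"text": "maga"}]}},
]

# keep only tweets mentioning the target hashtag
activities = [t for t in tweets if has_hashtag(t, "lovetrumpshate")]
```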
@wesslen
wesslen / TwitterGnipParseJson.R
Created June 3, 2017 18:26
Code to parse Gnip JSON to R Dataframe using streamR functions
# install the streamR package the first time -- no need if you already have it installed
#install.packages("streamR")
library(streamR)
# functions
readGnipTweets <- function(tweets, verbose=TRUE){
  ## checking input is correct
  if (is.null(tweets)){
    stop("Error: you need to specify file or object where tweets text was stored.")
  }
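The streamR-based parser above flattens Gnip activity-stream JSON into a data frame. A minimal Python sketch of the same idea, for comparison (the field names `actor.id`, `postedTime`, and `body` assume the standard Gnip activity-stream format):

```python
import json

def parse_gnip_lines(lines):
    """Parse newline-delimited Gnip activity JSON into a list of flat records."""
    rows = []
    for line in lines:
        if not line.strip():
            continue
        activity = json.loads(line)
        rows.append({
            "actor_id": activity.get("actor", {}).get("id"),
            "postedTime": activity.get("postedTime"),
            "body": activity.get("body"),
        })
    return rows
```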
@wesslen
wesslen / r-twitter-example.R
Last active September 1, 2018 02:49
RStudio Tidyverse Docker Demo
# Step 1: Go to http://play-with-docker.com and create an instance (you will need to sign up for a Docker username/pwd)
# Step 2: Run "docker run -e PASSWORD=<YOUR_PASS> -p 8787:8787 rocker/tidyverse". NOTE: <YOUR_PASS> equals a unique password you set.
# Step 3: Click 8787 link to open in browser. Copy token and press ok. (username/pwd rstudio/<YOUR_PASS>)
# Step 4: Download this file by running:
# download.file("https://gist.githubusercontent.com/wesslen/ae9aca04b491a064764b13239fb17489/raw/8c35e746585f719c62e0437ec095a23c21c44ccb/r-docker.R", destfile = "r-docker.R")
# call tidyverse -- if you get an error, check that the tidyverse package is installed
library(tidyverse)
# load tweets
@wesslen
wesslen / r-shiny.R
Last active August 13, 2017 20:24
Shiny demos on Docker
# Step 1: Go to http://play-with-docker.com and create an instance
# Step 2: Run "docker run -d -p 3838:3838 -p 8787:8787 -e ADD=shiny rocker/rstudio"
# The 2nd part of Step 2 runs silently and takes about 3 minutes to install shiny-server
# Step 3: Click 8787 link to open in browser. Copy token and press ok. (username/pwd rstudio/rstudio)
shiny::runExample()
@wesslen
wesslen / bumper-sticker-experiment.R
Last active January 10, 2018 19:04
Code to run Twitter Bumper Sticker Experiment using twitteR and tweetscores
# install required packages (only need to run once)
# devtools::install_github("pablobarbera/twitter_ideology/pkg/tweetscores")
# install.packages("twitteR")
library(tweetscores); library(twitteR)
# take one full day 1% streaming data for Sept 28, 2017 -- 2.58MM unique users for 3,423,287 tweets
# for an example of how to pull 1% streaming using streamR package,
# see https://github.com/wesslen/summer2017-socialmedia/blob/master/day1/twitter-streaming.Rmd
# the id file only needs to include the user profile (actor.id) of the users you want to ping
id <- readr::read_csv("./data/userid20170928.csv")
@wesslen
wesslen / mongolite-datalake.R
Created February 6, 2018 17:27
R Code to connect to UNCC Vis DataLake (MongoDB) via mongolite
# install mongolite if you do not have it
# see https://github.com/jeroen/mongolite
#install.packages("mongolite")
library(mongolite)
# replace with name of database -- change last folder to database (e.g., "gab")
mongoUrl <- "mongodb://datalake:27017/gab"
# change col to your collection
col <- "id"