Tyler Rinker trinker

@trinker
trinker / look_and_say_sequence.R
Last active September 15, 2022 15:35
look and say sequence
# https://mathworld.wolfram.com/LookandSaySequence.html
library(ggplot2)
library(stringi)
tic <- Sys.time()
len <- 68
s <- rep(NA, len)
s[1] <- 1
tm <- rep(NA, len)
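The preview above stops before the loop that builds each term. As a rough sketch only, and not the gist's own R code, the look-and-say rule (read each run of identical digits aloud as "count, digit") might be written in Python as follows. The names `look_and_say_next` and `look_and_say` are hypothetical.

```python
from itertools import groupby


def look_and_say_next(term):
    """Describe each run of identical digits in `term` as '<count><digit>'."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def look_and_say(n_terms, seed="1"):
    """Return the first `n_terms` terms of the sequence, starting from `seed`."""
    if n_terms < 1:
        return []
    terms = [seed]
    while len(terms) < n_terms:
        terms.append(look_and_say_next(terms[-1]))
    return terms


print(look_and_say(6))
# ['1', '11', '21', '1211', '111221', '312211']
```

Terms are kept as strings because they grow quickly and are only ever read digit by digit.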
trinker / topicmodels2LDAvis
Created December 19, 2015 02:42
Convert topicmodels to LDAvis
#' Transform Model Output for Use with the LDAvis Package
#'
#' Convert a \pkg{topicmodels} output into the JSON form required by the \pkg{LDAvis} package.
#'
#' @param model A \code{\link[]{topicmodel}} object.
#' @param \ldots Currently ignored.
#' @seealso \code{\link[LDAvis]{createJSON}}
#' @export
#' @examples
#' \dontrun{
prob <- unlist(lapply(1:1000000, function(i){
    candles <- sort(runif(2, 0, 1))
    cut <- runif(1, 0, 1)
    (cut > candles[1]) & (cut < candles[2])
}))
mean(prob)
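The simulation above estimates the probability that one uniform draw lands between two other uniform draws. By symmetry each of three independent draws is equally likely to be the middle one, so the estimate should sit near 1/3. A hedged Python sketch of the same Monte Carlo idea follows. The function name, the seed, and the smaller trial count are assumptions made for illustration.

```python
import random


def estimate_between_probability(n_trials=100_000, seed=42):
    """Estimate P(cut falls strictly between two independent uniform points)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        low, high = sorted((rng.random(), rng.random()))
        cut = rng.random()
        if low < cut < high:
            hits += 1
    return hits / n_trials


estimate = estimate_between_probability()
print(estimate)  # expected to be close to 1/3
```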
trinker / aes_geom_explore.R
Last active March 28, 2022 09:01
ggplot2: aesthetics and geoms exploration
In this post I have a few goals:
1. Become (re-)familiar with available geoms
2. Become (re-)familiar with aesthetic mappings in geoms (stroke, who knew?)
3. Answer these questions:
<ul>
<li>How often do various geoms appear and how often do they have required aesthetics?</li>
<li>How often do various aesthetics appear and how often are they required?</li>
<li>What geoms are most similar based on mappings?</li>
</ul>
## https://youtu.be/094y1Z2wpJg
library(tidyverse)
collatz <- function(x){
    v = c(x)
    i = 1
    while (v[i] != 1){
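The preview is cut off inside the `while` loop, so the body is not visible here. Assuming it applies the standard Collatz rule (halve even values, map odd values to 3n + 1, stop at 1), a minimal Python sketch could look like this. It is an illustration under that assumption, not a transcription of the gist.

```python
def collatz(start):
    """Return the Collatz sequence from `start` down to 1 (standard rule assumed)."""
    if start < 1:
        raise ValueError("start must be a positive integer")
    values = [start]
    while values[-1] != 1:
        current = values[-1]
        # Even values are halved; odd values become 3n + 1.
        values.append(current // 2 if current % 2 == 0 else 3 * current + 1)
    return values


print(collatz(6))
# [6, 3, 10, 5, 16, 8, 4, 2, 1]
```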
trinker / optimal_k 2
Last active July 28, 2021 09:25
Find the optimal number of topics in a topic model using the harmonic mean of the log likelihood
#' Find Optimal Number of Topics
#'
#' Iteratively produces models and then compares the harmonic mean of the log likelihoods in a graphical output.
#'
#' @param x A \code{\link[tm]{DocumentTermMatrix}}.
#' @param max.k Maximum number of topics to fit (start small [i.e., default of 30] and add as necessary).
#' @param burnin Object of class \code{"integer"}; number of omitted Gibbs iterations at beginning, by default equals 0.
#' @param iter Object of class \code{"integer"}; number of Gibbs iterations, by default equals 2000.
#' @param keep Object of class \code{"integer"}; if a positive integer, the log-likelihood is saved every keep iterations.
#' @param method The method to be used for fitting; currently \code{method = "VEM"} or \code{method= "Gibbs"} are supported.
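The function body is not shown in this preview. As a sketch of the quantity the description names, the harmonic mean of the likelihoods can be computed from saved log-likelihoods without leaving log space: log H = log N - log(sum of exp(-ll_i)). The helper name `log_harmonic_mean` is hypothetical, and how the gist itself handles numerical precision is not visible here.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of the likelihoods exp(ll_i), computed stably."""
    if not log_likelihoods:
        raise ValueError("need at least one log-likelihood")
    negated = [-ll for ll in log_likelihoods]
    peak = max(negated)  # subtract the largest exponent to avoid overflow
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in negated))
    return math.log(len(negated)) - log_sum


# Likelihoods 1 and 2 have harmonic mean 4/3.
print(log_harmonic_mean([0.0, math.log(2.0)]))
```

Under this approach one value would be computed per candidate number of topics, and the k with the largest value would be preferred.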
trinker / generalized rescaling
Last active May 12, 2021 00:19
generalized rescaling
general_rescale <- function(x, lower, upper){
    rng <- range(x, na.rm = TRUE, finite = TRUE)
    if (diff(rng) == 0) return(stats::setNames(rep(upper, length(x)), names(x)))
    (x - rng[1])/diff(rng) * diff(range(c(lower, upper))) + lower
}
x <- c(NA, 1:10)
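For comparison, the same linear rescaling can be sketched in Python: shift by the observed minimum, divide by the observed span, stretch to the target width, and add `lower`. Using `None` in place of R's `NA` and raising an error when no finite value exists are assumptions of this sketch, not behaviour taken from the gist.

```python
import math


def general_rescale(values, lower, upper):
    """Linearly map the finite values onto [lower, upper]; None entries stay None."""
    finite = [v for v in values if v is not None and math.isfinite(v)]
    if not finite:
        raise ValueError("no finite values to rescale")
    low, high = min(finite), max(finite)
    span = high - low
    if span == 0:
        # Mirrors the R code: a constant input maps every position to `upper`.
        return [upper] * len(values)
    width = abs(upper - lower)
    return [None if v is None else (v - low) / span * width + lower for v in values]


x = [None] + list(range(1, 11))
print(general_rescale(x, 0, 1))
```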
trinker / fgsub.py
Last active March 29, 2021 19:02
textclean fgsub equivalent in python
import re

text = ['df dft sdf', 'sd fdggg sd dfhhh d', 'ddd']

def dbllttrwordrev(match):
    # Reverse the matched word and wrap it in << >>
    match = match.group()
    return '<<{}>>'.format(match[::-1])

{
    'function': [re.sub("\\b\\w*([a-z])(\\1{2,})\\w*\\b", dbllttrwordrev, x, flags = re.IGNORECASE) for x in text],
    'lambda': [re.sub("\\b\\w*([a-z])(\\1{2,})\\w*\\b", lambda x: '<<{}>>'.format(x.group()[::-1]) , x, flags = re.IGNORECASE) for x in text]
}
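The pattern can be wrapped in a small reusable helper. The sketch below defines a hypothetical `fgsub` function (the name echoes the R function the description mentions; it is not an existing Python API) that replaces every regex match with the result of calling a function on the matched text.

```python
import re


def fgsub(texts, pattern, fun, flags=0):
    """Replace each match of `pattern` with fun(matched_text) in every string."""
    compiled = re.compile(pattern, flags)
    return [compiled.sub(lambda m: fun(m.group()), t) for t in texts]


text = ['df dft sdf', 'sd fdggg sd dfhhh d', 'ddd']
# Words containing a letter repeated three or more times in a row
pattern = r"\b\w*([a-z])(\1{2,})\w*\b"
result = fgsub(text, pattern, lambda w: '<<{}>>'.format(w[::-1]), flags=re.IGNORECASE)
print(result)
# ['df dft sdf', 'sd <<gggdf>> sd <<hhhfd>> d', '<<ddd>>']
```

Passing the matched string rather than the match object keeps the callback simple, at the cost of hiding capture groups from it.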
trinker / likert_odd.R
Created February 16, 2018 05:51
Likert ggplot2 Odd Number of Responses
###############################
## Plotting Likert Type Data ##
###############################
##------------------------------------------------------------------------
## Note: Plotting horizontal stacked bar plots in ggplot2 with Likert type
## data is a non-trivial task. Stacking is not well defined for mixed
## negative/positive values on a bar. This requires splitting the data
## set into two different parts (positive/negative), plotting each side
## separately, and filling the colors manually. This script adds complexity
## for neutral scales.
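The splitting step described in the note can be illustrated without any plotting code. The Python sketch below is an assumption-heavy illustration and not the gist's method: it takes response counts ordered from most negative to most positive, and it assumes the neutral (middle) category is divided evenly between the two sides, which the preview does not confirm. The name `split_likert` is hypothetical.

```python
def split_likert(counts):
    """Split ordered response counts into a negative part and a positive part.

    `counts` runs from most negative to most positive and must have odd length.
    Assumption: the neutral middle count is shared half and half across zero.
    """
    if len(counts) % 2 == 0:
        raise ValueError("expected an odd number of response levels")
    middle = len(counts) // 2
    half_neutral = counts[middle] / 2
    # Negative-side bars are given negative heights so they extend left of zero.
    negative = [-c for c in counts[:middle]] + [-half_neutral]
    positive = [half_neutral] + list(counts[middle + 1:])
    return negative, positive


print(split_likert([5, 10, 20, 30, 15]))
# ([-5, -10, -10.0], [10.0, 30, 15])
```

Each part would then be drawn as its own stacked bar layer, with fill colors assigned manually.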
trinker / quanteda_wordcloud.R
Created July 8, 2020 19:11
Wordcloud with quanteda
## Load dependencies
library(quanteda)
library(sentimentr)
library(tidyverse)
library(lexicon)
## Data set from sentimentr package
dat <- presidential_debates_2012
dat
corp <- corpus(dat, text_field = "dialogue")