Skip to content

Instantly share code, notes, and snippets.

@rpietro
rpietro / lsa_hack.r
Created September 3, 2013 23:03
playing with lsa based on post from http://goo.gl/ds2c4p
# script stolen from http://goo.gl/YbQyAQ
# install.packages("tm")
# install.packages("ggplot2")
# install.packages("lsa")
# install.packages("scatterplot3d")
#install.packages("SnowballC")
#if !(require('SnowballC')) then install.packages("SnowballC")
library(tm)
library(ggplot2)
@aronwc
aronwc / lda.py
Last active April 30, 2024 06:54
Example using GenSim's LDA and sklearn
""" Example using GenSim's LDA and sklearn. """
import numpy as np
from gensim import matutils
from gensim.models.ldamodel import LdaModel
from sklearn import linear_model
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
import networkx as nx
import numpy as np
import itertools
## We define each S* motif as a directed graph in networkx
motifs = {
'S1': nx.DiGraph([(1,2),(2,3)]),
'S2': nx.DiGraph([(1,2),(1,3),(2,3)]),
'S3': nx.DiGraph([(1,2),(2,3),(3,1)]),
'S4': nx.DiGraph([(1,2),(3,2)]),
@ramnathv
ramnathv / Makefile
Last active January 16, 2021 13:47
R Markdown to IPython Notebook
example.md: example.Rmd
./knit
example.ipynb: example.md
notedown example.md | sed 's/%%r/%%R/' > example.ipynb
@wdullaer
wdullaer / install.sh
Last active April 2, 2024 20:33
Install Latest Docker and Docker-compose on Ubuntu
# Ask for the user password
# Script only works if sudo caches the password for a few minutes
sudo true
# Install kernel extra's to enable docker aufs support
# sudo apt-get -y install linux-image-extra-$(uname -r)
# Add Docker PPA and install latest version
# sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
# sudo sh -c "echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
@GaelVaroquaux
GaelVaroquaux / mutual_info.py
Last active June 18, 2023 12:25
Estimating entropy and mutual information with scikit-learn: visit https://github.com/mutualinfo/mutual_info
'''
Non-parametric computation of entropy and mutual-information
Adapted by G Varoquaux for code created by R Brette, itself
from several papers (see in the code).
This code is maintained at https://github.com/mutualinfo/mutual_info
Please download the latest code there, to have improvements and
bug fixes.
@tokestermw
tokestermw / visualizing_topic_models.py
Last active September 7, 2021 16:57
visualization topic models in four different ways
import json
import urlparse
from itertools import chain
flatten = chain.from_iterable
from nltk import word_tokenize
from gensim.corpora import Dictionary
from gensim.models.ldamodel import LdaModel
from gensim.models.tfidfmodel import TfidfModel
@sirkkalap
sirkkalap / Install-Docker-on-Linux-Mint.sh
Last active December 8, 2022 18:38
Install Docker on Linux Mint
##########################################
# To run:
# curl -sSL https://gist.githubusercontent.com/sirkkalap/e87cd580a47b180a7d32/raw/d9c9ebae4f5cf64eed4676e8aedac265b5a51bfa/Install-Docker-on-Linux-Mint.sh | bash -x
##########################################
# Check that HTTPS transport is available to APT
if [ ! -e /usr/lib/apt/methods/https ]; then
sudo apt-get update
sudo apt-get install -y apt-transport-https
fi
@karpathy
karpathy / min-char-rnn.py
Last active May 25, 2024 03:02
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)

This hit #rstats today:

Has anyone made a dumbbell dot plot in #rstats, or better yet exported to @plotlygraphs using the API? https://t.co/rWUSpH1rRl

— Ken Davis (@ken_mke) October 23, 2015
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

So, I figured it was worth a cpl mins to reproduce.

While the US gov did give the data behind the chart it was all the data and a pain to work with so I used WebPlotDigitizer to transcribe the points and then some data wrangling in R to clean it up and make it work well with ggplot2.

It is possible to make the top "dumbbell" legend in ggplot2 (but not by using a guide) and color the "All Metro A