@tyokota
tyokota / README.md
Last active May 2, 2016
2016 primary election

###### chart 1. 2016 Hawaii primary election results.

pairing national election data and maps is a hot mess.

Recently, I have come across a few choropleth maps that use 2016 primary election data to show, county by county, the relative proportion of voters for a single candidate. One example that stands out is a map of Bernie supporters in Alaska. The heavily colored map feels misleading considering that only ~600 Democrats participated. More people could fit in a Walmart on Black Friday.

Sometimes, a less sexy visualization tells an interesting story. My modest stacked bar graph shows the proportion of Republican and Democratic voters, which portrays Hawaii’s huge imbalance towards the Democratic Party, along with the relative proportion of votes each candidate garnered.
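As a rough illustration of the arithmetic behind such a stacked bar, the sketch below divides every candidate's count by the total across both parties, so the height of each party's bar reflects turnout and the segments reflect candidates. The function name `stacked_shares`, the party and candidate labels, and the counts are all placeholders invented for this example; they are not real election results and not the original chart's code.

```python
def stacked_shares(votes):
    """Return (party share, candidate share) of ALL votes cast.

    votes: {party: {candidate: count}}
    """
    total = sum(sum(counts.values()) for counts in votes.values())
    # Height of each party's bar: that party's voters over all voters.
    party_share = {
        party: sum(counts.values()) / total for party, counts in votes.items()
    }
    # Height of each stacked segment: one candidate's votes over all voters.
    candidate_share = {
        (party, name): n / total
        for party, counts in votes.items()
        for name, n in counts.items()
    }
    return party_share, candidate_share


# Placeholder counts for illustration only (NOT real results).
votes = {
    "Party X": {"Candidate A": 60, "Candidate B": 20},
    "Party Y": {"Candidate C": 15, "Candidate D": 5},
}
party_share, candidate_share = stacked_shares(votes)
for party, share in party_share.items():
    print(f"{party}: {share:.0%} of all voters")
```

Using the overall total as the denominator, rather than each party's own total, is the design choice that keeps a small electorate from looking as large as a big one.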

Choropleth maps can be beautiful and effective, especially when making state-by-state comparisons. Other times, less

@tyokota
tyokota / README.md
Last active May 2, 2016
food radius

###### chart 1. my food radius.

food radius

“We purchase and consume 80% of our calories within 5 miles of our home” – Dr. Brian Wansink

The food radius map is the first exercise in Slim by Design: Mindless Eating Solutions for Everyday Life by Dr. Brian Wansink. Creating this map in R helped me understand my poor food choices.

Markers represent places that I regularly visit to eat; there are many other places not represented in this map. With that said, the food radius map still revealed just how inundated my neighborhood was with fast food. And if I were to think about the proportion of that 80% of caloric intake, it may be safe to say a majority of it comes from fast food places - yikes.
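One way to check which markers fall inside the 5-mile radius is a great-circle distance test. The sketch below uses the haversine formula with an approximate mean Earth radius; the function names, the `home` point, and the `places` coordinates are hypothetical placeholders, and this is not the code behind the original map.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(home, places, radius_miles=5.0):
    """Names of places no farther than radius_miles from home."""
    return [
        name
        for name, (lat, lon) in places.items()
        if haversine_miles(home[0], home[1], lat, lon) <= radius_miles
    ]


# Placeholder coordinates, chosen only to exercise the functions.
home = (0.0, 0.0)
places = {"near": (0.0, 0.05), "far": (0.0, 1.0)}
print(within_radius(home, places))
```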

@tyokota
tyokota / README.md
Last active May 6, 2016
generalized additive models

###### chart 1. 2016 Hawaii primary election results.

generalized additive model

"Silver bullet" and "predictive modeling" paired together in the same sentence? Do tell. The title sounded like the perfect link bait. Nonetheless, I delved into the article – and I am glad I did.

Mr. Kim Larsen's GAM: The Predictive Modeling Silver Bullet was well worth the read, as it introduced both the generalized additive model (GAM) and a feature selection technique based on information value (IV).
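For readers unfamiliar with information value, the sketch below uses the common weight-of-evidence formulation: for each bin of a feature, compare the share of events with the share of non-events, then sum (difference × log ratio) over the bins. It may differ in details from the article's implementation. The function name `information_value` and the `eps` smoothing term, added so that empty cells do not produce log(0), are assumptions of this example.

```python
import math
from collections import Counter


def information_value(bins, labels, eps=0.5):
    """IV of an already-binned feature against binary labels (1 = event).

    eps is an additive smoothing constant (an assumption of this sketch).
    With eps=0, every bin must contain both classes.
    """
    events, non_events = Counter(), Counter()
    for b, y in zip(bins, labels):
        (events if y == 1 else non_events)[b] += 1

    levels = set(events) | set(non_events)
    n_events = sum(events.values()) + eps * len(levels)
    n_non = sum(non_events.values()) + eps * len(levels)

    iv = 0.0
    for b in levels:
        p_event = (events[b] + eps) / n_events
        p_non = (non_events[b] + eps) / n_non
        woe = math.log(p_event / p_non)  # weight of evidence for this bin
        iv += (p_event - p_non) * woe
    return iv


# A feature whose bins split events and non-events evenly carries no information.
print(information_value(["a", "a", "b", "b"], [1, 0, 1, 0]))
```

Features can then be ranked by IV and the weakest ones dropped before fitting a model.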

I quickly appreciated Mr. Larsen's choice of data to elucidate his feature selection method. GAM literally choked when trying to fit the full data set. Despite the reduced training set, GAM still performed almost as well as flavor-of-the-month XGBoost. In fact, the var

View src.R
# thomasyokota[at]gmail.com
# project/purpose: CDC BRFSS in R for free.99
# DEPENDENCIES -----------------------------------------------------------------
# install pacman only if it is missing, then load the remaining packages
if (!requireNamespace('pacman', quietly = TRUE)) install.packages('pacman')
pacman::p_load(RCurl, foreign, downloader)
# DATA -------------------------------------------------------------------------
source_url("https://raw.githubusercontent.com/ajdamico/asdfree/master/Download%20Cache/download%20cache.R", prompt = FALSE, echo = FALSE)
# download ez-pz brought to you by anthony joseph damico [ajdamico@gmail.com]
@tyokota
tyokota / index.html
Created Jul 16, 2016
Haversine Distance
This file has been truncated, but you can view the full file.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
@tyokota
tyokota / m.html
Created Nov 6, 2018
2018 Polling Map.
This file has been truncated, but you can view the full file.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>leaflet</title>
<script>(function() {
// If window.HTMLWidgets is already defined, then use it; otherwise create a
// new object. This allows preceding code to set options that affect the
// initialization process (though none currently exist).
View concatenate_embeddings.py
import numpy as np

def load_embedding(embedding):
    print(f'Loading {embedding} embedding..')
    # Each line of an embedding file is "<word> <v1> <v2> ...".
    def get_coefs(word,*arr): return word, np.asarray(arr, dtype='float32')
    if embedding == 'glove':
        EMBEDDING_FILE = f'{FILE_DIR}/embeddings/glove.840B.300d/glove.840B.300d.txt'
        embeddings_index = dict(get_coefs(*o.split(" ")) for o in open(EMBEDDING_FILE, encoding="utf8"))
    elif embedding == 'wiki-news':
        EMBEDDING_FILE = f'{FILE_DIR}/embeddings/wiki-news-300d-1M/wiki-news-300d-1M.vec'
        # len(o) > 100 skips the short header line of the .vec file
        embeddings_index = dict(get_coefs(*o.split(" ")) for o in open(EMBEDDING_FILE, encoding="utf8") if len(o)>100)
    elif embedding == 'paragram':
View char_ngram.py
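import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

text_vectorizer = TfidfVectorizer(
    sublinear_tf=True,
    strip_accents='unicode',
    analyzer='word',
    token_pattern=r'\w{1,}',
    ngram_range=(1, 1),
    max_features=30000)
# Fit the vocabulary on train + test text, then only transform the training text.
# Calling fit_transform here would refit on train alone and discard the fit above.
text_vectorizer.fit(pd.concat([train['comment_text'], test['comment_text']]))
train_word_features = text_vectorizer.transform(train['comment_text'])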