Aki Matsuo amatsuo
@amatsuo
amatsuo / .crostini-setup
Created December 27, 2018 02:13 — forked from tjpalanca/.crostini-setup
Crostini Setup
These scripts set up Crostini on my c223na
amatsuo / spacyr_bench.R
Created March 27, 2019 10:13
Run and combine separate spacyr benchmarks for two versions
# Start from a clean session; .rs.restartR() is only available inside RStudio.
rm(list = ls())
.rs.restartR()
library(spacyr)
# One-time setup: install spaCy 2.0.18 into a dedicated conda environment.
# spacy_install(envname = "spacy_condaenv_2.0", version = "2.0.18")
spacy_initialize(condaenv = "spacy_condaenv_2.0", refresh_settings = TRUE)
data_text_irishbudget2010 <- quanteda::texts(quanteda::data_corpus_irishbudget2010)
bench_1 <- microbenchmark::microbenchmark(
  v2.0.18 = spacy_tokenize(data_text_irishbudget2010),
amatsuo / spacyr_tokenizer_scalability.R
Last active March 28, 2019 09:26
Scalability comparison of quanteda and spacyr tokenizers
library(quanteda)
library(spacyr)
library(tidyverse)
library(microbenchmark)
spacy_initialize()
data_text_irishbudget2010 <- texts(data_corpus_irishbudget2010)
data_text_irishbudget2010 <- unname(data_text_irishbudget2010)
library(tidyverse)
num_infected <- seq(1, 100)^2 # vector of infected counts
num_population <- 1e7 # number of Tokyo residents
num_shimuras <- c(100, 1000, 3000, 5000) # number of celebrities as famous as Shimura
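# Illustrative note, not part of the original gist: assuming each infected
# person is an independent draw from the population (an assumption the code
# below appears to make by using dbinom), the chance that at least one of
# k celebrities is among n infected residents out of N is
#   P(at least one) = 1 - (1 - k / N)^n,
# which equals 1 - dbinom(0, n, k / N).
# For example, n = 100, k = 1000, N = 1e7 gives roughly 0.00995.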
lapply(num_shimuras, function(num_celeb){
  data <- tibble(num_infected,
                 prob_pos_shimura = 1 - dbinom(0, num_infected,
                                               num_celeb / num_population),