@njtierney
Created March 21, 2018 05:51
``` r
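library(tidytext)
library(tidyverse)

# Fetch metadata for every package currently on CRAN (needs internet access)
cran_db <- tools::CRAN_package_db()
# Drop column 65 before converting to a tibble
# (assumption: it is likely a duplicated column name, which as_tibble() rejects)
cran_tbl <- tibble::as_tibble(cran_db[-65])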
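
# Keep only the package name and its free-text description
cran_tbl_short <- cran_tbl %>%
  select(Package, Description)

# One row per (package, word); unnest_tokens() lowercases tokens by default
tidy_desc <- cran_tbl_short %>%
  unnest_tokens(word, Description)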

# Remove common English stop words (joins on the shared "word" column)
data("stop_words")
cleaned_desc <- tidy_desc %>% anti_join(stop_words)
#> Joining, by = "word"
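
# Words that suggest a package deals with missing data.
# Tokens were lowercased above, so "NA" can never match; only "na" can.
mda_words <- c("miss",
               "missing",
               "NA",
               "na",
               "impute",
               "imputed",
               "imputes",
               "imputation",
               "imputations",
               "imputing")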
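
# Keep only the rows whose word is in the missing-data vocabulary
miss_pkgs <- cleaned_desc %>%
  group_by(Package) %>%
  filter(word %in% mda_words) %>%
  ungroup()

# Number of distinct packages whose description uses at least one such word
n_distinct(miss_pkgs$Package)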
#> [1] 274
```
Created on 2018-03-21 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).