@Btibert3
Created February 23, 2021 00:32
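library(tidyverse)           # glimpse() and %>%
library(quanteda)            # corpus(), tokens(), dfm(), kwic(), fcm()
library(quanteda.textplots)  # textplot_network() lives here as of quanteda 3
library(cluster)             # silhouette()

ms = readRDS("mission-statements.rds")
glimpse(ms)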
mscorp = corpus(ms$mission)
# metadoc() is no longer available in current quanteda; docvars() replaces it
docvars(mscorp, "unitid") = ms$unitid
ndoc(mscorp)
sum(ntoken(mscorp))
hist(ntoken(mscorp))
# current quanteda expects kwic() to be called on tokens, not on a corpus
kwic(tokens(mscorp), "education", window = 3)
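# tokenize, drop stopwords, and stem before building the dfm; passing
# stem/remove arguments to dfm() on a corpus is not supported in quanteda >= 3
msdfm = tokens(mscorp, remove_punct = TRUE, remove_symbols = TRUE) %>%
  tokens_remove(stopwords()) %>%
  tokens_wordstem() %>%
  dfm() %>%
  dfm_trim(min_termfreq = 5,
           max_docfreq = .6,
           docfreq_type = "prop") %>%
  dfm_tfidf()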
topfeatures(msdfm)
topfeats = names(topfeatures(msdfm, 50))
fcm(msdfm) %>%
  fcm_select(pattern = topfeats) %>%
  textplot_network()
mat = as.matrix(msdfm)
k5 = kmeans(mat, 5, nstart = 25)
k5$size
plot(silhouette(k5$cluster, dist=dist(mat)), col=1:5, border=NA)
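# A possible follow-up, not part of the original gist: a minimal sketch for
# checking whether k = 5 is a reasonable choice, by comparing the average
# silhouette width across a few candidate values of k. The range 2:8 and the
# names d, ks, and avg_sil are illustrative assumptions.
d = dist(mat)
ks = 2:8
avg_sil = sapply(ks, function(k) {
  km = kmeans(mat, k, nstart = 25)
  # silhouette() returns a matrix with a "sil_width" column per observation
  mean(silhouette(km$cluster, dist = d)[, "sil_width"])
})
plot(ks, avg_sil, type = "b", xlab = "k", ylab = "Average silhouette width")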