@tomer-ben-david
Last active January 27, 2018 15:06
Create a data frame in R and plot a histogram of text lengths by label #R
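library(ggplot2)  # provides ggplot(), geom_histogram(), theme_bw(), labs()
# Small example data frame: one label and one message per row.
df <- data.frame(x = c("spam", "spam", "ham"), y = c("some mail", "some other mail", "some third mail"))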
names(df) <- c("Label", "Text")
df$Label <- as.factor(df$Label)  # Make Label a factor so fill = Label is treated as a discrete grouping.
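df$TextLength <- nchar(as.character(df$Text))  # as.character() guards against Text being a factor (the data.frame default before R 4.0).
View(df)  # Opens the data viewer; intended for interactive sessions such as RStudio.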
ggplot(df, aes(x = TextLength, fill = Label)) + theme_bw() +
  geom_histogram(binwidth = 5) +
  labs(y = "Text Count", x = "Length of Text", title = "Distribution of text on labels")