How to plot a big dataset
# How to plot a BIG data set (600 million rows/values with 8555 keys)
# Use the following example (the built-in diamonds data set stands in for the real data).
library(dplyr)
library(ggplot2)
data(diamonds)
# plot density of different keys
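# linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0); use size on older versions
ggplot(diamonds, aes(x = depth)) +
  geom_line(aes(color = cut), stat = "density", linewidth = 0.4, alpha = 0.4)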
# extract different keys individually
Fair <- filter(diamonds, cut == "Fair")
Good <- filter(diamonds, cut == "Good")
VeryGood <- filter(diamonds, cut == "Very Good")
Premium <- filter(diamonds, cut == "Premium")
Ideal <- filter(diamonds, cut == "Ideal")
# plot each key one by one, adding one density layer per subset
# (linewidth needs ggplot2 >= 3.4.0; use size on older versions)
ggplot() +
  geom_line(data = Fair, aes(x = depth), stat = "density", linewidth = 0.4, alpha = 0.4, color = "red") +
  geom_line(data = Good, aes(x = depth), stat = "density", linewidth = 0.4, alpha = 0.4, color = "green") +
  geom_line(data = VeryGood, aes(x = depth), stat = "density", linewidth = 0.4, alpha = 0.4, color = "blue") +
  geom_line(data = Premium, aes(x = depth), stat = "density", linewidth = 0.4, alpha = 0.4, color = "black") +
  geom_line(data = Ideal, aes(x = depth), stat = "density", linewidth = 0.4, alpha = 0.4, color = "purple")
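# A possible generalisation (a sketch, not part of the original recipe):
# with thousands of keys, writing one filter() and one geom_line() per key by hand
# does not scale, so the same layers can be built in a loop.
# The names keys, key_colors, layers and p are illustrative assumptions.
# linewidth assumes ggplot2 >= 3.4.0; use size on older versions.
library(dplyr)
library(ggplot2)
data(diamonds)
keys <- levels(diamonds$cut)
key_colors <- setNames(c("red", "green", "blue", "black", "purple"), keys)
# build one density layer per key, each from its own filtered subset
layers <- lapply(keys, function(k) {
  geom_line(data = filter(diamonds, cut == k), aes(x = depth), stat = "density",
            linewidth = 0.4, alpha = 0.4, color = key_colors[[k]])
})
# a list of layers can be added to a plot in a single step
p <- ggplot() + layers
print(p)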