@srowley
Last active February 14, 2022 12:34
GGity Examples Livebook

GGity By Example - The Livebook

Dependencies

To use GGity in Livebook, we need to install both GGity and Kino, which renders image output in a livebook.

Mix.install([:ggity, :kino])

Scatter Plot

GGity ships with several sample datasets commonly used with the R language. We will use the mtcars dataset, which includes fuel consumption and other data describing design and performance for 32 automobiles (1973-74 models) from 1974 Motor Trend U.S. magazines.

Let's take a quick look at the dataset first.

data = GGity.Examples.mtcars()
Kino.DataTable.new(data)

Now let's explore the data with a scatterplot. We will structure it as a function so that we can toggle which variable goes on the x axis, which goes on the y axis, and more. This function returns a Plot struct that we can render or hang on to for further modification.

alias GGity.Plot

scatterplot = fn x_variable, y_variable, color_variable, color_palette, plot_title ->
  data
  |> Plot.new(%{x: x_variable, y: y_variable})
  |> Plot.geom_point(%{color: color_variable})
  |> Plot.scale_color_viridis(option: color_palette)
  |> Plot.labs(title: plot_title)
end

Plot.plot/1 returns an iolist, but Kino renders binary data. Let's define a utility function to which we can feed our Plot structs and get them rendered.

render = fn plot ->
  plot
  |> Plot.plot()
  |> to_string()
  |> Kino.Image.new(:svg)
end
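To see what the to_string() step in render is doing, here is a minimal sketch using a small hand-built iolist. The SVG fragment is made up for illustration and is not real GGity output; the point is only that nested lists of strings collapse into one binary.

```elixir
# Hand-built iolist standing in for plot output (not real GGity output).
svg_iolist = ["<svg>", ["<circle r=\"", Integer.to_string(5), "\"/>"], "</svg>"]

# to_string/1 flattens the nested chardata into a single binary, as in render above.
via_to_string = to_string(svg_iolist)

# IO.iodata_to_binary/1 performs the same flattening for iodata.
via_iodata = IO.iodata_to_binary(svg_iolist)

IO.puts(via_to_string)
```

Either call would likely work inside render; to_string/1 is simply what this livebook uses.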

Let's get some inputs.

x =
  Kino.Input.select("X Variable",
    wt: "wt",
    mpg: "mpg",
    qsec: "qsec",
    disp: "disp"
  )
  |> Kino.render()
  |> Kino.Input.read()

y =
  Kino.Input.select("Y Variable",
    wt: "wt",
    mpg: "mpg",
    qsec: "qsec",
    disp: "disp"
  )
  |> Kino.render()
  |> Kino.Input.read()

color =
  Kino.Input.select("Color Variable",
    cyl: "cyl",
    am: "am",
    gear: "gear"
  )
  |> Kino.render()
  |> Kino.Input.read()

palette =
  Kino.Input.select("Color Palette",
    viridis: "viridis",
    plasma: "plasma",
    magma: "magma",
    inferno: "inferno",
    cividis: "cividis"
  )
  |> Kino.render()
  |> Kino.Input.read()

title =
  Kino.Input.text("Plot Title", default: "Motor Trend")
  |> Kino.render()
  |> Kino.Input.read()

Kino.nothing()
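The X and Y inputs above repeat the same four options. If that bothers you, one possible approach is to build the keyword list from a list of column names. The to_options helper below is hypothetical and not part of Kino or GGity.

```elixir
# Hypothetical helper: turn a list of column names into {value, label} pairs,
# which is the option shape the select inputs above are given.
to_options = fn names ->
  Enum.map(names, fn name -> {name, Atom.to_string(name)} end)
end

axis_options = to_options.([:wt, :mpg, :qsec, :disp])

# axis_options is the same keyword list that was written out by hand above.
IO.inspect(axis_options)
```

The result could then be passed as the second argument to Kino.Input.select in place of the literal keyword list.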

mtcars_scatterplot = scatterplot.(x, y, color, palette, title)
render.(mtcars_scatterplot)

Now we will change the plot formatting to a lighter theme.

import GGity.Element.{Line, Rect, Text}

theme = [
  axis_line: element_line(color: "gray", size: 0.25),
  legend_key: element_rect(fill: "white"),
  panel_background: element_rect(fill: "white"),
  panel_grid: element_line(color: "lightgray"),
  panel_grid_major: element_line(size: 0.5)
]

mtcars_scatterplot
|> Plot.theme(theme)
|> render.()

render.(mtcars_scatterplot)

Bar Chart

Continuing with the cars theme, now we will explore the mpg dataset with a bar chart. This dataset describes city/highway mileage data for 234 vehicle records covering a range of makes and models.

Let's look at the data.

data = GGity.Examples.mpg()
Kino.DataTable.new(data)

Now we will create our bar chart; this one won't be configurable, but the process for making it so would be comparable to our approach in the scatterplot above.

In this example, we will plot the number of models in each class by manufacturer.

data
|> Enum.filter(fn record ->
  record["manufacturer"] in [
    "chevrolet",
    "audi",
    "ford",
    "nissan",
    "subaru",
    "toyota"
  ]
end)
|> Plot.new(%{x: "manufacturer"})
|> Plot.geom_bar(%{fill: "class"})
|> Plot.scale_y_continuous(labels: &floor/1)
|> Plot.labs(
  title: "Product Line Analysis",
  y: "Number of Models",
  x: "Manufacturer",
  fill: "Vehicle Class"
)
|> render.()

Audi, why do you hate big cars so much?
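For intuition about what the bar chart is counting, here is a rough sketch of the same tally done by hand. The records are made up and only mimic the shape of the mpg data, and this is not a description of how GGity computes counts internally.

```elixir
# Toy records shaped like the mpg dataset (values are made up for illustration).
records = [
  %{"manufacturer" => "audi", "class" => "compact"},
  %{"manufacturer" => "audi", "class" => "compact"},
  %{"manufacturer" => "audi", "class" => "midsize"},
  %{"manufacturer" => "ford", "class" => "suv"},
  %{"manufacturer" => "ford", "class" => "pickup"},
  %{"manufacturer" => "ford", "class" => "suv"}
]

# Count rows per {manufacturer, class} pair; each count corresponds to
# the height of one colored segment in a stacked bar.
counts = Enum.frequencies_by(records, fn r -> {r["manufacturer"], r["class"]} end)

IO.inspect(counts)
```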

Boxplots

Using the same data, let's draw some boxplots. In this example we will use static colors for some of the elements (instead of black, the default).

Here we will plot the distribution of highway mileage by vehicle class.

data
|> Plot.new(%{x: "class", y: "hwy"})
|> Plot.geom_boxplot(fill: "white", color: "#3366FF")
|> render.()

While the relative medians across classes are no surprise, it is interesting to compare the medians of subcompacts to compacts and midsize vehicles - not as different as one might assume, with a wide variety of highway mileage across the subcompact class.

Of course, the blue outline and white fill are great too.
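If we wanted to check the medians numerically rather than by eye, a sketch along these lines might do. MedianSketch is a hypothetical module written for this example, and the records are invented rather than taken from the mpg data.

```elixir
defmodule MedianSketch do
  # Median of a non-empty list of numbers.
  def median(values) do
    sorted = Enum.sort(values)
    count = length(sorted)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Median of `value_key` for each distinct value of `group_key`.
  def by_group(records, group_key, value_key) do
    records
    |> Enum.group_by(& &1[group_key], & &1[value_key])
    |> Map.new(fn {group, values} -> {group, median(values)} end)
  end
end

# Made-up records shaped like the mpg dataset.
records = [
  %{"class" => "compact", "hwy" => 29},
  %{"class" => "compact", "hwy" => 27},
  %{"class" => "compact", "hwy" => 31},
  %{"class" => "suv", "hwy" => 17},
  %{"class" => "suv", "hwy" => 19}
]

medians = MedianSketch.by_group(records, "class", "hwy")
IO.inspect(medians)
```

Passing the real dataset in place of the toy records should give one median per vehicle class.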

Line Chart

For our line chart example, we will use the economics dataset. This data describes certain economic indicators over the past several decades.

data = GGity.Examples.economics()
Kino.DataTable.new(data)

It is easy for us to plot a line for one of these variables, say, unemployment.

data
|> Plot.new(%{x: "date", y: "unemploy"})
|> Plot.geom_line()
|> render.()

Quick tangent - note the y-axis labels. The default Elixir format for printed floats is rarely satisfying for large numbers. We can fix that, and make the line dotted and purple while we are at it.

plot =
  data
  |> Plot.new(%{x: "date", y: "unemploy"})
  |> Plot.geom_line(color: "purple", linetype: :dotted)
  |> Plot.scale_y_continuous(labels: :commas)
  |> render.()

Here we used the name of a built-in labeling function, :commas (passed to Plot.scale_y_continuous/2), but any function that takes the label value as an argument and returns the desired label text will work.

to_thous = fn value -> "#{round(value / 1000)} thou" end

plot =
  data
  |> Plot.new(%{x: "date", y: "unemploy"})
  |> Plot.geom_line(color: "limegreen", linetype: :dotted)
  |> Plot.scale_y_continuous(labels: to_thous)
  |> render.()
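As another illustration of a custom label function, here is a hand-rolled comma formatter. This is a sketch of the general idea only; with_commas is a hypothetical name and this is not GGity's implementation of :commas.

```elixir
# Hypothetical label function: rounds the value and inserts a comma
# between each group of three digits, counting from the right.
with_commas = fn value ->
  value
  |> round()
  |> Integer.to_string()
  |> String.reverse()
  # After every three digits that are followed by another digit, add a comma.
  |> String.replace(~r/(\d{3})(?=\d)/, "\\1,")
  |> String.reverse()
end

IO.puts(with_commas.(12345.0))
```

Like to_thous, a function of this shape could be passed as the labels: option.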

Back to real work - there are a bunch of variables in this time series data; what if we want to see how each of them moved over time on the same plot?

Enter the economics_long dataset, which normalizes each observation to a value between zero and one, and presents that number in the value01 variable. The name of the variable is stored in the... variable variable.

Take a look:

data = GGity.Examples.economics_long()
Kino.DataTable.new(data)
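One common way to squeeze values into the zero-to-one range is min-max rescaling. The sketch below assumes that is roughly how value01 was produced, which the dataset itself does not confirm here; rescale01 is a hypothetical helper.

```elixir
# Hypothetical helper: min-max rescaling of a non-empty list of numbers.
rescale01 = fn values ->
  {min, max} = Enum.min_max(values)

  if max == min do
    # All values are equal, so avoid dividing by zero; 0.0 is an arbitrary choice.
    Enum.map(values, fn _ -> 0.0 end)
  else
    Enum.map(values, fn v -> (v - min) / (max - min) end)
  end
end

IO.inspect(rescale01.([2, 4, 6]))
```

Applied separately to each variable, this puts series with very different magnitudes on a shared y axis.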

This data shape allows us to assign a chart aesthetic (in this example, the color of the line) to the variable variable, and GGity will group those observations automatically.

data
|> Plot.new(%{x: "date", y: "value01", color: "variable"})
|> Plot.geom_line()
|> render.()
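To make the "long" shape concrete, here is a small sketch that reshapes wide records into long ones and then groups them by variable. The data is made up, and the grouping shown is only an illustration of the idea, not GGity's internal code.

```elixir
# Made-up wide records: one row per date, one column per indicator.
wide = [
  %{"date" => ~D[2000-01-01], "pop" => 100, "unemploy" => 5},
  %{"date" => ~D[2000-02-01], "pop" => 110, "unemploy" => 4}
]

# Long shape: one record per (row, non-date column) pair.
long =
  for record <- wide,
      {key, value} <- Map.delete(record, "date") do
    %{"date" => record["date"], "variable" => key, "value" => value}
  end

# Grouping on "variable" gives one series per line on the plot.
series = Enum.group_by(long, & &1["variable"], & &1["value"])

IO.inspect(series)
```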

We could use linetype instead of color if we desire, although in this case it does not seem desirable.

Since we are making the plot uglier in that regard, let's use custom date formatting to at least simplify the x axis.

data
|> Plot.new(%{x: "date", y: "value01", linetype: "variable"})
|> Plot.geom_line()
|> Plot.scale_x_date(date_labels: "%Y")
|> render.()
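The "%Y" string follows strftime conventions. As a quick way to preview what a pattern produces, the sketch below uses Calendar.strftime/2 from the Elixir standard library (Elixir 1.11 or later). That is an assumption made for illustration; GGity may use a different strftime implementation internally, and the date is arbitrary.

```elixir
# Arbitrary date chosen for illustration.
date = ~D[1967-07-01]

# Four-digit year only, as in the plot above.
year_only = Calendar.strftime(date, "%Y")

# Abbreviated month name followed by the year.
month_year = Calendar.strftime(date, "%b %Y")

IO.puts(year_only)
IO.puts(month_year)
```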