srowley/ggity_by_example.livemd

## ggity_by_example.livemd

      
GGity By Example - The Livebook

Dependencies

To use GGity in Livebook, we need to install GGity along with Kino, which renders image output in a livebook.
Mix.install([:ggity, :kino])
Scatter Plot

GGity ships with several sample datasets commonly used with the R language.
We will use the mtcars dataset, which includes fuel consumption and other data
describing the design and performance of 32 automobiles (1973-74 models), taken
from the 1974 Motor Trend US magazine.
Let's take a quick look at the dataset first.
data = GGity.Examples.mtcars()
Kino.DataTable.new(data)
Now let's explore the data with a scatterplot. We will structure it
as a function so that we can toggle which variable goes on the x axis,
which goes on the y axis, and more. This function returns a Plot
struct that we can render or hang on to for further modification.
alias GGity.Plot

scatterplot = fn x_variable, y_variable, color_variable, color_palette, plot_title ->
  data
  |> Plot.new(%{x: x_variable, y: y_variable})
  |> Plot.geom_point(%{color: color_variable})
  |> Plot.scale_color_viridis(option: color_palette)
  |> Plot.labs(title: plot_title)
end
Plot.plot/1 returns an iolist, but Kino.Image expects a binary. Let's define a utility function that takes one of our Plot
structs, converts the output to a string, and renders it as an SVG image.
render = fn plot ->
  plot
  |> Plot.plot()
  |> to_string()
  |> Kino.Image.new(:svg)
end
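To see why the conversion step matters, here is a minimal sketch that flattens a hand-written iolist into a single binary. The fragment is made up for illustration and is not actual GGity output, and `IodataSketch` is a hypothetical module name. `IO.iodata_to_binary/1` is the standard library call; for ASCII-only content like this it should give the same result as the `to_string/1` call used above.

```elixir
defmodule IodataSketch do
  # Flattens iodata (arbitrarily nested lists of binaries and bytes)
  # into one contiguous binary.
  def to_binary(iodata), do: IO.iodata_to_binary(iodata)
end

# Hypothetical fragment standing in for the nested output of Plot.plot/1
fragment = ["<svg>", ["<circle r=\"", "5", "\"/>"], "</svg>"]

IodataSketch.to_binary(fragment)
```

Libraries often build output as iolists because appending to a nested list avoids copying strings until the final flattening step.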
Let's get some inputs.

x =
  Kino.Input.select("X Variable",
    wt: "wt",
    mpg: "mpg",
    qsec: "qsec",
    disp: "disp"
  )
  |> Kino.render()
  |> Kino.Input.read()

y =
  Kino.Input.select("Y Variable",
    wt: "wt",
    mpg: "mpg",
    qsec: "qsec",
    disp: "disp"
  )
  |> Kino.render()
  |> Kino.Input.read()

color =
  Kino.Input.select("Color Variable",
    cyl: "cyl",
    am: "am",
    gear: "gear"
  )
  |> Kino.render()
  |> Kino.Input.read()

palette =
  Kino.Input.select("Color Palette",
    viridis: "viridis",
    plasma: "plasma",
    magma: "magma",
    inferno: "inferno",
    cividis: "cividis"
  )
  |> Kino.render()
  |> Kino.Input.read()

title =
  Kino.Input.text("Plot Title", default: "Motor Trend")
  |> Kino.render()
  |> Kino.Input.read()

Kino.nothing()

mtcars_scatterplot = scatterplot.(x, y, color, palette, title)
render.(mtcars_scatterplot)
Now we will change the plot formatting to a lighter theme.
import GGity.Element.{Line, Rect, Text}

theme = [
  axis_line: element_line(color: "gray", size: 0.25),
  legend_key: element_rect(fill: "white"),
  panel_background: element_rect(fill: "white"),
  panel_grid: element_line(color: "lightgray"),
  panel_grid_major: element_line(size: 0.5)
]

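# Kino.render/1 displays the themed plot right away; without it, only the
# cell's final expression (the original plot below) would be shown.
mtcars_scatterplot
|> Plot.theme(theme)
|> render.()
|> Kino.render()

# Render the original, unthemed plot again for comparison.
render.(mtcars_scatterplot)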
Bar Chart

Continuing with the cars theme, now we will explore the mpg dataset with
a bar chart. This dataset describes city/highway mileage data for 235
makes and models of vehicles.
Let's look at the data.
data = GGity.Examples.mpg()
Kino.DataTable.new(data)
Now we will create our bar chart; this one won't be configurable, but
the process for making it so would be comparable to our approach in the
scatterplot above.
In this example, we will plot the number of models in each class by
manufacturer.
data
|> Enum.filter(fn record ->
  record["manufacturer"] in [
    "chevrolet",
    "audi",
    "ford",
    "nissan",
    "subaru",
    "toyota"
  ]
end)
|> Plot.new(%{x: "manufacturer"})
|> Plot.geom_bar(%{fill: "class"})
|> Plot.scale_y_continuous(labels: &floor/1)
|> Plot.labs(
  title: "Product Line Analysis",
  y: "Number of Models",
  x: "Manufacturer",
  fill: "Vehicle Class"
)
|> render.()
Audi, why do you hate big cars so much?
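Since we only mapped an x variable and a fill variable, the bar heights above are presumably row counts for each manufacturer/class pair. As a rough sketch of that tally in plain Elixir (the rows below are hypothetical, not taken from the mpg dataset, and `BarCountSketch` is a made-up module name):

```elixir
defmodule BarCountSketch do
  # Counts rows for each {x value, fill value} pair, which is the
  # tally a stacked count bar chart displays.
  def counts(rows, x_key, fill_key) do
    Enum.frequencies_by(rows, fn row -> {row[x_key], row[fill_key]} end)
  end
end

# Hypothetical rows for illustration only
rows = [
  %{"manufacturer" => "audi", "class" => "compact"},
  %{"manufacturer" => "audi", "class" => "compact"},
  %{"manufacturer" => "audi", "class" => "midsize"},
  %{"manufacturer" => "ford", "class" => "pickup"}
]

BarCountSketch.counts(rows, "manufacturer", "class")
```

Running the same function over the filtered mpg records is one way to sanity-check the numbers behind the chart.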
Boxplots

Using the same data, let's draw some boxplots. In this example we will use
static colors for some of the elements (instead of black, the default).
Here we will plot the distribution of highway mileage by vehicle class.
data
|> Plot.new(%{x: "class", y: "hwy"})
|> Plot.geom_boxplot(fill: "white", color: "#3366FF")
|> render.()
While the relative medians across classes are no surprise, it
is interesting to compare the medians of subcompacts to compacts
and midsize vehicles - not as different as one might assume, with
a wide variety of highway mileage across the subcompact class.
Of course, the blue outline and white fill are great too.
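If we want the numbers behind the median comparison, a small sketch like the following computes a median per group in plain Elixir. `MedianSketch` and the sample rows are hypothetical; only the "class" and "hwy" key names come from the plot above.

```elixir
defmodule MedianSketch do
  # Median of a non-empty list of numbers.
  def median(values) do
    sorted = Enum.sort(values)
    count = length(sorted)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      # Odd count: the middle element
      Enum.at(sorted, mid)
    else
      # Even count: the mean of the two middle elements
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Groups rows by group_key and returns a map of group => median of value_key.
  def median_by(rows, group_key, value_key) do
    rows
    |> Enum.group_by(& &1[group_key], & &1[value_key])
    |> Map.new(fn {group, values} -> {group, median(values)} end)
  end
end

# Hypothetical rows for illustration only
rows = [
  %{"class" => "compact", "hwy" => 29},
  %{"class" => "compact", "hwy" => 26},
  %{"class" => "compact", "hwy" => 31},
  %{"class" => "suv", "hwy" => 17},
  %{"class" => "suv", "hwy" => 20}
]

MedianSketch.median_by(rows, "class", "hwy")
```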
Line Chart

For our line chart example, we will use the economics dataset.
This data describes certain economic indicators over the past several decades.
data = GGity.Examples.economics()
Kino.DataTable.new(data)
It is easy for us to plot a line for one of these variables, say, unemployment.
data
|> Plot.new(%{x: "date", y: "unemploy"})
|> Plot.geom_line()
|> render.()
Quick tangent - note the y-axis labels. The default Elixir format for printed
floats is rarely satisfying for large numbers. We can fix that, and make the
line dotted and purple while we are at it.
plot =
  data
  |> Plot.new(%{x: "date", y: "unemploy"})
  |> Plot.geom_line(color: "purple", linetype: :dotted)
  |> Plot.scale_y_continuous(labels: :commas)
  |> render.()
Here we used the name of a built-in labeling function, :commas (passed
to Plot.scale_y_continuous/2), but any function that takes the label
value as an argument and returns the desired label text will work.
to_thous = fn value -> "#{round(value / 1000)} thou" end

plot =
  data
  |> Plot.new(%{x: "date", y: "unemploy"})
  |> Plot.geom_line(color: "limegreen", linetype: :dotted)
  |> Plot.scale_y_continuous(labels: to_thous)
  |> render.()
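A label function can also live in a module, which makes it easier to test on its own. The sketch below is a hand-rolled thousands-separator formatter; `LabelSketch` is a hypothetical name and this is not how GGity implements :commas. Based on the description above, passing `&LabelSketch.with_commas/1` as the `labels:` option should work, though that is an assumption.

```elixir
defmodule LabelSketch do
  # Rounds the value to an integer and inserts commas as thousands separators.
  def with_commas(value) do
    value
    |> round()
    |> Integer.to_string()
    # Work from the right: reverse, add a comma after every three digits
    # that are followed by another digit, then reverse back.
    |> String.reverse()
    |> String.replace(~r/(\d{3})(?=\d)/, "\\1,")
    |> String.reverse()
  end
end

Enum.map([999, 12345.6, 1_000_000], &LabelSketch.with_commas/1)
```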
Back to real work - there are a bunch of variables in this time series data;
what if we want to see how each of them moved over time on the same plot?
Enter the economics_long dataset, which normalizes each observation to
a value between zero and one, and presents that number in the value01 variable.
The name of the variable is stored in the... variable variable.
Take a look:
data = GGity.Examples.economics_long()
Kino.DataTable.new(data)
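To make the shape of this dataset concrete, here is a rough sketch of how wide rows could be reshaped into long format, with each variable rescaled to the zero-to-one range using min-max scaling. This is an assumption about how value01 is derived, not GGity's or R's actual code; `LongFormatSketch` and the sample rows are hypothetical.

```elixir
defmodule LongFormatSketch do
  # Reshapes wide rows into one row per {id, variable} pair and adds a
  # "value01" column, rescaled per variable with (value - min) / (max - min).
  # Assumes each variable takes at least two distinct values; otherwise the
  # division would fail.
  def to_long(rows, id_key, variables) do
    Enum.flat_map(variables, fn variable ->
      {min, max} = rows |> Enum.map(& &1[variable]) |> Enum.min_max()

      Enum.map(rows, fn row ->
        %{
          id_key => row[id_key],
          "variable" => variable,
          "value" => row[variable],
          "value01" => (row[variable] - min) / (max - min)
        }
      end)
    end)
  end
end

# Hypothetical wide rows for illustration only
wide = [
  %{"date" => ~D[2000-01-01], "pop" => 100, "unemploy" => 10},
  %{"date" => ~D[2000-02-01], "pop" => 150, "unemploy" => 30},
  %{"date" => ~D[2000-03-01], "pop" => 200, "unemploy" => 20}
]

LongFormatSketch.to_long(wide, "date", ["pop", "unemploy"])
```

Rescaling per variable is what lets series with very different magnitudes share one y axis.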
This data shape allows us to assign a chart aesthetic (in this example,
the color of the line) to the variable variable, and GGity will group
those observations automatically.
data
|> Plot.new(%{x: "date", y: "value01", color: "variable"})
|> Plot.geom_line()
|> render.()
We could use linetype instead of color if we desire, although in
this case it does not seem desirable.
Since we are making the plot uglier in that regard, let's use custom date
formatting to at least simplify the x axis.
data
|> Plot.new(%{x: "date", y: "value01", linetype: "variable"})
|> Plot.geom_line()
|> Plot.scale_x_date(date_labels: "%Y")
|> render.()
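The "%Y" pattern above is a strftime-style format string. To preview what a pattern produces, we can try it with the standard library's `Calendar.strftime/2` (Elixir 1.11+), assuming GGity interprets the directives the same way. `DateLabelSketch` is a hypothetical module name, and the dates are arbitrary.

```elixir
defmodule DateLabelSketch do
  # Formats each date with a strftime-style pattern such as "%Y" or "%b %Y".
  def labels(dates, pattern) do
    Enum.map(dates, &Calendar.strftime(&1, pattern))
  end
end

# Arbitrary sample dates for illustration only
dates = [~D[1967-07-01], ~D[1990-01-01], ~D[2015-04-01]]

DateLabelSketch.labels(dates, "%Y")
DateLabelSketch.labels(dates, "%b %Y")
```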