Skip to content

Instantly share code, notes, and snippets.

@dmbates
Created May 3, 2022 18:04
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dmbates/5b99dcbc3c3cc19b19a7c09a315e1fe3 to your computer and use it in GitHub Desktop.
Save dmbates/5b99dcbc3c3cc19b19a7c09a315e1fe3 to your computer and use it in GitHub Desktop.
Notebook related to MixedModels.jl issue 611
Display the source blob
Display the rendered blob
Raw
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
---
title: "Issue611"
author: "Douglas Bates"
jupyter: julia-1.7
date: today
date-format: iso
execute:
keep-ipynb: true
---
## Load packages and data
```{julia}
using CSV, MixedModels, ProgressMeter, Tables
ProgressMeter.ijulia_behavior(:clear);
```
Recent developments in Julia make it convenient to name the weights vector `wts`, which is the name of the named argument in the `LinearMixedModel` constructor,
```{julia}
dat = CSV.read("./data/model_ready_data_small.csv", columntable)
wts = CSV.read("./data/weights.csv", columntable).weight_vector
```
Also, assign a `contrasts` specification of `Grouping` to the grouping factors
```{julia}
contrasts = Dict(
:brand_category_channel_type => Grouping(),
:channel_type_sub_brand => Grouping(),
);
```
With the `Grouping` contrast there is no need to change from an Integer to a Categorical type.
In general I would recommend prepending a character to the number in the CSV file so that the column is read as a String or a `PooledArray`, which is like a Categorical array.
For example,
```{julia}
brand = string.('B', lpad.(dat.brand_category_channel_type, 2, '0'))
```
Recently I do any permanent storage of data frames as arrow files (see [Arrow.jl](https://github.com/apache/arrow-julia)) because they retain the metadata and you don't need to go through the conversion from CSV to an internal representation followed by transformations.
## Initial model fits
Before going to vector-valued random effects I would try just random intercepts to see what kinds of fits they produce.
I write this now as a let block so that I can define the formula but it doesn't clutter up the namespace.
```{julia}
m1 = let
form = @formula(
volume ~ 1 + ppi + base_price +
(1|channel_type_sub_brand)
)
fit(MixedModel, form, dat; contrasts, wts)
end
```
As you can see, it converges on the boundary with a variance component of zero.
```{julia}
println(m1)
```
```{julia}
m2 = let
form = @formula(
volume ~ 1 + ppi + base_price +
zerocorr(1 + ppi | channel_type_sub_brand)
)
fit(MixedModel, form, dat; contrasts, wts)
end
```
```{julia}
m2a = let
form = @formula(
volume ~ 1 + ppi + base_price +
(0 + ppi | channel_type_sub_brand)
)
fit(MixedModel, form, dat; contrasts, wts)
end
println(m2a)
```
```{julia}
println(m2)
```
```{julia}
m3 = let
form = @formula(
volume ~ 1 + ppi + base_price +
(1 | channel_type_sub_brand) +
(1 | brand_category_channel_type)
)
fit(MixedModel, form, dat; contrasts, wts)
end
```
```{julia}
m4 = let
form = @formula(
volume ~ 1 + ppi + base_price +
(1 | channel_type_sub_brand) +
zerocorr(1 + ppi | brand_category_channel_type)
)
LinearMixedModel(form, dat; contrasts, wts)
end
try
updateL!(m4)
catch
@warn "Cholesky factorization failed"
end
```
```{julia}
m5 = let
form = @formula(
volume ~ 1 + ppi + base_price +
(0 + ppi | channel_type_sub_brand) +
(0 + ppi | brand_category_channel_type)
)
fit(MixedModel, form, dat; contrasts, wts)
end
```
```{julia}
println(m5)
```
So the model seems to come down to a random effect for the slope w.r.t. `ppi` by `channel_type_sub_brand`, model `m2a`, and not much else.
[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment