This does expand the scope of seaborn away from pretty-much-only tidy data.
Many functions do handle wide-form data (see the "Current Behavior" section), but that's not widely (heh) appreciated. Partly that is because the handling is a little bit idiosyncratic across the library. And any function that integrates with FacetGrid currently requires long-form data; fixing that is part of this refactor.
The reason that long-form data is preferred is that the mapping from variables to semantics is very explicit and predictable. The main goal here is to make the implicit mappings that you get with wide-form data formal, so they can be more predictable.
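To make the wide/long distinction concrete, here is a minimal sketch using pandas. The table, its column names, and the output names ("step", "variable", "value") are made up for illustration; this is not the mapping seaborn itself applies, only one way to spell out what a wide-form table leaves implicit.

```python
import pandas as pd

# Hypothetical wide-form table: one column per measured variable,
# with the index holding a shared coordinate.
wide = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
    index=pd.Index([0, 1, 2], name="step"),
)

# Long-form equivalent: each implicit role becomes a named column.
# index -> "step", column names -> "variable", cell values -> "value"
long = wide.reset_index().melt(
    id_vars="step", var_name="variable", value_name="value"
)

print(long)
```

In the long-form table every variable has a name, so a mapping such as x="step", y="value", hue="variable" can be stated explicitly rather than inferred from the table's shape.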
Also, seaborn is moving to keyword-only arguments, but I am increasingly leaning towards the generic function signature being func(data, *, ...) so that func(data) does something useful for almost any data structure one might have at hand.
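As a rough illustration of that signature shape, the sketch below uses a hypothetical func with made-up semantic parameters (x, y, hue). It only records its arguments; it shows how the bare * in Python makes everything after data keyword-only, and says nothing about how seaborn would implement it.

```python
def func(data=None, *, x=None, y=None, hue=None):
    """Hypothetical plotting-style function: only `data` may be passed positionally."""
    return {"data": data, "x": x, "y": y, "hue": hue}


# Passing just the data positionally is allowed.
spec = func([1, 2, 3])

# Semantics must be named explicitly.
named = func([1, 2, 3], hue="group")

# A second positional argument is rejected by Python itself.
try:
    func([1, 2, 3], "group")
    rejected = False
except TypeError:
    rejected = True

print(spec, named, rejected)
```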
A big open question is whether to allow mixing of wide-form data inputs and explicit semantics, e.g. sns.boxplot(data=iris, hue="species"). On the one hand, that seems fairly handy. On the other hand, it starts to blur the distinction between wide/long data in a way that could breed confusion.
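For comparison, this is roughly what the fully explicit long-form route looks like today. The sketch uses a tiny stand-in table with illustrative values, not the real iris dataset, and the names "measurement" and "cm" are assumptions chosen for the example.

```python
import pandas as pd

# Small stand-in for the iris table (values are illustrative only).
iris_like = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3],
        "petal_length": [1.4, 4.7, 6.0],
        "species": ["setosa", "versicolor", "virginica"],
    }
)

# Keep "species" as an identifier and stack the numeric columns,
# so each row is one (species, measurement, value) observation.
long_iris = iris_like.melt(
    id_vars="species", var_name="measurement", value_name="cm"
)

print(long_iris)
```

With that table, a call like sns.boxplot(data=long_iris, x="measurement", y="cm", hue="species") names every mapping, which is the step the mixed wide-form call would be doing implicitly.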
This approach looks good and makes a lot of sense to me. I think it'd make a great addition to seaborn.
This does expand the scope of seaborn away from pretty-much-only tidy data. Is the hope to reduce the support burden from users who actually just need help tidying their data?