@vankesteren
Created June 20, 2017 08:38
Proposal for the split functionality

Split functionality in Descriptives

Introduction

Goal

We want to allow users to view descriptives split by a certain factor, such as gender, experimental condition, or age group. This will aid general data exploration and analysis.

Thoughts

The boxplots are a starting point: they already offer a "split by" option. This option should move to the main variables widget and act on all descriptives in a given analysis.

The split functionality should not interfere or overlap with the filter functionality that is being created as a data editing feature. Splitting is not a data editing feature, and it can even complement filtering: before filtering on a certain variable, you will often want to check the descriptives of the level that you are filtering out.

This project is a good opportunity to also refactor the code a bit so that it will be easier to understand for programmers looking at it for the first time. This means there will be adequate handling of the state system (currently not in place for the corrplot), informative comments, and general alignment of the structure with other analyses.

Proposal

The proposed solution for each component of the descriptives analysis is described below.

Top widget

Move the "split by" box from the boxplot options to the top widget and rename it to "split".

Descriptive statistics table

Upon split, add overTitles to the table, similar to how confidence interval titles are usually made. Each overTitle will contain the variable name, and the column titles beneath it will be the levels of the factor upon which the split is made.
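
As a rough illustration of this layout, the sketch below groups a variable by the split factor and builds one column definition per level, with the variable name as the shared overTitle. It is written in Python for readability only. The function names (`split_descriptives`, `column_schema`) and the dictionary keys other than `overTitle` are hypothetical and do not reflect the actual analysis code.

```python
from statistics import mean, stdev


def split_descriptives(values, split):
    """Group values by split level and summarise each group (hypothetical helper)."""
    groups = {}
    for value, level in zip(values, split):
        groups.setdefault(level, []).append(value)
    return {
        level: {
            "n": len(vals),
            "mean": mean(vals),
            # The standard deviation is undefined for a single observation.
            "sd": stdev(vals) if len(vals) > 1 else float("nan"),
        }
        for level, vals in groups.items()
    }


def column_schema(variable, levels):
    """One column per level; the variable name is the shared overTitle."""
    return [
        {"name": f"{variable}_{level}", "title": str(level), "overTitle": variable}
        for level in levels
    ]


# Made-up example data.
scores = [4.0, 6.0, 5.0, 7.0]
gender = ["f", "m", "f", "m"]

summary = split_descriptives(scores, gender)
columns = column_schema("score", summary.keys())
```

Here `columns` holds two entries that share the overTitle "score" and are titled "f" and "m", mirroring how a lower and upper bound share a confidence interval overTitle.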

Frequencies table

Upon split, add a column at the front of each table containing the levels of the split factor:

Frequencies for variableOfInterest

splitFactor   Value   Frequency   ..
Level A       0       51          ..
              1       49          ..
              2       23          ..
Level B       0       22          ..
              1       44          ..
              2       33          ..
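
A minimal sketch of how such rows could be produced, assuming the level label is shown only on the first row of each group as in the table above. The function name `split_frequency_rows` and the example data are hypothetical. Python is used purely for illustration.

```python
from collections import Counter


def split_frequency_rows(values, split):
    """Build frequency rows per split level (hypothetical helper)."""
    counts = {}
    for value, level in zip(values, split):
        counts.setdefault(level, Counter())[value] += 1

    rows = []
    for level in sorted(counts):
        for i, value in enumerate(sorted(counts[level])):
            rows.append({
                # Show the level label only on the first row of its group.
                "splitFactor": level if i == 0 else "",
                "Value": value,
                "Frequency": counts[level][value],
            })
    return rows


# Made-up example data.
values = [0, 1, 1, 2, 0, 0]
split = ["Level A", "Level A", "Level A", "Level B", "Level B", "Level B"]

rows = split_frequency_rows(values, split)
for row in rows:
    print(f'{row["splitFactor"]:<12}{row["Value"]:<6}{row["Frequency"]}')
```

Leaving the label blank on follow-up rows keeps the grouping visible without repeating the level name on every line.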

Distribution plots

Upon split, create multiple plots for each variable, one for each level of the split factor.
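
The sketch below shows one way to think about this: every (variable, level) pair yields one plot specification containing only the observations for that level. The function `split_plot_specs`, the spec keys, and the title format are hypothetical. The actual drawing step is left out.

```python
def split_plot_specs(data, variables, split_name):
    """Return one plot spec per variable and split level (hypothetical helper)."""
    levels = sorted(set(data[split_name]))
    specs = []
    for variable in variables:
        for level in levels:
            # Keep only the observations that belong to this level.
            subset = [
                value
                for value, lvl in zip(data[variable], data[split_name])
                if lvl == level
            ]
            specs.append({
                "title": f"{variable} ({split_name} = {level})",
                "values": subset,
            })
    return specs


# Made-up example data.
data = {
    "age": [21, 34, 28, 45],
    "condition": ["control", "treatment", "control", "treatment"],
}

specs = split_plot_specs(data, ["age"], "condition")
```

With one variable and two levels this gives two specs. With several variables, the number of plots is the number of variables times the number of levels.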

Correlation plots

Upon split, create multiple plots, one for each level of the split factor.
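
To illustrate why per-level correlation plots can be informative, the sketch below computes a Pearson correlation separately for each level of the split factor. It assumes the plot reports a Pearson coefficient, which the proposal does not state. The helper names and the data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_correlations(x, y, split):
    """Compute one correlation per split level (hypothetical helper)."""
    groups = {}
    for a, b, level in zip(x, y, split):
        xs, ys = groups.setdefault(level, ([], []))
        xs.append(a)
        ys.append(b)
    return {level: pearson(xs, ys) for level, (xs, ys) in groups.items()}


# Made-up data in which the two levels have opposite relationships.
x = [1, 2, 3, 1, 2, 3]
y = [2, 4, 6, 6, 4, 2]
split = ["A", "A", "A", "B", "B", "B"]

per_level = split_correlations(x, y, split)
overall = pearson(x, y)
```

In this made-up data the two levels have opposite relationships that cancel out in the pooled data, which is the kind of pattern a split makes visible.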

Boxplots

Nothing needs to be changed for the boxplots.
