@johnmyleswhite
Last active August 29, 2015 14:06
0.4 Roadmap for the Tabular Data Ecosystem
  • Replace NAtype with Nullable{T}
    • This will substantially change the semantics of missing data
      • You will have to fully specify how missing values are handled
      • You cannot put off thinking about missing values until the first time you hit one
      • Your code will become type-stable and will run much faster (~100x)
      • The name NA will be replaced by NULL
        • This implies isna will become isnull
      • There will be no more attempts to define functions over NA
        • This includes Booleans: there will be no more three-valued logic
  • Replace DataArray with NullableArray
    • This will change the semantics of scalar access to produce Nullable objects
  • Replace PooledDataArray with CategoricalArray and OrdinalArray
    • This will make clear that these data structures are the Julia form of R's factors
    • This will change the semantics of scalar access to produce Nullable{CategoricalVariable} and Nullable{OrdinalVariable}
  • Standardize DataFrames by formalizing the API and finalizing core primitives
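
The shift in missing-data semantics described above (values wrapped in a container that must be unwrapped explicitly, rather than an NA that functions try to propagate) can be sketched in a few lines. This is a small Python model of the idea, not the Julia API: the `Nullable` class and the `some`, `null`, `isnull`, and `get` helpers are hypothetical stand-ins named after the roadmap's terms, and the error raised on unwrapping a null is an assumption about how strict access might behave.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

# Sentinel so that get(x, None) can be told apart from get(x).
_NO_DEFAULT = object()


@dataclass(frozen=True)
class Nullable(Generic[T]):
    """Hypothetical model of a value that may be missing."""
    value: Optional[T] = None
    has_value: bool = False


def some(value: T) -> Nullable[T]:
    """Wrap a present value."""
    return Nullable(value, True)


def null() -> Nullable:
    """Build a missing value."""
    return Nullable()


def isnull(x: Nullable) -> bool:
    """Report whether the container holds no value."""
    return not x.has_value


def get(x: Nullable, default=_NO_DEFAULT):
    """Unwrap x, falling back to default if one is supplied.

    Without a default, unwrapping a null raises: the caller has to
    decide up front how missing values are handled.
    """
    if x.has_value:
        return x.value
    if default is _NO_DEFAULT:
        raise ValueError("attempted to unwrap a null value")
    return default


# Scalar access to a nullable column yields Nullable objects, not raw values.
column = [some(1), null(), some(3)]

# Policy 1: skip missing entries explicitly.
total_skipping = sum(get(x) for x in column if not isnull(x))

# Policy 2: substitute a default for missing entries.
total_with_default = sum(get(x, 0) for x in column)
```

In this model there is no `null + 1` and no three-valued Boolean logic: every computation either checks `isnull` first or supplies a default, which is the "fully specify how missing values are handled" requirement in the list above.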
@nalimilan

Sounds great, except:

This will change the semantics of scalar access to produce Nullable{CategoricalVariable} and Nullable{OrdinalVariable}

As I said elsewhere, calling them CategoricalValue and OrdinalValue would be much clearer. :-)

Finally, I'm not sure exactly what you mean by the last point.

@prcastro

Shouldn't this roadmap live in the DataFrames repo (in a file or on the wiki), with an issue opened for each item here and attached to a milestone?