0.4 Roadmap for the Tabular Data Ecosystem

johnmyleswhite/ad5305ecaa9de01e317e, last active August 29, 2015 14:06

- Replace `NAtype` with `Nullable{T}`
  - This will substantially change the semantics of missing data
    - You will have to fully specify how missing values are handled
    - You can no longer put off thinking about missing values until the first time you hit one
    - Your code will become type-stable and will run much faster (~100x)
  - The name `NA` will be replaced by `NULL`
    - This implies `isna` will become `isnull`
  - There will be no more attempts to define functions over `NA`
    - This includes Booleans: there will be no more three-valued logic
- Replace `DataArray` with `NullableArray`
  - This will change the semantics of scalar access to produce `Nullable` objects
- Replace `PooledDataArray` with `CategoricalArray` and `OrdinalArray`
  - This will make clear that these data structures are the Julia form of R's factors
  - This will change the semantics of scalar access to produce `Nullable{CategoricalVariable}` and `Nullable{OrdinalVariable}`
- Standardize DataFrames by formalizing API and finalizing core primitives
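To make the first roadmap item concrete, here is a minimal sketch of what explicit handling of missing values could look like with `Nullable{T}`. It assumes Julia 0.4 to 0.6, where `Nullable`, `isnull`, and `get` are defined in Base (later versions moved them to the Nullables.jl package). The helper `add_or_null` is hypothetical and only illustrates that the null case has to be written out by the caller; it is not part of any package.

```julia
# Assumes Julia 0.4 to 0.6, where Nullable{T} is defined in Base.

x = Nullable(3)          # a Nullable{Int} holding a value
y = Nullable{Int}()      # a Nullable{Int} holding no value

# Missingness is queried with isnull rather than isna.
isnull(x)                # false
isnull(y)                # true

# The caller states what happens when the value is missing.
get(x, 0)                # 3
get(y, 0)                # 0, the default supplied by the caller

# Hypothetical helper: nothing propagates nulls automatically,
# so the null case is spelled out explicitly.
function add_or_null(a::Nullable{Int}, b::Nullable{Int})
    if isnull(a) || isnull(b)
        return Nullable{Int}()
    end
    return Nullable(get(a) + get(b))
end

z = add_or_null(x, y)
isnull(z)                # true
```

Because every value in this sketch has the concrete type `Nullable{Int}`, the return type of `add_or_null` does not depend on whether a value is missing, which is the type-stability property the roadmap refers to.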
Shouldn't this roadmap be inside DataFrame repo (on a file or on the Wiki), and issues be opened for all items here (and attached with a milestone)?
Sounds great, except:

As I said elsewhere, calling them `CategoricalValue` and `OrdinalValue` would be much clearer. :-)

Finally, I'm not sure what you mean exactly with the last point.