@sean-mcclure
Last active November 7, 2021 00:23
| Configuration Name | Description | Operators |
| --- | --- | --- |
| Default TPOT | TPOT will search over a broad range of preprocessors, feature constructors, feature selectors, models, and parameters to find a series of operators that minimize the error of the model predictions. Some of these operators are complex and may take a long time to run, especially on larger datasets. Note: This is the default configuration for TPOT. To use this configuration, use the default value (None) for the config_dict parameter. | Classification, Regression |
| TPOT light | TPOT will search over a restricted range of preprocessors, feature constructors, feature selectors, models, and parameters to find a series of operators that minimize the error of the model predictions. Only simpler and fast-running operators will be used in these pipelines, so TPOT light is useful for finding quick and simple pipelines for a classification or regression problem. This configuration works for both the TPOTClassifier and TPOTRegressor. | Classification, Regression |
| TPOT MDR | TPOT will search over a series of feature selectors and Multifactor Dimensionality Reduction models to find a series of operators that maximize prediction accuracy. The TPOT MDR configuration is specialized for genome-wide association studies (GWAS), and is described in detail online here. Note that TPOT MDR may be slow to run because the feature selection routines are computationally expensive, especially on large datasets. | Classification, Regression |
| TPOT sparse | TPOT uses a configuration dictionary with a one-hot encoder and the operators normally included in TPOT that also support sparse matrices. This configuration works for both the TPOTClassifier and TPOTRegressor. | Classification, Regression |
| TPOT NN | TPOT uses the same configuration as "Default TPOT" plus additional neural network estimators written in PyTorch (currently only `tpot.builtins.PytorchLRClassifier` and `tpot.builtins.PytorchMLPClassifier`). Currently only classification is supported, but future releases will include regression estimators. | Classification |
| TPOT cuML | TPOT will search over a restricted configuration using the GPU-accelerated estimators in RAPIDS cuML and DMLC XGBoost. This configuration requires an NVIDIA Pascal architecture or better GPU with compute capability 6.0+, and that the library cuML is installed. With this configuration, all model training and predicting will be GPU-accelerated. This configuration is particularly useful for medium-sized and larger datasets on which CPU-based estimators are a common bottleneck, and works for both the TPOTClassifier and TPOTRegressor. | Classification, Regression |
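
The configurations listed above are selected through the `config_dict` parameter. The following is a minimal sketch of how that choice might be wired up, assuming the classic TPOT 0.x API, in which `config_dict` accepts either `None` (Default TPOT) or the configuration name as a string. The names `CONFIGS`, `config_dict_for`, and `build_classifier` are hypothetical helpers introduced only for illustration, and the `generations`, `population_size`, and `random_state` values are arbitrary.

```python
from typing import Optional

# Map each configuration name to (value passed as `config_dict`, supported tasks).
# None selects Default TPOT; the other values are assumed to be the built-in
# configuration names accepted as strings by classic TPOT 0.x.
CONFIGS = {
    "Default TPOT": (None, {"classification", "regression"}),
    "TPOT light": ("TPOT light", {"classification", "regression"}),
    "TPOT MDR": ("TPOT MDR", {"classification", "regression"}),
    "TPOT sparse": ("TPOT sparse", {"classification", "regression"}),
    "TPOT NN": ("TPOT NN", {"classification"}),  # classification only
    "TPOT cuML": ("TPOT cuML", {"classification", "regression"}),
}


def config_dict_for(name: str, task: str) -> Optional[str]:
    """Return the `config_dict` value for a configuration (hypothetical helper)."""
    value, tasks = CONFIGS[name]
    if task not in tasks:
        raise ValueError(f"{name} does not support {task}")
    return value


def build_classifier(name: str = "TPOT light"):
    """Create an unfitted TPOTClassifier that uses the named configuration."""
    # Imported lazily so the lookup helper above works without TPOT installed.
    from tpot import TPOTClassifier

    return TPOTClassifier(
        generations=5,        # arbitrary small search budget for illustration
        population_size=20,   # arbitrary
        config_dict=config_dict_for(name, "classification"),
        random_state=42,
        verbosity=2,
    )
```

With TPOT installed, something like `build_classifier("TPOT light").fit(X_train, y_train)` would then run the restricted search, where `X_train` and `y_train` stand in for your own training data.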