@jrzaurin
Last active June 13, 2021 09:19
Results for the NYC Taxi ride duration dataset with TabTransformer

| embed_dropout | full_embed_dropout | shared_embed | add_shared_embed | frac_shared_embed | input_dim | n_heads | n_blocks | dropout | ff_hidden_dim | transformer_activation | mlp_hidden_dims | mlp_activation | mlp_batchnorm | mlp_batchnorm_last | mlp_linear_first | with_wide | lr | batch_size | weight_decay | optimizer | lr_scheduler | base_lr | max_lr | div_factor | final_div_factor | n_cycles | val_loss_or_metric |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | False | False | False | 8 | 16 | 4 | 4 | 0.1 | | relu | None | relu | False | False | False | False | 0.01 | 1024 | 0.0 | Adam | ReduceLROnPlateau | 0.001 | 0.01 | 25 | 10000.0 | 5 | 180162.4086 |
| 0.0 | False | False | False | 8 | 16 | 4 | 4 | 0.1 | | relu | None | relu | False | False | False | False | 0.01 | 256 | 0.0 | Adam | ReduceLROnPlateau | 0.001 | 0.01 | 25 | 10000.0 | 5 | 186017.1888 |
| 0.0 | False | False | False | 8 | 16 | 4 | 4 | 0.1 | | relu | None | relu | False | False | False | False | 0.01 | 512 | 0.0 | Adam | ReduceLROnPlateau | 0.001 | 0.01 | 25 | 10000.0 | 5 | 196144.0674 |
| 0.0 | False | False | False | 8 | 32 | 8 | 4 | 0.4 | | relu | None | relu | False | False | False | False | 0.01 | 1024 | 0.0 | Adam | ReduceLROnPlateau | 0.001 | 0.01 | 25 | 10000.0 | 5 | 357869.3703 |
| 0.0 | False | False | False | 8 | 64 | 16 | 4 | 0.4 | | relu | None | relu | False | False | False | False | 0.01 | 512 | 0.0 | Adam | ReduceLROnPlateau | 0.001 | 0.01 | 25 | 10000.0 | 5 | 357884.9043 |
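
To compare the runs more directly, the sketch below ranks them and reports how far each sits from the best one. It is only an illustration, not the gist author's code: the helper names `rank_runs` and `relative_gaps` are hypothetical, it copies only the columns that vary across the five rows above, and it assumes that a lower `val_loss_or_metric` is better.

```python
# Columns that vary across the five runs, copied from the table above.
runs = [
    {"input_dim": 16, "n_heads": 4, "dropout": 0.1, "batch_size": 1024, "val_loss_or_metric": 180162.4086},
    {"input_dim": 16, "n_heads": 4, "dropout": 0.1, "batch_size": 256, "val_loss_or_metric": 186017.1888},
    {"input_dim": 16, "n_heads": 4, "dropout": 0.1, "batch_size": 512, "val_loss_or_metric": 196144.0674},
    {"input_dim": 32, "n_heads": 8, "dropout": 0.4, "batch_size": 1024, "val_loss_or_metric": 357869.3703},
    {"input_dim": 64, "n_heads": 16, "dropout": 0.4, "batch_size": 512, "val_loss_or_metric": 357884.9043},
]


def rank_runs(rows, key="val_loss_or_metric"):
    """Return the runs sorted from best to worst (assumption: lower is better)."""
    return sorted(rows, key=lambda row: row[key])


def relative_gaps(rows, key="val_loss_or_metric"):
    """Pair each ranked run with its fractional distance from the best run."""
    ranked = rank_runs(rows, key)
    best = ranked[0][key]
    return [(row, row[key] / best - 1.0) for row in ranked]


if __name__ == "__main__":
    for row, gap in relative_gaps(runs):
        print(
            f"input_dim={row['input_dim']:>2} n_heads={row['n_heads']:>2} "
            f"dropout={row['dropout']} batch_size={row['batch_size']:>4} "
            f"val={row['val_loss_or_metric']:.4f} (+{gap:.1%} vs best)"
        )
```

Under that assumption, the three runs with `input_dim=16`, `n_heads=4` and `dropout=0.1` come first, and the two larger configurations with `dropout=0.4` end up at roughly twice the validation value of the best run.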