@p16i
Last active April 6, 2020 13:52

UKB Training and Hyperparameter Optimization

# Step 1: Splitting Training and Test Sets
training_set, test_set = train_test_split(UKB, test_size=0.3, stratify=True, random_state=42)
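
In scikit-learn, Step 1 would look like the sketch below. Note that the real `train_test_split` takes the label array itself for `stratify` (not a boolean), and the seed argument is `random_state`. The synthetic data here is a stand-in for the (unavailable) UKB dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in for the UKB data: 1000 samples, binary labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# 70/30 split; stratify=y keeps the class proportions equal in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
```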

# Step 2 (Outer Loop): Hyperparameter optimization with 20 iterations
for i in range(20):
  hyperparameters = ChooseHyperParameters(...)

  # Step 2.1 (Inner Loop)
  for inloop_training_set, inloop_validation_set in KFoldSplit(training_set, folds=5, random_state=0):
    model = FitLogisticRegression(
      hyperparameters, inloop_training_set
    )
    # Step 2.2: Computing the statistic for each fold
    auc = ComputeAUC(model, inloop_validation_set)

  run_statistic = Average(all aucs from Step 2.1)

# Step 2.3: Choosing the best values of the hyperparameters
best_hyperparameters = hyperparameters from the run i whose run_statistic is the highest
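
A minimal scikit-learn sketch of the outer and inner loops (Steps 2 to 2.3), assuming for illustration that the only hyperparameter being tuned is the logistic-regression regularization strength `C`, sampled at random on a log scale:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Stand-in for the training portion of the UKB data.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

rng = np.random.default_rng(0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

best_C, best_score = None, -np.inf
for i in range(20):                       # Step 2: 20 random candidates
    C = 10 ** rng.uniform(-3, 3)          # ChooseHyperParameters(...)
    aucs = []
    for train_idx, val_idx in cv.split(X, y):  # Step 2.1: inner 5-fold loop
        model = LogisticRegression(C=C, max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        scores = model.predict_proba(X[val_idx])[:, 1]
        aucs.append(roc_auc_score(y[val_idx], scores))  # Step 2.2: per-fold AUC
    run_statistic = np.mean(aucs)
    if run_statistic > best_score:        # Step 2.3: keep the best candidate
        best_C, best_score = C, run_statistic

print(best_C, best_score)
```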

# Step 3: Fitting the final model
final_model = FitLogisticRegression(
  best_hyperparameters, training_set
) 

# Step 4: Computing Final Statistics
final_auc = ComputeAUC(final_model, test_set)
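
Steps 3 and 4 then refit on the entire training set with the chosen hyperparameters and score the model once on the held-out test set. A sketch, with an assumed value standing in for the output of the search in Step 2.3:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in for the UKB data and the Step 1 split.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

best_C = 1.0  # assumed result of the hyperparameter search (Step 2.3)

# Step 3: refit on the whole training set with the chosen hyperparameters.
final_model = LogisticRegression(C=best_C, max_iter=1000)
final_model.fit(X_train, y_train)

# Step 4: the test set is touched exactly once, for the final AUC.
final_auc = roc_auc_score(y_test, final_model.predict_proba(X_test)[:, 1])
print(final_auc)
```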

Remarks

  • Step 1 is done only once. All models use the same training and testing sets;
  • Steps 2-4 are repeated for each model;
  • test_set is used only once (in Step 4).