@specifics
Last active February 25, 2018 21:59
Notes for "Fun with Hyper-Parameter Tuning" presentation on 25-Feb-2018

Fun with Hyper-Parameter Tuning

Presentation by John Burt, Ph.D @ PSU Business Accelerator on 25 Feb 2018 @ 1pm

  • CV = cross-validation

Code Example 1

  • Done mainly using the sklearn library
  • Count vectorizer: each document becomes a vector of word-occurrence counts
  • Text data > Feature Engineering: TfidfVectorizer > Classifier: SGDClassifier > Hyper-parameter tuning: GridSearchCV
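
As a rough illustration of the vectorizing step, here is a minimal sketch using sklearn's CountVectorizer and TfidfVectorizer. The two-document corpus is invented for this example and is not from the talk.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus, invented for illustration only.
docs = ["the cat sat", "the cat sat on the cat"]

# CountVectorizer turns each document into a row of word counts.
count_vec = CountVectorizer()
counts = count_vec.fit_transform(docs)  # sparse matrix: (n_docs, n_vocab)

# vocabulary_ maps each word to its column index; list the words in column order.
vocab = sorted(count_vec.vocabulary_, key=count_vec.vocabulary_.get)
print(vocab)             # ['cat', 'on', 'sat', 'the']
print(counts.toarray())  # [[1 0 1 1], [2 1 1 2]]

# TfidfVectorizer produces the same shape, but re-weights the counts by tf-idf.
tfidf = TfidfVectorizer().fit_transform(docs)
print(tfidf.shape)       # (2, 4)
```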

Process

  1. Default variables
  2. Specify the params for GridSearchCV to check. GridSearchCV takes an estimator, but you can also pass it other things, such as a normalizer or other data pre-processor (for example, as steps of a pipeline)
  3. Run GridSearchCV: pass it the training data and the target data
  4. GridSearchCV then outputs the best score and the best set of params
  5. Generate an accuracy heatmap of X:penalty and Y:number of iterations
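
The steps above could look roughly like the sketch below. This is not the presenter's code: the toy texts, the labels, and the parameter values in the grid are all assumptions made for illustration. Vectorizing once before the search is a simplification, so only the classifier is tuned.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Toy data invented for illustration; the talk used a real text dataset.
texts = ["great movie", "loved this film", "wonderful acting", "really great fun",
         "terrible movie", "hated this film", "awful acting", "really bad plot"] * 3
labels = [1, 1, 1, 1, 0, 0, 0, 0] * 3

# Vectorize once up front; only the classifier is tuned in this sketch.
X = TfidfVectorizer().fit_transform(texts)

# Step 2: the params for the grid search to check (values are assumptions).
param_grid = {
    "penalty": ["l2", "l1", "elasticnet"],
    "max_iter": [5, 20, 100],
}

# Step 3: run the search on training data and targets, with 3-fold CV.
clf = SGDClassifier(random_state=0, tol=None)
search = GridSearchCV(clf, param_grid, cv=3, scoring="accuracy")
search.fit(X, labels)

# Step 4: best cross-validated score and the params that produced it.
print(search.best_score_, search.best_params_)

# Step 5: arrange mean CV accuracy as a grid (rows: max_iter, cols: penalty).
scores = np.zeros((len(param_grid["max_iter"]), len(param_grid["penalty"])))
results = search.cv_results_
for params, score in zip(results["params"], results["mean_test_score"]):
    row = param_grid["max_iter"].index(params["max_iter"])
    col = param_grid["penalty"].index(params["penalty"])
    scores[row, col] = score
print(scores)
```

The `scores` array holds the data for the heatmap; it could be drawn with a plotting library, for example matplotlib's `imshow`.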

Code Example 2

  • A Pipeline is itself an estimator, so it can be passed to GridSearchCV

Process

  1. The pipeline is defined from two objects: TfidfVectorizer and SGDClassifier
  2. Instead of the classifier object, the pipeline is passed into GridSearchCV
  3. The pipeline handles the vectorization itself, so vectorizer params can be tuned together with classifier params
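
A minimal sketch of the pipeline approach, under the same assumptions as before: the toy data, the step names `tfidf` and `clf`, and the grid values are illustrative and not taken from the talk. In sklearn, a parameter of a pipeline step is addressed as `<step name>__<param name>`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data invented for illustration; the talk used a real text dataset.
texts = ["great movie", "loved this film", "wonderful acting", "really great fun",
         "terrible movie", "hated this film", "awful acting", "really bad plot"] * 3
labels = [1, 1, 1, 1, 0, 0, 0, 0] * 3

# Step 1: the pipeline chains the vectorizer and the classifier.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", SGDClassifier(random_state=0, tol=None, max_iter=20)),
])

# Params for both steps, addressed as "<step name>__<param name>".
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "tfidf__use_idf": [True, False],
    "clf__penalty": ["l2", "elasticnet"],
}

# Step 2: pass the pipeline (not the bare classifier) to the grid search.
search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")

# Step 3: fit on raw text; the pipeline vectorizes inside each CV fold.
search.fit(texts, labels)

print(search.best_score_)
print(search.best_params_)
```

Because the vectorizer is refit inside each cross-validation fold, the held-out fold does not influence the vocabulary or idf weights, which is one reason to prefer the pipeline over vectorizing up front.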

Optimization results

  • No tuning w/ default params: 93.5% acc
  • Classifier tuning: 94.3%
  • Vectorizer + classifier tuning: 95.3%
  • Can't just vectorize everything; you need to experiment with tuning methods and with which params to tune to get better results
  • Find models > test > model is done > tune parameters

Challenges

  • Get the example code working and try different params: both TfidfVectorizer and SGDClassifier have lots of params!
  • Implement GridSearchCV to optimize your classifier for last session's Wikipedia toxicity data

Links
