orthogonalization: know what to tune to achieve what effect; it helps to have orthogonal controls, as in a car (steering wheel, accelerator, brake: each has one well-defined effect); in machine learning, though, the knobs are usually not this cleanly separated
chain of assumptions in ML:
- fit the training set well on the cost function (roughly human-level): knobs: bigger network, better optimization algorithm (e.g. Adam)
- hope it does well on the dev set: knobs: bigger training set, regularization
- hope it does well on the test set: knob: bigger dev set
- hope it performs well in the real world: knobs: change the dev set or the cost function
precision vs recall: precision = fraction of predicted positives that are actually positive; recall = fraction of actual positives that the classifier finds
to have a single metric we use the F1 score = 2 / (1/precision + 1/recall), the harmonic mean of precision and recall
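a minimal sketch of the F1 formula above, computed from hypothetical classifier counts (tp, fp, fn are made-up numbers for illustration):

```python
def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)  # of predicted positives, how many were right
    recall = tp / (tp + fn)     # of actual positives, how many were found
    return 2 / (1 / precision + 1 / recall)  # harmonic mean

# precision = 0.9, recall = 0.75 -> F1 ~ 0.818
print(f1_score(tp=90, fp=10, fn=30))
```

note the harmonic mean punishes imbalance: a classifier with precision 1.0 but recall near 0 gets an F1 near 0, unlike a plain average.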
in machine learning: a good dev set (to measure precision and recall on) plus a single real-number evaluation metric speeds up iteration