GitHub repo for the course: Stanford Machine Learning (Coursera)
The quiz needs to be viewed here at the repo, because the image solutions can't be viewed as part of a gist.
Answer | Explanation |
---|---|
Choose k to be the smallest value so that at least 99% of the variance is retained | This maintains the structure of the data while maximally reducing its dimension. |
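As a rough illustration of the rule above (a sketch with made-up data, using NumPy's SVD rather than the course's Octave code), the smallest k retaining at least 99% of the variance can be read off the cumulative sum of the singular values of the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative data: 200 examples, 6 features with rapidly decaying variance,
# so most of the variance lives in the first few directions.
X = rng.normal(size=(200, 6)) * np.array([10.0, 5.0, 1.0, 0.1, 0.05, 0.01])

# PCA preprocessing as taught in the course: mean-normalize, then take the
# SVD of the covariance matrix Sigma = (1/m) X^T X.
X_norm = X - X.mean(axis=0)
U, S, _ = np.linalg.svd((X_norm.T @ X_norm) / X_norm.shape[0])

# Variance retained by the first k components is sum_{i<=k} S_i / sum_i S_i;
# choose the smallest k for which this reaches 0.99.
retained = np.cumsum(S) / np.sum(S)
k = int(np.searchsorted(retained, 0.99) + 1)
print(k, retained[k - 1])
```

Because the illustrative features decay quickly in scale, a small k already clears the 99% threshold.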
Answer | Explanation |
---|---|
It is just a formula. | |
Answer | Explanation |
---|---|
If you do not perform mean normalization, PCA will rotate the data in a possibly undesired way. | Without subtracting the mean first, the leading component tends to point toward the data's mean offset rather than along the direction of greatest variance. |
Not sure yet | |
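To make the mean-normalization point concrete, here is a small sketch (toy data and values of my own choosing): without centering, the leading component of the uncentered second-moment matrix chases the data's mean offset; after centering, it tracks the true direction of maximum variance.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data (illustrative): variance lies along (1, 0.5), but the whole
# cloud sits far from the origin at roughly (50, -30).
x0 = rng.normal(size=500)
X = np.column_stack([x0, 0.5 * x0 + rng.normal(scale=0.1, size=500)])
X += np.array([50.0, -30.0])

def first_component(data):
    # Leading singular vector of (1/m) data^T data; this is PCA only if
    # `data` has already been mean-normalized.
    Sigma = (data.T @ data) / data.shape[0]
    U, _, _ = np.linalg.svd(Sigma)
    return U[:, 0]

u_raw = first_component(X)                        # skips mean normalization
u_centered = first_component(X - X.mean(axis=0))  # with mean normalization

# u_raw points toward the mean offset (50, -30); u_centered follows the
# true variance direction (1, 0.5), up to sign.
print(u_raw, u_centered)
```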
True or False | Statement | Explanation |
---|---|---|
False | Data visualization: To take 2D data, and find a different way of plotting it in 2D (using k=2) | Using PCA with k=2 on data that is already 2D does not reduce the dimension at all, so nothing is gained.
False | As a replacement for (or alternative to) linear regression: For most learning applications, PCA and linear regression give substantially similar results | PCA is not linear regression. They have different goals (and cost functions), so they give different results. |
True | Data compression: Reduce the dimension of your input data x(i), which will be used in a supervised learning algorithm (i.e., use PCA so that your supervised learning algorithm runs faster) | If your learning algorithm is too slow because the input dimension is too high, then using PCA to speed it up is a reasonable choice. |
True | Data compression: Reduce the dimension of your data, so that it takes up less memory/disk space. | If memory or disk space is limited, PCA allows you to save space in exchange for losing a little of the data's information. This can be a reasonable tradeoff. |
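The two "True" rows above both come down to the same projection step. As a minimal sketch (made-up data; a NumPy translation of the course's `z = Ureduce' * x` recipe, with an illustrative choice of k):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 10))  # illustrative training data: m=100, n=10

# PCA as in the course: mean-normalize, SVD of the covariance matrix,
# keep the first k columns of U.
mu = X.mean(axis=0)
X_norm = X - mu
U, S, _ = np.linalg.svd((X_norm.T @ X_norm) / X_norm.shape[0])

k = 3                  # illustrative reduced dimension
U_reduce = U[:, :k]

# Compress: z = U_reduce^T x. Train the supervised learner on Z instead of X;
# Z is smaller, so it saves memory and speeds up training.
Z = X_norm @ U_reduce  # shape (100, 3)

# Approximate reconstruction: x_approx = U_reduce z + mu (some information
# is lost, which is the tradeoff mentioned in the table).
X_approx = Z @ U_reduce.T + mu
print(Z.shape, X_approx.shape)
```

Note that the same `mu` and `U_reduce` learned from the training set should be reused to map cross-validation and test examples.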
This is not the complete collection of questions. If you retake the quiz, some of the questions will be different.