GitHub repo for the course: Stanford Machine Learning (Coursera)
The quiz needs to be viewed here at the repo, because the image solutions can't be viewed as part of a gist.
Answer | Explanation |
---|---|
Choose k to be the smallest value so that at least 99% of the variance is retained | This maintains the structure of the data while maximally reducing its dimension. |
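As a rough illustration of the rule above (a sketch with made-up data, using NumPy's SVD rather than the course's Octave code), the smallest k retaining at least 99% of the variance can be read off the cumulative sum of the singular values of the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative data: 200 examples, 6 features with rapidly decaying variance,
# so most of the variance lives in the first few directions.
X = rng.normal(size=(200, 6)) * np.array([10.0, 5.0, 1.0, 0.1, 0.05, 0.01])

# PCA preprocessing as taught in the course: mean-normalize, then take the
# SVD of the covariance matrix Sigma = (1/m) X^T X.
X_norm = X - X.mean(axis=0)
U, S, _ = np.linalg.svd((X_norm.T @ X_norm) / X_norm.shape[0])

# Variance retained by the first k components is sum_{i<=k} S_i / sum_i S_i;
# choose the smallest k for which this reaches 0.99.
retained = np.cumsum(S) / np.sum(S)
k = int(np.searchsorted(retained, 0.99) + 1)
print(k, retained[k - 1])
```

Because the illustrative features decay quickly in scale, a small k already clears the 99% threshold.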
Answer | Explanation |
---|---|
It is just a formula. | |
Answer | Explanation |
---|---|
If you do not perform mean normalization, PCA will rotate the data in a possibly undesired way. | Without subtracting the mean first, the leading component tends to point toward the data's mean offset rather than along the direction of greatest variance. |
Not sure yet | |
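To make the mean-normalization point concrete, here is a small sketch (toy data and values of my own choosing): without centering, the leading component of the uncentered second-moment matrix chases the data's mean offset; after centering, it tracks the true direction of maximum variance.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data (illustrative): variance lies along (1, 0.5), but the whole
# cloud sits far from the origin at roughly (50, -30).
x0 = rng.normal(size=500)
X = np.column_stack([x0, 0.5 * x0 + rng.normal(scale=0.1, size=500)])
X += np.array([50.0, -30.0])

def first_component(data):
    # Leading singular vector of (1/m) data^T data; this is PCA only if
    # `data` has already been mean-normalized.
    Sigma = (data.T @ data) / data.shape[0]
    U, _, _ = np.linalg.svd(Sigma)
    return U[:, 0]

u_raw = first_component(X)                        # skips mean normalization
u_centered = first_component(X - X.mean(axis=0))  # with mean normalization

# u_raw points toward the mean offset (50, -30); u_centered follows the
# true variance direction (1, 0.5), up to sign.
print(u_raw, u_centered)
```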
True or False | Statement | Explanation |
---|---|---|
False | Data visualization: To take 2D data, and find a different way of plotting it in 2D (using k=2) | Using PCA with k=2 on data that is already 2D does not reduce the dimension at all, so nothing is gained.
False | As a replacement for (or alternative to) linear regression: For most learning applications, PCA and linear regression give substantially similar results | PCA is not linear regression. They have different goals (and cost functions), so they give different results. |
True | Data compression: Reduce the dimension of your input data x(i), which will be used in a supervised learning algorithm (i.e., use PCA so that your supervised learning algorithm runs faster) | If your learning algorithm is too slow because the input dimension is too high, then using PCA to speed it up is a reasonable choice. |
True | Data compression: Reduce the dimension of your data, so that it takes up less memory/disk space. | If memory or disk space is limited, PCA allows you to save space in exchange for losing a little of the data's information. This can be a reasonable tradeoff. |
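The two "True" rows above both come down to the same projection step. As a minimal sketch (made-up data; a NumPy translation of the course's `z = Ureduce' * x` recipe, with an illustrative choice of k):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 10))  # illustrative training data: m=100, n=10

# PCA as in the course: mean-normalize, SVD of the covariance matrix,
# keep the first k columns of U.
mu = X.mean(axis=0)
X_norm = X - mu
U, S, _ = np.linalg.svd((X_norm.T @ X_norm) / X_norm.shape[0])

k = 3                  # illustrative reduced dimension
U_reduce = U[:, :k]

# Compress: z = U_reduce^T x. Train the supervised learner on Z instead of X;
# Z is smaller, so it saves memory and speeds up training.
Z = X_norm @ U_reduce  # shape (100, 3)

# Approximate reconstruction: x_approx = U_reduce z + mu (some information
# is lost, which is the tradeoff mentioned in the table).
X_approx = Z @ U_reduce.T + mu
print(Z.shape, X_approx.shape)
```

Note that the same `mu` and `U_reduce` learned from the training set should be reused to map cross-validation and test examples.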
This is not the complete collection of questions. If you retake the quiz, some of the questions will be different.