@rmitsch
Last active November 20, 2017 14:51
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 5-1: PCA"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-3, -2, -1, 0, 1, 2, -2, -1, 0, 1, 2, -2, -1, 0, 1, 2, 3],\n",
" [-2, -1, 0, 1, 2, 3, -2, -1, 0, 1, 2, -3, -2, -1, 0, 1, 2]])"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"x = np.array( [ (-3,-2), (-2,-1), (-1,0), (0,1), (1,2), (2,3), (-2,-2), (-1,-1), (0,0), (1,1), (2,2), (-2,-3), (-1,-2), (0,-1), (1,0), (2,1), (3,2)]).T\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*(a)* Compute the covariance matrix M."
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 3. , 2.625],\n",
" [ 2.625, 3. ]])"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cov_m = np.cov(x, rowvar=True)\n",
"cov_m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*(b)* Compute the eigenvalues and eigenvectors of M."
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 0.375 5.625]\n",
"[[-0.70710678 0.70710678]\n",
" [ 0.70710678 0.70710678]]\n"
]
}
],
"source": [
"eigenvalues, normalized_eigenvectors = np.linalg.eigh(cov_m)\n",
"print(eigenvalues)\n",
"print(normalized_eigenvectors)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
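"*(c)* Find the smallest eigenvalue and its corresponding eigenvector. The resulting eigenvector forms the\n",
"basis for the new subspace.\n",
"Note: Shouldn't we pick the largest eigenvalue(s) for PCA? Standard PCA does, because those directions retain the most variance; the eigenvector of the smallest eigenvalue is the direction of least variance."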
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"min_eigenvalue_index = np.argmin(eigenvalues)\n",
"min_eigenvalue_index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*(d)* Transform the vectors of X into this new subspace: $y = W^T \\times x$"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([-0.70710678, 0.70710678])"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# np.linalg.eigh returns the eigenvectors as columns, so select a column, not a row.\n",
"W = normalized_eigenvectors[:, min_eigenvalue_index]\n",
"W"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0.70710678, 0.70710678, 0.70710678, 0.70710678, 0.70710678,\n",
" 0.70710678, 0. , 0. , 0. , 0. ,\n",
" 0. , -0.70710678, -0.70710678, -0.70710678, -0.70710678,\n",
" -0.70710678, -0.70710678])"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"W.T.dot(x)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 @ /development/datamining",
"language": "python",
"name": "datamining"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
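
The note in part (c) asks whether PCA should use the largest eigenvalue instead. A minimal sketch that may help, assuming the same data as in the exercise: it projects the points onto both eigenvectors and compares the sample variance of each projection. The names `w_min`, `w_max`, `y_min`, `y_max`, `var_min`, and `var_max` are illustrative and do not appear in the notebook.

```python
import numpy as np

# Same 17 two-dimensional points as in the exercise, one column per point.
x = np.array([(-3, -2), (-2, -1), (-1, 0), (0, 1), (1, 2), (2, 3),
              (-2, -2), (-1, -1), (0, 0), (1, 1), (2, 2),
              (-2, -3), (-1, -2), (0, -1), (1, 0), (2, 1), (3, 2)]).T

cov_m = np.cov(x, rowvar=True)

# eigh returns eigenvalues in ascending order; eigenvectors are the columns.
eigenvalues, eigenvectors = np.linalg.eigh(cov_m)

w_min = eigenvectors[:, 0]   # direction of least variance
w_max = eigenvectors[:, -1]  # direction of most variance

# Project every point onto each direction (y = w^T x).
y_min = w_min @ x
y_max = w_max @ x

# Sample variance (ddof=1) to match the normalization np.cov uses by default.
var_min = np.var(y_min, ddof=1)
var_max = np.var(y_max, ddof=1)

print("variance along smallest-eigenvalue direction:", var_min)
print("variance along largest-eigenvalue direction: ", var_max)
```

The variance of each projection should match the corresponding eigenvalue (about 0.375 and 5.625 here), so projecting onto the smallest-eigenvalue direction keeps the least variance and the largest-eigenvalue direction keeps the most.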