obengwilliam/ Isomap

## Isomap
Isomap is a nonlinear dimensionality reduction method.
The algorithm provides a simple method for estimating the intrinsic geometry of a data manifold based on a rough estimate
of each data point’s neighbours.
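
The idea of building a global geometry from local neighbourhoods can be sketched in a few lines. The following is a minimal illustration, assuming NumPy, SciPy and scikit-learn are installed; the function name `isomap_sketch` and the helix toy data are hypothetical and not part of the original examples. It follows the usual three steps (nearest-neighbour graph, shortest-path geodesic distances, classical MDS) and is not a substitute for `sklearn.manifold.Isomap`.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph


def isomap_sketch(X, n_neighbors=6, n_components=2):
    """Illustrative Isomap: kNN graph -> geodesic distances -> classical MDS."""
    # 1. Connect each point to its nearest neighbours (edge weight = Euclidean distance).
    graph = kneighbors_graph(X, n_neighbors=n_neighbors, mode="distance")
    # 2. Approximate geodesic distances by shortest paths through the graph.
    geo = shortest_path(graph, method="D", directed=False)
    if not np.all(np.isfinite(geo)):
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")
    # 3. Classical MDS: double-centre the squared distances, keep the top eigenvectors.
    n = geo.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (geo ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


# Hypothetical toy data: 150 points along a 3D helix (a 1D manifold).
t = np.linspace(0, 3 * np.pi, 150)
helix = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
embedding = isomap_sketch(helix, n_neighbors=6, n_components=1)
```

On this toy curve the single recovered coordinate should track the position `t` along the helix (up to sign), which is the "intrinsic geometry" the description refers to.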

## isomap-README.md

      
    Python examples of the Isomap algorithm.


isomap_faces_tenenbaum:
Replicates the canonical dimensionality reduction experiment for visual perception by Joshua Tenenbaum, the primary creator of the isometric feature mapping (Isomap) algorithm.
isomap_aloi:

Whatever your high-dimensional samples are, be they images, sound files, or thoughtfully collected attributes, each one can be considered a single point in a high-dimensional feature space. Even with a high dimensionality, it's possible that most or all of your samples actually lie on a lower-dimensional surface. Isomap aims to capture that embedding, which is essentially the motion along the underlying, non-linear degrees of freedom.
By testing Isomap on a carefully constructed dataset, you will be able to visually confirm its effectiveness and gain a deeper understanding of how and why each parameter acts the way it does.
The ALOI, Amsterdam Library of Object Images, hosts a collection of 1000 small objects that were photographed in a controlled environment, by systematically varying the viewing angle, illumination angle, and illumination color for each object separately.
Manifold extraction, and Isomap specifically, works well on vision recognition problems, speech problems, and many other real-world tasks, such as identifying similar objects or objects that have undergone some change.
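
To make the "points in a high-dimensional space that lie on a lower-dimensional surface" idea concrete without downloading any images, here is a small sketch on synthetic data. It assumes NumPy and scikit-learn are installed; the helix, the 64-dimensional rotation, and the helper `recovery_score` are illustrative assumptions rather than part of the original scripts. Each sample has 64 features but only one true degree of freedom, and the sketch compares how well a one-component Isomap recovers it for a small and a large neighbourhood size.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Hypothetical data: one degree of freedom (t) traced out as a tightly wound helix.
t = np.linspace(0, 4 * np.pi, 200)
helix = np.column_stack([np.cos(t), np.sin(t), 0.05 * t])

# Rotate the helix into 64 dimensions. The basis has orthonormal columns,
# so all pairwise distances are preserved.
basis, _ = np.linalg.qr(rng.normal(size=(64, 3)))
X = helix @ basis.T  # shape (200, 64)


def recovery_score(n_neighbors):
    """Absolute correlation between the 1D Isomap coordinate and the true parameter t."""
    embedding = Isomap(n_neighbors=n_neighbors, n_components=1).fit_transform(X)
    return abs(np.corrcoef(embedding[:, 0], t)[0, 1])


score_small_k = recovery_score(4)   # neighbourhoods stay on one turn of the helix
score_large_k = recovery_score(40)  # neighbourhoods bridge adjacent turns
print(f"k=4: {score_small_k:.3f}   k=40: {score_large_k:.3f}")
```

With a small `n_neighbors` the neighbourhood graph follows the curve, so the embedding should line up with `t`. With a large value, neighbours are picked from adjacent turns, the graph takes shortcuts across the manifold, and the recovered coordinate matches `t` less well. This is one way to see why the neighbourhood size matters.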


## isomap_aloi.py
"""
 The ALOI, Amsterdam Library of Object Images, hosts a collection of 1000 small objects that were photographed in a controlled
 environment, by systematically varying the viewing angle, illumination angle, and illumination color for each object separately.
 It can be accessed here: http://aloi.science.uva.nl/
 This script shows that the Isomap embedding appears to follow an easily traversable 3D spline.
"""
import os

import pandas as pd
import matplotlib.pyplot as plt

# Registers the '3d' projection on older matplotlib versions.
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401
from sklearn import manifold

# Look pretty...
plt.style.use('ggplot')


#
# Start by creating a regular old, plain, "vanilla"
# python list.
#
samples = []
colours = []

#
# for-loop that iterates over the images in the
# Datasets/ALOI/32/ folder, appending each of them to
# your list. Each .PNG image is first loaded into a
# temporary NDArray.
#
# scipy.misc.imread was removed from SciPy, so the images are
# read with matplotlib instead. plt.imread returns PNG data as
# floats already scaled to 0.0-1.0; that uniform rescaling does
# not change the shape of the embedding.
#
# Optional: Resample the image down by a factor of two if you
# have a slower computer.
#
directory = "Datasets/ALOI/32/"
for fname in sorted(os.listdir(directory)):
  fullname = os.path.join(directory, fname)
  img = plt.imread(fullname)
  # samples.append( img[::2, ::2].reshape(-1) )   RESAMPLE
  samples.append( img.reshape(-1) )
  colours.append('b') # blue colour

#
# appends to your list the images
# in the Datasets/ALOI/32i/ directory.
#
directory = "Datasets/ALOI/32i/"
for fname in sorted(os.listdir(directory)):
  fullname = os.path.join(directory, fname)
  img = plt.imread(fullname)
  # samples.append( img[::2, ::2].reshape(-1) )   RESAMPLE
  samples.append( img.reshape(-1) )
  colours.append('r')  # red colour

#
# Convert the list to a dataframe
#
df = pd.DataFrame( samples )


#
# Implement Isomap here. Reduce the dataframe df down
# to three components, using K=6 for your neighborhood size
#
iso = manifold.Isomap(n_neighbors=6, n_components=3)
iso.fit(df)

my_isomap = iso.transform(df)


#
# Create a 2D Scatter plot to graph your manifold. You
# can use either 'o' or '.' as your marker. Graph the first two
# isomap components
#
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title("ISO transformation 2D")

ax.scatter(my_isomap[:,0], my_isomap[:,1], marker='.', c=colours)

#
# Create a 3D Scatter plot to graph your manifold. You
# can use either 'o' or '.' as your marker:
#

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_title("ISO transformation 3D")

ax.scatter(my_isomap[:,0], my_isomap[:,1], my_isomap[:,2], marker='.', c=colours)

plt.show()


## isomap_faces_tenenbaum.py
"""
Replicate the canonical dimensionality reduction experiment for visual perception by Joshua Tenenbaum, the primary creator of
the isometric feature mapping (Isomap) algorithm.
His original dataset from December 2000 consists of 698 samples of 4096-dimensional vectors.
These vectors are the coded brightness values of 64x64-pixel heads that have been rendered facing various directions and lit from
many angles.
Can be accessed here: https://web.archive.org/web/20160913051505/http://isomap.stanford.edu/datasets.html
- Apply both PCA and Isomap to the 698 raw images to derive 2D principal components and a 2D embedding of the data's intrinsic
  geometric structure.
- Project both onto 2D and 3D scatter plots, with a few superimposed face images on the associated samples.
"""
import pandas as pd
import scipy.io
import random, math

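import matplotlib.pyplot as plt
# Registers the '3d' projection on older matplotlib versions.
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401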

from sklearn.decomposition import PCA
from sklearn import manifold

def Plot2D(T, title, x, y, num_to_plot=40):
  # This method picks a bunch of random samples (images in our case)
  # to plot onto the chart:
  fig = plt.figure()
  ax = fig.add_subplot(111)
  ax.set_title(title)
  ax.set_xlabel('Component: {0}'.format(x))
  ax.set_ylabel('Component: {0}'.format(y))
  x_size = (max(T[:,x]) - min(T[:,x])) * 0.08
  y_size = (max(T[:,y]) - min(T[:,y])) * 0.08
  for i in range(num_to_plot):
    img_num = int(random.random() * num_images)
    x0, y0 = T[img_num,x]-x_size/2., T[img_num,y]-y_size/2.
    x1, y1 = T[img_num,x]+x_size/2., T[img_num,y]+y_size/2.
    # Series.reshape was removed from pandas, so reshape the underlying NumPy array:
    img = df.iloc[img_num,:].to_numpy().reshape(num_pixels, num_pixels)
    ax.imshow(img, aspect='auto', cmap=plt.cm.gray, interpolation='nearest', zorder=100000, extent=(x0, x1, y0, y1))

  # It also plots the full scatter:
  ax.scatter(T[:,x],T[:,y], marker='.',alpha=0.7)


# A .mat file is a MATLAB data file.
mat = scipy.io.loadmat('Datasets/face_data.mat')
df = pd.DataFrame(mat['images']).T
num_images, num_pixels = df.shape
num_pixels = int(math.sqrt(num_pixels))

# Rotate the pictures, so we don't have to crane our necks.
# Series.reshape was removed from pandas, so go through the NumPy array:
for i in range(num_images):
  df.loc[i,:] = df.loc[i,:].to_numpy().reshape(num_pixels, num_pixels).T.reshape(-1)


#
# Implement PCA here. Reduce the dataframe df down
# to THREE components. Once you've done that, call Plot2D.
#
# The format is: Plot2D(T, title, x, y, num_to_plot=40):
# T is your transformed data, NDArray.
# title is your chart title
# x is the principal component you want displayed on the x-axis, Can be 0 or 1
# y is the principal component you want displayed on the y-axis, Can be 1 or 2
#
pca = PCA(n_components=3)
pca.fit(df)

T = pca.transform(df)

Plot2D(T, "PCA transformation", 1, 2, num_to_plot=40)

#
# Implement Isomap here. Reduce the dataframe df down
# to THREE components.
#
iso = manifold.Isomap(n_neighbors=8, n_components=3)
iso.fit(df)
# Use a distinct name so the result does not shadow the sklearn `manifold` module:
T_iso = iso.transform(df)

Plot2D(T_iso, "ISO transformation", 1, 2, num_to_plot=40)


#
# draw your dataframes in 3D
#
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlabel('0')
ax.set_ylabel('1')
ax.set_zlabel('2')

ax.scatter(T_iso[:,0], T_iso[:,1], T_iso[:,2], c='red')


plt.show()