@mick001
Created August 12, 2018 22:54
Format Olivetti faces dataset image data in R
################################################################################
# This script converts the images into the following format:
#
# X -> converted to a 399x4096 matrix. Each row represents an image of 64x64
# greyscale pixels
#
# y -> labels of the images
#
################################################################################
#-------------------------------------------------------------------------------
# Setup
library(dplyr)  # provides the %>% pipe; library() stops with an error if the package is missing
# X: features
X_df <- read.csv("olivetti_X.csv") #%>% as.matrix()
# y: labels
y_df <- read.csv("olivetti_y.csv") %>% as.matrix()
# Function to plot image data
plt_img <- function(x){ image(x, col=grey(seq(0, 1, length.out=256)))}
#-------------------------------------------------------------------------------
# Format images
# Reorient one flattened 64x64 image and return it as a row-major vector
format_img <- function(v)
{
    # Image upside down
    img2 <- matrix(as.numeric(v), nrow=64, byrow=TRUE)
    # Image corrected
    img3 <- t(img2)[, nrow(img2):1]
    # Vector containing the image
    as.numeric(t(img3))
}
# To preview the first image before and after the correction:
# plt_img(matrix(as.numeric(X_df[1, ]), nrow=64, byrow=TRUE))
# plt_img(matrix(format_img(X_df[1, ]), ncol=64, byrow=TRUE))
# Output data matrix (features), preallocated with one row per image
out <- matrix(NA_real_, nrow = nrow(X_df), ncol = ncol(X_df))
# Format all the images (seq_len also handles an empty or single-row input)
for(i in seq_len(nrow(X_df)))
{
    out[i, ] <- format_img(X_df[i, ])
}
#-------------------------------------------------------------------------------
# Package the output and save an .Rdata file
# Remove row names
rownames(out) <- NULL
# Check a plot of the first image
plt_img(matrix(out[1, ], ncol=64, byrow=TRUE))
# Save the output
save(out, y_df, file="images_formatted.Rdata")
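
The per-image formatting step is easier to follow on a tiny input. The sketch below applies the same three operations (row-major fill, transpose with reversed columns, row-major flatten) to a 3x3 example instead of a 64x64 image. The helper name `reorient` and its side-length parameter `n` are hypothetical and used only for illustration; the script above hard-codes 64.

```r
# Hypothetical helper mirroring the per-image steps of the script,
# with the side length `n` as a parameter (an assumption for illustration).
reorient <- function(v, n) {
  img <- matrix(as.numeric(v), nrow = n, byrow = TRUE)  # fill row by row
  rotated <- t(img)[, n:1, drop = FALSE]                # transpose, then reverse the columns
  as.numeric(t(rotated))                                # flatten row by row
}

v <- 1:9
flat <- reorient(v, 3)

# Input read row by row:
#   1 2 3
#   4 5 6
#   7 8 9
print(matrix(v, nrow = 3, byrow = TRUE))

# Result read row by row (the input rotated 90 degrees clockwise):
#   7 4 1
#   8 5 2
#   9 6 3
print(matrix(flat, ncol = 3, byrow = TRUE))
```

Because the result is flattened row by row, it has to be rebuilt with `byrow=TRUE`, which is what the check plot of the first image in the script does.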
@riberagorka

Where could I find the "olivetti" dataset in .csv format?