Created
October 20, 2014 14:00
Seminar 20/10/2014
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:cb92b8d7df4b94f7250fc3d3f1e62856f65c5a6a986b711c9e6f9687dd29fc19" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Facial Feature Point Detection\n", | |
"## Recognising eyes from noses using Menpo\n", | |
"## By Patrick Snape" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# A little about me\n", | |
"\n", | |
" - Member of the [Intelligent Behaviour Understanding Group](http://ibug.doc.ic.ac.uk/)\n", | |
" - Focusing on recovery of *dense* shape from images\n", | |
" - Given an image, try recover a 3D mesh that represents their true facial shape\n", | |
" - Preferably only using a single image!\n", | |
" - If you are interested, my papers/work are hosted at my [website](http://patricksnape.github.io/)\n", | |
" - Core developer of the [Menpo project](http://menpo.io)\n", | |
" - Keen interest in facial feature point detection!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Cheeky Announcement\n", | |
"We are conducting a large scale experiment on **spontaneous human emotion**.\n", | |
"\n", | |
"Want to be a part of it? \n", | |
"\n", | |
"You will be recorded **in 3D at 60 FPS** and all you have to do is watch some videos! \n", | |
"\n", | |
"We will even send you a *copy of your facial mesh* if you want one!\n", | |
"\n", | |
"<img src=\"patrick.png\" style=\"height: 250px; margin: auto; display: block\" />\n", | |
"\n", | |
"## Email me at [p.snape@imperial.ac.uk](mailto:p.snape@imperial.ac.uk?subject=4DFAB%20Experiment&) with \"4DFAB Experiment\" in the title." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What I am covering today\n", | |
"\n", | |
" - What is the difference between detection, feature point detection, tracking and recognition?\n", | |
" - What is facial feature point detection (FFPD)?\n", | |
" - What are the major FFPD techniques?\n", | |
" - How are these implemented within Menpo?\n", | |
" - What about the current state-of-the-art?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What is the Menpo Project (briefly)?\n", | |
"\n", | |
" - The Menpo project is a Python project focusing on making our research easier\n", | |
" - Particularly useful if you spend a lot of time processing images!\n", | |
" - Strong high level abstractions to make image loading, processing and viewing simple\n", | |
" - Key components written in C++" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_number": 5, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"## What does this have to do with facial feature point detection (FFPD)?\n", | |
"\n", | |
" - On top of the Menpo core we have implemented popular FFPD algorithms!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# So what is facial feature point detection (FFPD)?\n", | |
"\n", | |
" - In short, FFPD involves recovering a set of sparse feature points on a face\n", | |
" - These points correspond to well-defined points on the face\n", | |
" - Should be easily indentified by a human annotator\n", | |
"\n", | |
"**Before diving into FFPD, we need to clarify a few key *concepts*!**" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Important Terminology\n", | |
"\n", | |
" - There are a lot of terms that are often used when referring to facial analysis\n", | |
" - It is important to know the difference between them!\n", | |
" - Many concepts require previous levels\n", | |
" - Lets make clear what the pipeline is!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"<img src=\"Figures/Figures.001.png\" style=\"height: 600px; margin: auto; display: block\" />" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Detection\n", | |
"\n", | |
" - Given an image, find an object inside of it\n", | |
" - This detection generally amounts to a region with a high probability of containing the object\n", | |
" - Usually, this takes the form of a **bounding box**\n", | |
" \n", | |
"<img src=\"takeo_detection.png\" style=\"height: 400px; margin: auto; display: block\" />" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Feature Point Detection\n", | |
"\n", | |
" - Given an image, find a sparse set of well-defined points\n", | |
" - These points should have a well defined semantic meaning\n", | |
" - For example, the tip of the nose\n", | |
" - **Almost all** current techniques use a local initialisation\n", | |
" - Expect to be initialised close to the correct result\n", | |
" - Therefore, feature point detection normally occurs *after* detection\n", | |
" \n", | |
"<img src=\"takeo_ffpd.png\" style=\"height: 350px; margin: auto; display: block\" />" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 5, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Analysis\n", | |
" - Once a sparse set of points have been found, you can now analyse the face!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 12 | |
}, | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"## Face recognition\n", | |
" - Align the face using the points and extract features\n", | |
" - Alignment helps remove pose error\n", | |
" - Features attempt to provide robustness to illumination, occlusion etc.\n", | |
" - Compare the features to your gallery (known faces) and choose closest results!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 13, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"## Emotion Classification\n", | |
" - Use the texture and sparse points to try classify emotion\n", | |
" - Shape of areas such as the mouth are highly discriminative" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 13, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Tracking\n", | |
" - Tracking is distinct from detection/FFPD\n", | |
" - Tracking involves an extra step\n", | |
" - Detecting a loss of tracking!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"<br/> \n", | |
" 1. Local initialisation\n", | |
" - Initialise from previous frame\n", | |
" 2. Every fixed number of frames\n", | |
" - Try classify if we are still tracking a face!\n", | |
" 3. In even of loss\n", | |
" - Re-detect" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What is facial feature point detection (FFPD)?\n", | |
"\n", | |
" - Recap: Detect a set of *sparse* points on a face\n", | |
" - These points relate to well-defined locations on all faces\n", | |
" - e.g. the tip of the nose\n", | |
" \n", | |
"<img src=\"takeo_ffpd.png\" style=\"height: 400px; margin: auto; display: block\" />" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# How many points should we detect?\n", | |
"\n", | |
" - At your discretion!\n", | |
" - In IBUG, we use **68 points**\n", | |
" \n", | |
"<img src=\"figure_1_68.jpg\" style=\"height: 250px; margin: auto; display: block\" />\n", | |
"\n", | |
" - But there are many schemes!\n", | |
" - A face that has been labelled in this way is usually called an **annotated** image\n", | |
" - **Annotations** are also called **landmarks**" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# How to detect facial points?\n", | |
"\n", | |
" - There have been many, many proposed approaches\n", | |
" - In general, you attempt to minimise some error between your current estimated\n", | |
" points and the image\n", | |
" - Four major bodies of work\n", | |
" - Constrained Local Models (CLMs)\n", | |
" - Active Appearance Models (AAMs)\n", | |
" - Regression based methods\n", | |
" - Other (Graphical Models, Deep Learning, Independant Detection, Joint Detection)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"<img src=\"techniques.png\" style=\"height: 700px; margin: auto; display: block\" />\n", | |
"\n", | |
"```\n", | |
"[1] Facial Feature Point Detection: A Comprehensive Survey \n", | |
" Nannan Wang, Xinbo Gao, Dacheng Tao, Xuelong Li \n", | |
" http://arxiv.org/abs/1410.1037\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Where to start?\n", | |
"\n", | |
" - The breadth and depth of the literature is overwhelming\n", | |
" - Attempting to understand and implement even a single algorithm completely is daunting" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Enter Menpo\n", | |
"\n", | |
" - We implemented many state-of-the-art techiques on top of the core of Menpo\n", | |
" - This gives a unified view of three of the major areas of FFPD\n", | |
" - Active Appearance Models (AAMs)\n", | |
" - Constrained Local Models (CLMs)\n", | |
" - Regression-based techniques\n", | |
" - For the sake of brevity, in this talk we will **concentrate on AAMs**\n", | |
" - Lets use Menpo *interactively* to demonstrate the key ideas behind these techniques!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 15, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Active Appearance Models (AAMs)\n", | |
"The most popular FFPD techniques are **supervised** learning methods\n", | |
" \n", | |
" - Given a set of pre-annotated images\n", | |
" - Whose annotations are often called the **ground truth**\n", | |
" - How we can learn what a face looks like?\n", | |
" - AAMs learn two separate models: **shape** and **appearance**\n", | |
" - In AAMs these models are **generative**" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"## Separate the appearance from the shape variation\n", | |
"\n", | |
" - But what exactly is the difference between the **appearance** (texture) and the **shape**?\n", | |
" - Well, first we need to load some **annotated** data to train our models from!\n", | |
" \n", | |
"Lets use Menpo to find out!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Loading images using Menpo" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import menpo.io as mio\n", | |
"\n", | |
"training_path = '/Users/pts08/Downloads/lfpw/trainset/*'\n", | |
"training_images = []\n", | |
"\n", | |
"# Load annotated images for training\n", | |
"for i in mio.import_images(training_path,\n", | |
" max_images=100,\n", | |
" verbose=True):\n", | |
" # Crop image to save memory\n", | |
" i.crop_to_landmarks_proportion_inplace(0.1)\n", | |
" # Convert it to greyscale if needed\n", | |
" if i.n_channels == 3:\n", | |
" i = i.as_greyscale(mode='luminosity')\n", | |
" # Append the image to the list\n", | |
" training_images.append(i)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"from menpo.visualize import visualize_images\n", | |
"\n", | |
"visualize_images(training_images)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Building a simple AAM in Menpo" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from menpo.fitmultilevel.aam import AAMBuilder\n", | |
"from menpo.feature import no_op\n", | |
"\n", | |
"# Create a factory object for building AAMs\n", | |
"aam_builder = AAMBuilder(features=no_op,\n", | |
" normalization_diagonal=150,\n", | |
" n_levels=1)\n", | |
"\n", | |
"# Build the AAM\n", | |
"aam = aam_builder.build(training_images, verbose=True)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# The Shape Model\n", | |
" - A shape model consists of a *linear* basis that can *generate* shapes\n", | |
" - These shapes all look like faces!\n", | |
" - This is because we learnt them from faces\n", | |
" - Lets take a look at what a shape model looks like" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"from menpo.visualize import visualize_shape_model\n", | |
"\n", | |
"visualize_shape_model(aam.shape_models)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
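 { | |
 "cell_type": "code", | |
 "collapsed": false, | |
 "input": [ | |
 "# A minimal sketch, not part of the original talk: since the shape\n", | |
 "# model is a *linear* basis, we can also *generate* novel face-like\n", | |
 "# shapes from it. Assumes the PCAModel `instance` and\n", | |
 "# `n_active_components` API.\n", | |
 "%matplotlib inline\n", | |
 "import numpy as np\n", | |
 "\n", | |
 "shape_model = aam.shape_models[0]\n", | |
 "# Zero weights reproduce the mean shape\n", | |
 "weights = np.zeros(shape_model.n_active_components)\n", | |
 "mean_shape = shape_model.instance(weights)\n", | |
 "# Perturbing the first component yields a new, plausible face shape\n", | |
 "weights[0] = 1.5\n", | |
 "novel_shape = shape_model.instance(weights)\n", | |
 "novel_shape.view()" | |
 ], | |
 "language": "python", | |
 "metadata": {}, | |
 "outputs": [] | |
 }, | |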
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# The Appearance Model\n", | |
" - An appearance model consists of a *linear* basis that can *generate* textures\n", | |
" - These textures all look like faces!\n", | |
" - This is because we learnt them from faces\n", | |
" - However, these textures are **shape-free**\n", | |
" - Shape has been removed from them as it is generated by the *shape model*\n", | |
" - All textures appear in the reference frame *of the mean face*\n", | |
" - Lets take a look at what an appearance model looks like" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"from menpo.visualize import visualize_appearance_model\n", | |
"\n", | |
"visualize_appearance_model(aam.appearance_models)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# The AAM as a single entity\n", | |
" - The AAM consists of combining a **shape model** and an **appearance model**\n", | |
" - These two models can then by jointly or alternately optimised to fit an image\n", | |
" - What would that look like inside Menpo?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"\n", | |
"aam.view_widget()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Loading testing data" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import menpo.io as mio\n", | |
"\n", | |
"# Load testing images\n", | |
"testing_path = '/Users/pts08/Downloads/lfpw/testset/*'\n", | |
"test_images = []\n", | |
"\n", | |
"for im in mio.import_images(testing_path, \n", | |
" max_images=5,\n", | |
" verbose=True):\n", | |
" # Crop image to save memory\n", | |
" im.crop_to_landmarks_proportion_inplace(0.5)\n", | |
" # Convert the image to grayscale if needed\n", | |
" if im.n_channels == 3:\n", | |
" im = im.as_greyscale(mode='luminosity')\n", | |
" # Append the image to the list\n", | |
" test_images.append(im)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Preparing an AAM for fitting images\n", | |
"We use another factory method that takes the **appearance** and **shape** models and builds us an object that knows how to fit images.\n", | |
"\n", | |
"In particular, we can fine tune the **variance** of the model by trimming the number of bases we keep from the models." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from menpo.fitmultilevel.aam import LucasKanadeAAMFitter\n", | |
"\n", | |
"# define Lucas-Kanade based AAM fitter\n", | |
"fitter = LucasKanadeAAMFitter(aam, n_shape=0.9, n_appearance=0.9)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 23, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Initialising the AAM for fitting\n", | |
" - Initialising an AAM is a crucial step to the success of the algorithm\n", | |
" - Place the mean shape (zero appearance and shape model vectors) on the image" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 40 | |
}, | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
" 1. Detect face\n", | |
" 2. Scale mean model to bounding box\n", | |
" 3. Begin fitting!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"Lets cheat a bit and initialise from the **ground truth** for simplicity!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import numpy as np\n", | |
"# Let's make these 'random' perturbations deterministic!\n", | |
"np.random.seed(1)\n", | |
"\n", | |
"fitting_results = []\n", | |
"# Loop over and fit the five images\n", | |
"for j, test_image in enumerate(test_images):\n", | |
" # Obtain ground truth (original) landmarks\n", | |
" gt_shape = test_image.landmarks['PTS'].lms\n", | |
" \n", | |
" # Generate initialization landmarks by perturbing\n", | |
" # the ground truth with some noise!\n", | |
" initial_shape = fitter.perturb_shape(gt_shape)\n", | |
" \n", | |
" # Fit image\n", | |
" fr = fitter.fit(test_image, \n", | |
" initial_shape, \n", | |
" gt_shape=gt_shape)\n", | |
" \n", | |
" # append fitting result to list\n", | |
" fitting_results.append(fr)\n", | |
" \n", | |
" # print image numebr\n", | |
" print('Image: {}'.format(j))\n", | |
" \n", | |
" # Print fitting result!\n", | |
" print(fr)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"\n", | |
"# View the first fitting result\n", | |
"fitting_results[0].view_widget()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"\n", | |
"# View the third fitting result\n", | |
"fitting_results[2].view_widget()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"\n", | |
"# View the second fitting result\n", | |
"fitting_results[1].view_widget()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Underwhelming results?\n", | |
"\n", | |
" - State-of-the-art use **features**\n", | |
" - **Features** are usually hand-engineered to be invariant to common image problems\n", | |
" - Illumination changes\n", | |
" - Large pose variation\n", | |
" - Occlusions" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"source": [ | |
"# Features inside Menpo\n", | |
" - Lets look at a popular feature in the literature\n", | |
" - **Histogram of Oriented Gradients** (HOGs)\n", | |
" - HOG models take quite a long time to train **(~ 10-15 minutes)**\n", | |
" - Therefore..." | |
] | |
}, | |
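 { | |
 "cell_type": "code", | |
 "collapsed": false, | |
 "input": [ | |
 "# A quick sketch, not part of the original talk: compute HOG\n", | |
 "# features for one training image to see how a dense feature\n", | |
 "# representation replaces raw pixel intensities. Assumes the\n", | |
 "# `menpo.feature.hog` function is available in this version.\n", | |
 "from menpo.feature import hog\n", | |
 "\n", | |
 "hog_image = hog(training_images[0])\n", | |
 "# HOG produces many channels per pixel rather than one intensity\n", | |
 "print('{} channels vs {} in the input'.format(\n", | |
 "    hog_image.n_channels, training_images[0].n_channels))" | |
 ], | |
 "language": "python", | |
 "metadata": {}, | |
 "outputs": [] | |
 }, | |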
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"from seminar import blue_peter\n", | |
"\n", | |
"hog_aam = blue_peter()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 41, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from menpo.fitmultilevel.aam import LucasKanadeAAMFitter\n", | |
"\n", | |
"# Build the AAM from the factory as before\n", | |
"fitter = LucasKanadeAAMFitter(hog_aam, \n", | |
" n_shape=[15, 15, 15],\n", | |
" n_appearance=200)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"import menpo.io as mio\n", | |
"\n", | |
"# Load a built in asset\n", | |
"breaking_bad = mio.import_builtin_asset.breakingbad_jpg()\n", | |
"# Crop it for memory purposes\n", | |
"breaking_bad.crop_to_landmarks_proportion_inplace(0.5)\n", | |
"braeking_bad = breaking_bad.as_greyscale()\n", | |
"\n", | |
"# View it!\n", | |
"breaking_bad.view_widget() " | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import numpy as np\n", | |
"# Let's make these 'random' perturbations deterministic!\n", | |
"np.random.seed(1)\n", | |
"\n", | |
"gt_shape = breaking_bad.landmarks['PTS'].lms\n", | |
" \n", | |
"# Generate initialization landmarks by perturbing\n", | |
"# the ground truth with some noise!\n", | |
"initial_shape = fitter.perturb_shape(gt_shape)\n", | |
"\n", | |
"# Fit image\n", | |
"fitting_result = fitter.fit(breaking_bad, \n", | |
" initial_shape, \n", | |
" gt_shape=gt_shape)\n", | |
"\n", | |
"# Print fitting result!\n", | |
"print(fitting_result)" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "subslide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"\n", | |
"# View the fitting result\n", | |
"fitting_result.view_widget()" | |
], | |
"language": "python", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "subslide" | |
} | |
}, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What else does Menpo provide?\n", | |
" - Menpo implements \n", | |
" - Active Appearance Models (AAMs)\n", | |
" - Constrained Local Models (CLMs)\n", | |
" - Regression techniques\n", | |
" - Supervised Descent Method (SDM)\n", | |
" - Menpo is a great **playground for image based research**\n", | |
" - Image warping\n", | |
" - Powerful transformations (Piecewise Affine, Thin-Plate Splines, ...)\n", | |
" - Importing images and 3D meshes\n", | |
" - Advanced visualizations of images and meshes\n", | |
" - Mesh rasterization\n", | |
" - All objects (images, meshes) have **landmarks**\n", | |
" - Automatically imported\n", | |
" - Automatically transformed (warped, rotated etc.)\n", | |
" - ..." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 49, | |
"slide_type": "subslide" | |
}, | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What is the current state-of-the-art?\n", | |
" - Although Menpo implements many key techniques from the literature, research moves quickly\n", | |
" - The current state-of-the-art methods are regression-based techniques\n", | |
" \n", | |
"```\n", | |
"One Millisecond Face Alignment with an Ensemble of Regression Trees.\n", | |
"Vahid Kazemi and Josephine Sullivan\n", | |
"CVPR 2014\n", | |
"```\n", | |
"\n", | |
"```\n", | |
"Face Alignment at 3000 FPS via Regressing Local Binary Features.\n", | |
"Shaoqing Ren, Xudong Cao, Yichen Wei and Jian Sun\n", | |
"CVPR 2014\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 55, | |
"slide_helper": "subslide_end" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
" - These techniques are **fast** and very accurate!\n", | |
" - They still struggle with occluded images (hands in front of faces)\n", | |
" - AAMs still excel in this area" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"internals": { | |
"frag_helper": "fragment_end", | |
"frag_number": 55, | |
"slide_helper": "subslide_end", | |
"slide_type": "subslide" | |
}, | |
"slide_helper": "slide_end", | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Thank you for listening\n", | |
"# Any questions?" | |
] | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
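The regression-based methods cited in the slides above (Kazemi & Sullivan; Ren et al.) share a simple core idea: starting from an initial shape estimate, a cascade of learned regressors repeatedly predicts an update that moves the estimate toward the true shape. A minimal 1-D toy sketch of that cascade structure (the fixed fractional update rule and all numbers here are illustrative assumptions, not either paper's actual regressors, which are learned from image features):

```python
def fit_cascade(initial, target, n_stages=10, step=0.5):
    # Each "stage" of the cascade predicts a correction to the current
    # estimate. A real method regresses this update from local image
    # features; here it is simply a fraction of the remaining residual.
    estimate = initial
    for _ in range(n_stages):
        update = step * (target - estimate)
        estimate += update
    return estimate

# After 10 stages the estimate has closed all but (1 - step)**10
# of the initial gap.
result = fit_cascade(0.0, 100.0)
```

Because each stage is a handful of cheap operations, cascades like this are what make the "one millisecond" and "3000 FPS" speeds of those papers plausible.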
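The fitting demo in the notebook above generates its initialisation by perturbing the ground-truth landmarks with noise (`fitter.perturb_shape`). The step can be sketched in plain Python (the Gaussian noise model and scale here are illustrative assumptions, a simplified stand-in for Menpo's perturbation):

```python
import random

def perturb_landmarks(points, noise_std=2.0, seed=1):
    # Add seeded Gaussian noise to each (x, y) landmark, mimicking how
    # an initial shape is produced from the ground truth for fitting.
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std))
            for x, y in points]

gt = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
initial = perturb_landmarks(gt)
```

Seeding the generator (as the notebook does with `np.random.seed(1)`) keeps the "random" perturbation deterministic, so the fitting result is reproducible across runs.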
def blue_peter():
    import cPickle as pickle
    import matplotlib.pyplot as plt
    import menpo.io as mio
    from menpo.feature import hog
    from menpo.visualize.image import glyph

    # Load the pre-built HOG AAM
    with open('/Users/pts08/hog_lfpw_aam.pkl', 'rb') as f:
        hog_aam = pickle.load(f)
    # Alternatively, load it from HDF5:
    # import h5it
    # hog_aam = h5it.load('/Users/pts08/sparse_hog.hdf5')
    print('Here is one I made earlier!')

    # Show the RGB image alongside a glyph of its HOG features
    bp = mio.import_image('blue_peter.jpg')
    hog_blue_peter = hog(bp)

    plt.figure()
    plt.subplot(121)
    bp.view()
    plt.axis('off')
    plt.title('RGB')
    plt.subplot(122)
    glyph(hog_blue_peter).view()
    plt.axis('off')
    plt.title('HOG')
    plt.gcf().set_size_inches(11, 11)

    return hog_aam
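`blue_peter` above loads a pre-trained AAM with `cPickle`. The save/load round-trip it relies on can be sketched with the standard `pickle` module (the model dict and temporary path here are toy stand-ins, not the real Menpo AAM object):

```python
import os
import pickle
import tempfile

# Toy stand-in for a pre-trained model (the gist pickles a HOG AAM)
model = {'name': 'hog_lfpw_aam', 'n_levels': 3}

path = os.path.join(tempfile.mkdtemp(), 'hog_lfpw_aam.pkl')

# Serialise the model to disk...
with open(path, 'wb') as f:
    pickle.dump(model, f)

# ...and load it back, as blue_peter does with the pre-built AAM
with open(path, 'rb') as f:
    loaded = pickle.load(f)
```

Pickling a trained model once and reloading it at talk time is what makes the "here is one I made earlier" demo instant, rather than rebuilding the AAM from the training images.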