Seminar 20/10/2014
{
"metadata": {
"name": "",
"signature": "sha256:cb92b8d7df4b94f7250fc3d3f1e62856f65c5a6a986b711c9e6f9687dd29fc19"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Facial Feature Point Detection\n",
"## Recognising eyes from noses using Menpo\n",
"## By Patrick Snape"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# A little about me\n",
"\n",
" - Member of the [Intelligent Behaviour Understanding Group](http://ibug.doc.ic.ac.uk/)\n",
" - Focusing on recovery of *dense* shape from images\n",
" - Given an image, try recover a 3D mesh that represents their true facial shape\n",
" - Preferably only using a single image!\n",
" - If you are interested, my papers/work are hosted at my [website](http://patricksnape.github.io/)\n",
" - Core developer of the [Menpo project](http://menpo.io)\n",
" - Keen interest in facial feature point detection!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Cheeky Announcement\n",
"We are conducting a large scale experiment on **spontaneous human emotion**.\n",
"\n",
"Want to be a part of it? \n",
"\n",
"You will be recorded **in 3D at 60 FPS** and all you have to do is watch some videos! \n",
"\n",
"We will even send you a *copy of your facial mesh* if you want one!\n",
"\n",
"<img src=\"patrick.png\" style=\"height: 250px; margin: auto; display: block\" />\n",
"\n",
"## Email me at [p.snape@imperial.ac.uk](mailto:p.snape@imperial.ac.uk?subject=4DFAB%20Experiment&) with \"4DFAB Experiment\" in the title."
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What I am covering today\n",
"\n",
" - What is the difference between detection, feature point detection, tracking and recognition?\n",
" - What is facial feature point detection (FFPD)?\n",
" - What are the major FFPD techniques?\n",
" - How are these implemented within Menpo?\n",
" - What about the current state-of-the-art?"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What is the Menpo Project (briefly)?\n",
"\n",
" - The Menpo project is a Python project focusing on making our research easier\n",
" - Particularly useful if you spend a lot of time processing images!\n",
" - Strong high level abstractions to make image loading, processing and viewing simple\n",
" - Key components written in C++"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_number": 5,
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"## What does this have to do with facial feature point detection (FFPD)?\n",
"\n",
" - On top of the Menpo core we have implemented popular FFPD algorithms!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# So what is facial feature point detection (FFPD)?\n",
"\n",
" - In short, FFPD involves recovering a set of sparse feature points on a face\n",
" - These points correspond to well-defined points on the face\n",
" - Should be easily indentified by a human annotator\n",
"\n",
"**Before diving into FFPD, we need to clarify a few key *concepts*!**"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Important Terminology\n",
"\n",
" - There are a lot of terms that are often used when referring to facial analysis\n",
" - It is important to know the difference between them!\n",
" - Many concepts require previous levels\n",
" - Lets make clear what the pipeline is!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"Figures/Figures.001.png\" style=\"height: 600px; margin: auto; display: block\" />"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Detection\n",
"\n",
" - Given an image, find an object inside of it\n",
" - This detection generally amounts to a region with a high probability of containing the object\n",
" - Usually, this takes the form of a **bounding box**\n",
" \n",
"<img src=\"takeo_detection.png\" style=\"height: 400px; margin: auto; display: block\" />"
]
},
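{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of what this step looks like in practice (not part of Menpo), here is a minimal face detector using OpenCV's Haar cascades. The image filename and the location of the cascade XML are placeholders that depend on your installation."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import cv2\n",
"\n",
"# Path to the cascade XML depends on your OpenCV installation (placeholder)\n",
"detector = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')\n",
"\n",
"# Load an image as greyscale (flag 0); 'takeo.png' is a placeholder filename\n",
"image = cv2.imread('takeo.png', 0)\n",
"\n",
"# Each detection is a bounding box: (x, y, width, height)\n",
"bounding_boxes = detector.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)\n",
"for (x, y, w, h) in bounding_boxes:\n",
"    print('Face found at ({}, {}) with size {}x{}'.format(x, y, w, h))"
],
"language": "python",
"metadata": {},
"outputs": []
},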
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Feature Point Detection\n",
"\n",
" - Given an image, find a sparse set of well-defined points\n",
" - These points should have a well defined semantic meaning\n",
" - For example, the tip of the nose\n",
" - **Almost all** current techniques use a local initialisation\n",
" - Expect to be initialised close to the correct result\n",
" - Therefore, feature point detection normally occurs *after* detection\n",
" \n",
"<img src=\"takeo_ffpd.png\" style=\"height: 350px; margin: auto; display: block\" />"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 5,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Analysis\n",
" - Once a sparse set of points have been found, you can now analyse the face!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 12
},
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"## Face recognition\n",
" - Align the face using the points and extract features\n",
" - Alignment helps remove pose error\n",
" - Features attempt to provide robustness to illumination, occlusion etc.\n",
" - Compare the features to your gallery (known faces) and choose closest results!"
]
},
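{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a toy numpy-only sketch of the final matching step (the gallery, names and feature vectors below are all made up for illustration), recognition reduces to a nearest-neighbour search over the gallery:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def recognise(probe_features, gallery_features, gallery_names):\n",
"    # Compare the probe feature vector against every known face and\n",
"    # return the name of the closest match (Euclidean distance)\n",
"    distances = np.linalg.norm(gallery_features - probe_features, axis=1)\n",
"    return gallery_names[np.argmin(distances)]\n",
"\n",
"# Toy gallery: 3 known people with 128-dimensional feature vectors\n",
"gallery_features = np.random.randn(3, 128)\n",
"gallery_names = ['alice', 'bob', 'carol']\n",
"\n",
"# A probe that is a slightly perturbed copy of 'bob'\n",
"probe_features = gallery_features[1] + 0.01 * np.random.randn(128)\n",
"print(recognise(probe_features, gallery_features, gallery_names))"
],
"language": "python",
"metadata": {},
"outputs": []
},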
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 13,
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"## Emotion Classification\n",
" - Use the texture and sparse points to try classify emotion\n",
" - Shape of areas such as the mouth are highly discriminative"
]
},
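{
"cell_type": "markdown",
"metadata": {},
"source": [
"A deliberately naive sketch of this idea with scikit-learn (the training data here is random and the feature is simply the flattened 68-point shape, purely for illustration):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"from sklearn.svm import SVC\n",
"\n",
"# Pretend training set: flattened 68-point shapes (68 * 2 = 136 values)\n",
"# with an emotion label per face -- both made up for illustration\n",
"shapes = np.random.randn(50, 136)\n",
"emotions = np.random.choice(['happy', 'sad', 'neutral'], size=50)\n",
"\n",
"# Train a simple classifier on the shape features\n",
"classifier = SVC(kernel='linear')\n",
"classifier.fit(shapes, emotions)\n",
"\n",
"# Classify a new face from its fitted landmarks\n",
"new_shape = np.random.randn(1, 136)\n",
"print(classifier.predict(new_shape))"
],
"language": "python",
"metadata": {},
"outputs": []
},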
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 13,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Tracking\n",
" - Tracking is distinct from detection/FFPD\n",
" - Tracking involves an extra step\n",
" - Detecting a loss of tracking!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"<br/> \n",
" 1. Local initialisation\n",
" - Initialise from previous frame\n",
" 2. Every fixed number of frames\n",
" - Try classify if we are still tracking a face!\n",
" 3. In even of loss\n",
" - Re-detect"
]
},
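{
"cell_type": "markdown",
"metadata": {},
"source": [
"In pseudo-Python, the tracking loop described above might look something like this. Note that `detect_face`, `fit_landmarks` and `looks_like_a_face` are hypothetical placeholders, not Menpo functions:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def track(frames, redetect_every=30):\n",
"    shape = None\n",
"    results = []\n",
"    for frame_number, frame in enumerate(frames):\n",
"        if shape is None:\n",
"            # (Re-)detect: find a bounding box and initialise from it\n",
"            shape = detect_face(frame)\n",
"        # 1. Local initialisation: fit starting from the previous frame's shape\n",
"        shape = fit_landmarks(frame, initial_shape=shape)\n",
"        # 2. Every fixed number of frames, check we are still on a face\n",
"        if frame_number % redetect_every == 0 and not looks_like_a_face(frame, shape):\n",
"            # 3. Loss of tracking: force a re-detection on the next frame\n",
"            shape = None\n",
"        results.append(shape)\n",
"    return results"
],
"language": "python",
"metadata": {},
"outputs": []
},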
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What is facial feature point detection (FFPD)?\n",
"\n",
" - Recap: Detect a set of *sparse* points on a face\n",
" - These points relate to well-defined locations on all faces\n",
" - e.g. the tip of the nose\n",
" \n",
"<img src=\"takeo_ffpd.png\" style=\"height: 400px; margin: auto; display: block\" />"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# How many points should we detect?\n",
"\n",
" - At your discretion!\n",
" - In IBUG, we use **68 points**\n",
" \n",
"<img src=\"figure_1_68.jpg\" style=\"height: 250px; margin: auto; display: block\" />\n",
"\n",
" - But there are many schemes!\n",
" - A face that has been labelled in this way is usually called an **annotated** image\n",
" - **Annotations** are also called **landmarks**"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# How to detect facial points?\n",
"\n",
" - There have been many, many proposed approaches\n",
" - In general, you attempt to minimise some error between your current estimated\n",
" points and the image\n",
" - Four major bodies of work\n",
" - Constrained Local Models (CLMs)\n",
" - Active Appearance Models (AAMs)\n",
" - Regression based methods\n",
" - Other (Graphical Models, Deep Learning, Independant Detection, Joint Detection)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"<img src=\"techniques.png\" style=\"height: 700px; margin: auto; display: block\" />\n",
"\n",
"```\n",
"[1] Facial Feature Point Detection: A Comprehensive Survey \n",
" Nannan Wang, Xinbo Gao, Dacheng Tao, Xuelong Li \n",
" http://arxiv.org/abs/1410.1037\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Where to start?\n",
"\n",
" - The breadth and depth of the literature is overwhelming\n",
" - Attempting to understand and implement even a single algorithm completely is daunting"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Enter Menpo\n",
"\n",
" - We implemented many state-of-the-art techiques on top of the core of Menpo\n",
" - This gives a unified view of three of the major areas of FFPD\n",
" - Active Appearance Models (AAMs)\n",
" - Constrained Local Models (CLMs)\n",
" - Regression-based techniques\n",
" - For the sake of brevity, in this talk we will **concentrate on AAMs**\n",
" - Lets use Menpo *interactively* to demonstrate the key ideas behind these techniques!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 15,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Active Appearance Models (AAMs)\n",
"The most popular FFPD techniques are **supervised** learning methods\n",
" \n",
" - Given a set of pre-annotated images\n",
" - Whose annotations are often called the **ground truth**\n",
" - How we can learn what a face looks like?\n",
" - AAMs learn two separate models: **shape** and **appearance**\n",
" - In AAMs these models are **generative**"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"## Separate the appearance from the shape variation\n",
"\n",
" - But what exactly is the difference between the **appearance** (texture) and the **shape**?\n",
" - Well, first we need to load some **annotated** data to train our models from!\n",
" \n",
"Lets use Menpo to find out!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Loading images using Menpo"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import menpo.io as mio\n",
"\n",
"training_path = '/Users/pts08/Downloads/lfpw/trainset/*'\n",
"training_images = []\n",
"\n",
"# Load annotated images for training\n",
"for i in mio.import_images(training_path,\n",
" max_images=100,\n",
" verbose=True):\n",
" # Crop image to save memory\n",
" i.crop_to_landmarks_proportion_inplace(0.1)\n",
" # Convert it to greyscale if needed\n",
" if i.n_channels == 3:\n",
" i = i.as_greyscale(mode='luminosity')\n",
" # Append the image to the list\n",
" training_images.append(i)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"from menpo.visualize import visualize_images\n",
"\n",
"visualize_images(training_images)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Building a simple AAM in Menpo"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from menpo.fitmultilevel.aam import AAMBuilder\n",
"from menpo.feature import no_op\n",
"\n",
"# Create a factory object for building AAMs\n",
"aam_builder = AAMBuilder(features=no_op,\n",
" normalization_diagonal=150,\n",
" n_levels=1)\n",
"\n",
"# Build the AAM\n",
"aam = aam_builder.build(training_images, verbose=True)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# The Shape Model\n",
" - A shape model consists of a *linear* basis that can *generate* shapes\n",
" - These shapes all look like faces!\n",
" - This is because we learnt them from faces\n",
" - Lets take a look at what a shape model looks like"
]
},
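{
"cell_type": "markdown",
"metadata": {},
"source": [
"Under the hood this is just PCA: a new shape is the mean shape plus a weighted combination of the basis vectors. A toy numpy sketch of that generative equation (the mean and basis below are random stand-ins, not the model Menpo just built):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"# Stand-ins for the learnt model: shapes are flattened (x, y) coordinates,\n",
"# so 68 points -> 136 numbers; here we pretend there are 5 modes of variation\n",
"mean_shape = np.zeros(136)\n",
"components = np.random.randn(5, 136)\n",
"\n",
"# Generating a new shape = mean + weighted sum of the basis vectors\n",
"weights = np.array([2.0, 0.0, 0.0, 0.0, 0.0])  # push the first mode of variation\n",
"new_shape = mean_shape + weights.dot(components)\n",
"new_shape = new_shape.reshape(-1, 2)  # back to (68, 2) landmark points"
],
"language": "python",
"metadata": {},
"outputs": []
},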
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"from menpo.visualize import visualize_shape_model\n",
"\n",
"visualize_shape_model(aam.shape_models)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# The Appearance Model\n",
" - An appearance model consists of a *linear* basis that can *generate* textures\n",
" - These textures all look like faces!\n",
" - This is because we learnt them from faces\n",
" - However, these textures are **shape-free**\n",
" - Shape has been removed from them as it is generated by the *shape model*\n",
" - All textures appear in the reference frame *of the mean face*\n",
" - Lets take a look at what an appearance model looks like"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"from menpo.visualize import visualize_appearance_model\n",
"\n",
"visualize_appearance_model(aam.appearance_models)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# The AAM as a single entity\n",
" - The AAM consists of combining a **shape model** and an **appearance model**\n",
" - These two models can then by jointly or alternately optimised to fit an image\n",
" - What would that look like inside Menpo?"
]
},
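{
"cell_type": "markdown",
"metadata": {},
"source": [
"Loosely speaking, fitting combines the two models into a single objective: find shape parameters $\\mathbf{p}$ and appearance parameters $\\mathbf{c}$ such that the texture generated by the appearance model matches the image warped into the reference frame by the shape generated from $\\mathbf{p}$,\n",
"\n",
"$$\\arg\\min_{\\mathbf{p},\\, \\mathbf{c}} \\; \\big\\| \\bar{\\mathbf{a}} + \\mathbf{A}\\mathbf{c} - \\mathbf{i}\\big(\\mathcal{W}(\\mathbf{p})\\big) \\big\\|^2$$\n",
"\n",
"where $\\bar{\\mathbf{a}}$ is the mean appearance, the columns of $\\mathbf{A}$ are the appearance basis, and $\\mathbf{i}(\\mathcal{W}(\\mathbf{p}))$ denotes the image sampled under the warp defined by the shape parameters. (This is the standard formulation; the individual fitters differ in how they linearise and optimise it.)"
]
},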
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"\n",
"aam.view_widget()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Loading testing data"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import menpo.io as mio\n",
"\n",
"# Load testing images\n",
"testing_path = '/Users/pts08/Downloads/lfpw/testset/*'\n",
"test_images = []\n",
"\n",
"for im in mio.import_images(testing_path, \n",
" max_images=5,\n",
" verbose=True):\n",
" # Crop image to save memory\n",
" im.crop_to_landmarks_proportion_inplace(0.5)\n",
" # Convert the image to grayscale if needed\n",
" if im.n_channels == 3:\n",
" im = im.as_greyscale(mode='luminosity')\n",
" # Append the image to the list\n",
" test_images.append(im)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Preparing an AAM for fitting images\n",
"We use another factory method that takes the **appearance** and **shape** models and builds us an object that knows how to fit images.\n",
"\n",
"In particular, we can fine tune the **variance** of the model by trimming the number of bases we keep from the models."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from menpo.fitmultilevel.aam import LucasKanadeAAMFitter\n",
"\n",
"# define Lucas-Kanade based AAM fitter\n",
"fitter = LucasKanadeAAMFitter(aam, n_shape=0.9, n_appearance=0.9)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 23,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Initialising the AAM for fitting\n",
" - Initialising an AAM is a crucial step to the success of the algorithm\n",
" - Place the mean shape (zero appearance and shape model vectors) on the image"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 40
},
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" 1. Detect face\n",
" 2. Scale mean model to bounding box\n",
" 3. Begin fitting!"
]
},
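{
"cell_type": "markdown",
"metadata": {},
"source": [
"A rough numpy sketch of step 2 -- rescaling and recentring the mean shape so it sits inside a detected bounding box (the box coordinates and mean shape below are made up):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"# Detected bounding box as (min_x, min_y, max_x, max_y) -- made-up values\n",
"box = np.array([50.0, 80.0, 250.0, 300.0])\n",
"box_centre = np.array([(box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0])\n",
"box_size = np.array([box[2] - box[0], box[3] - box[1]])\n",
"\n",
"# Stand-in for the model's mean shape: (68, 2) landmark coordinates\n",
"mean_shape = np.random.randn(68, 2)\n",
"\n",
"# Rescale the mean shape to the box size and move it to the box centre\n",
"shape_size = mean_shape.max(axis=0) - mean_shape.min(axis=0)\n",
"scale = (box_size / shape_size).min()\n",
"initial_shape = (mean_shape - mean_shape.mean(axis=0)) * scale + box_centre"
],
"language": "python",
"metadata": {},
"outputs": []
},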
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Lets cheat a bit and initialise from the **ground truth** for simplicity!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"# Let's make these 'random' perturbations deterministic!\n",
"np.random.seed(1)\n",
"\n",
"fitting_results = []\n",
"# Loop over and fit the five images\n",
"for j, test_image in enumerate(test_images):\n",
" # Obtain ground truth (original) landmarks\n",
" gt_shape = test_image.landmarks['PTS'].lms\n",
" \n",
" # Generate initialization landmarks by perturbing\n",
" # the ground truth with some noise!\n",
" initial_shape = fitter.perturb_shape(gt_shape)\n",
" \n",
" # Fit image\n",
" fr = fitter.fit(test_image, \n",
" initial_shape, \n",
" gt_shape=gt_shape)\n",
" \n",
" # append fitting result to list\n",
" fitting_results.append(fr)\n",
" \n",
" # print image numebr\n",
" print('Image: {}'.format(j))\n",
" \n",
" # Print fitting result!\n",
" print(fr)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"\n",
"# View the first fitting result\n",
"fitting_results[0].view_widget()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"\n",
"# View the third fitting result\n",
"fitting_results[2].view_widget()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"\n",
"# View the second fitting result\n",
"fitting_results[1].view_widget()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Underwhelming results?\n",
"\n",
" - State-of-the-art use **features**\n",
" - **Features** are usually hand-engineered to be invariant to common image problems\n",
" - Illumination changes\n",
" - Large pose variation\n",
" - Occlusions"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Features inside Menpo\n",
" - Lets look at a popular feature in the literature\n",
" - **Histogram of Oriented Gradients** (HOGs)\n",
" - HOG models take quite a long time to train **(~ 10-15 minutes)**\n",
" - Therefore..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"from seminar import blue_peter\n",
"\n",
"hog_aam = blue_peter()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 41,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from menpo.fitmultilevel.aam import LucasKanadeAAMFitter\n",
"\n",
"# Build the AAM from the factory as before\n",
"fitter = LucasKanadeAAMFitter(hog_aam, \n",
" n_shape=[15, 15, 15],\n",
" n_appearance=200)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"import menpo.io as mio\n",
"\n",
"# Load a built in asset\n",
"breaking_bad = mio.import_builtin_asset.breakingbad_jpg()\n",
"# Crop it for memory purposes\n",
"breaking_bad.crop_to_landmarks_proportion_inplace(0.5)\n",
"braeking_bad = breaking_bad.as_greyscale()\n",
"\n",
"# View it!\n",
"breaking_bad.view_widget() "
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"# Let's make these 'random' perturbations deterministic!\n",
"np.random.seed(1)\n",
"\n",
"gt_shape = breaking_bad.landmarks['PTS'].lms\n",
" \n",
"# Generate initialization landmarks by perturbing\n",
"# the ground truth with some noise!\n",
"initial_shape = fitter.perturb_shape(gt_shape)\n",
"\n",
"# Fit image\n",
"fitting_result = fitter.fit(breaking_bad, \n",
" initial_shape, \n",
" gt_shape=gt_shape)\n",
"\n",
"# Print fitting result!\n",
"print(fitting_result)"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline\n",
"\n",
"# View the fitting result\n",
"fitting_result.view_widget()"
],
"language": "python",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What else does Menpo provide?\n",
" - Menpo implements \n",
" - Active Appearance Models (AAMs)\n",
" - Constrained Local Models (CLMs)\n",
" - Regression techniques\n",
" - Supervised Descent Method (SDM)\n",
" - Menpo is a great **playground for image based research**\n",
" - Image warping\n",
" - Powerful transformations (Piecewise Affine, Thin-Plate Splines, ...)\n",
" - Importing images and 3D meshes\n",
" - Advanced visualizations of images and meshes\n",
" - Mesh rasterization\n",
" - All objects (images, meshes) have **landmarks**\n",
" - Automatically imported\n",
" - Automatically transformed (warped, rotated etc.)\n",
" - ..."
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 49,
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# What is the current state-of-the-art?\n",
" - Although Menpo implements many key techniques from the literature, research moves quickly\n",
" - Current state-of-the-art are regression based techiques\n",
" \n",
"```\n",
"One Millisecond Face Alignment with an Ensemble of Regression Trees.\n",
"Vahid Kazemi and Josephine Sullivan\n",
"CVPR 2014\n",
"```\n",
"\n",
"```\n",
"Face Alignment at 3000 FPS via Regressing Local Binary Features.\n",
"Shaoqing Ren, Xudong Cao, Yichen Wei and Jian Sun\n",
"CVPR 2014\n",
"```"
]
},
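{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both papers are variants of *cascaded regression*: starting from an initial shape, each stage extracts features around the current landmarks and applies a learnt regressor to predict a shape update. A bare-bones sketch of that loop follows; `extract_features` and the learnt `regressors` are hypothetical placeholders, not the methods from either paper."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"def cascaded_regression_fit(image, initial_shape, regressors, extract_features):\n",
"    # `regressors` is a list of learnt matrices, one per cascade stage;\n",
"    # `extract_features` computes a descriptor around the current landmarks\n",
"    shape = initial_shape.copy()\n",
"    for R in regressors:\n",
"        features = extract_features(image, shape)   # e.g. pixel differences, LBF, ...\n",
"        delta = R.dot(features)                     # learnt linear update\n",
"        shape = shape + delta.reshape(shape.shape)  # move the landmarks\n",
"    return shape"
],
"language": "python",
"metadata": {},
"outputs": []
},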
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 55,
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" - These techniques are **fast** and very accurate!\n",
" - They still struggle with occluded images (hands in front of faces)\n",
" - AAMs still excel in this area"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"frag_helper": "fragment_end",
"frag_number": 55,
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Thank you for listening\n",
"# Any questions?"
]
}
],
"metadata": {}
}
]
}
# seminar.py -- helper module imported in the notebook via `from seminar import blue_peter`
def blue_peter():
    import menpo.io as mio
    import h5it
    from menpo.visualize.image import glyph
    from menpo.feature import hog
    import matplotlib.pyplot as plt

    # Loading the pre-built HOG AAM
    import cPickle as pickle
    with open('/Users/pts08/hog_lfpw_aam.pkl', 'rb') as f:
        hog_aam = pickle.load(f)
    #hog_aam = h5it.load('/Users/pts08/sparse_hog.hdf5')
    print('Here is one I made earlier!')

    # Show the Blue Peter image alongside a glyph visualisation of its HOG features
    bp = mio.import_image('blue_peter.jpg')
    hog_blue_peter = hog(bp)

    plt.figure()

    plt.subplot(121)
    bp.view()
    plt.axis('off')
    plt.gcf().set_size_inches(11, 11)
    plt.title('RGB')

    plt.subplot(122)
    glyph(hog_blue_peter).view()
    plt.axis('off')
    plt.gcf().set_size_inches(11, 11)
    plt.title('HOG')

    return hog_aam