changx03/notes_DeepCAPTCHA.md

## notes_DeepCAPTCHA.md

      
    Notes for No Bot Expects the DeepCAPTCHA


Authors: Margarita Osadchy, Julio Hernandez-Castro, Stuart Gibson, Orr Dunkelman, Daniel Pérez-Cabo
DeepCAPTCHA: a new secure CAPTCHA scheme based on adversarial examples
CAPTCHA: Completely Automated Public Turing tests to tell Computers and Humans Apart

Adversarial examples: adding a small amount of noise to an organic input causes the targeted machine learning model to misclassify it.
The vulnerability exists in a wide range of machine learning algorithms and is not limited to neural networks. Examples: linear models (logistic regression, softmax regression, SVM), decision trees, KNN, RL (untargeted), DNNs (rectified linear unit, maxout, sigmoid, LSTM).
Property: adversarial examples are difficult for DL and easy for humans.
Immutable adversarial noise cannot be removed by a preprocessing algorithm.
Introduction to CAPTCHA


Motivation for CAPTCHA: mitigating the impact of Distributed Denial of Service (DDoS) attacks, slowing down automatic registration of free email addresses or spam posting to forums, defending against automatic scraping of web contents
1st generation: uses deformation of written text. Less popular due to its susceptibility to segmentation attacks.
Increasing the distortion level by overlapping characters resists segmentation, but the result is frequently unreadable by humans.
2nd generation: Image based CAPTCHA: more resilient to automated attacks
3rd generation: Google's reCAPTCHA v3. Related reading: "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks" (Ian Goodfellow, 2014).

Adversarial examples


Challenge 1: readability for humans
Challenge 2: localized noise can be easily removed by spatial filters
"No solution for a large scale (+1000 categories) multi-class recognition problem that is robust to adversarial examples."
Immutable Adversarial Noise (IAN)
Median filter of size 5x5 was the most successful in reverting the noise
Fast Gradient Sign Method (FGSM)
noise magnitude epsilon
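
As a minimal sketch of the FGSM step noted above, the adversarial input is the original input plus epsilon times the sign of the loss gradient with respect to the input. The example assumes a toy linear softmax classifier rather than the deep network used in the paper; `W`, `b`, and all function names are illustrative assumptions. The `median_filter` helper illustrates the kind of 5x5 spatial filter that removes localized noise.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(W, b, x, y):
    # Loss J of the toy linear model for input x and label y.
    return -np.log(softmax(W @ x + b)[y])

def fgsm(W, b, x, y, eps):
    # x_adv = clip(x + eps * sign(dJ/dx)), with pixel values kept in [0, 1].
    p = softmax(W @ x + b)
    p[y] -= 1.0                 # dJ/dz = softmax(z) - onehot(y)
    grad_x = W.T @ p            # chain rule through z = W x + b
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def median_filter(img, size=5):
    # Replace each pixel by the median of its size x size neighbourhood.
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

# Toy data (assumption): a random 3-class model on a flattened 8x8 "image".
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 64))
b = np.zeros(3)
x = rng.uniform(0.2, 0.8, size=64)
y = int(np.argmax(W @ x + b))   # use the model's own prediction as the label

eps = 0.1
x_adv = fgsm(W, b, x, y, eps)
loss_before = cross_entropy(W, b, x, y)
loss_after = cross_entropy(W, b, x_adv, y)

# A single bright pixel is the simplest case of localized noise.
spike = np.zeros((9, 9))
spike[4, 4] = 1.0
filtered = median_filter(spike, size=5)
```

Every pixel moves by at most `eps`, so the perturbation stays small while the loss on the original label goes up. The median filter wipes out the isolated spike, which is why noise that survives such filtering is the interesting case for a CAPTCHA.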

Dataset - ILSVRC-2012


Validation and test data consist of 150,000 photographs across 1000 object categories.
Training data is a subset of ImageNet containing the 1000 categories and 1.2 million images.

Related paper: "The End is Nigh: Generic Solving of Text-based CAPTCHAs" (2014)


Elie Bursztein from Google
Text-based CAPTCHAs were solved 2 years prior to the DeepCAPTCHA paper, but this paper did not provide a defense.
Previous approach: pre-processing -> segmentation -> post-segmentation -> recognition -> post-recognition
Weakness: segmentation requires finding an invariant to exploit, e.g. colour, cluster size, shape, number of characters.
New approach: no need for manual segmentation.
Cut-points detection -> Slice -> Scorer -> Arbiter
Build a large graph, traverse all possible character sequences
Reinforcement learning
Heuristic: reduce the number of cut-points; two-way: left-to-right and right-to-left.
How about lines? Lines have a known shape. Train the classifier to recognise them as an empty character.
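
The cut-points -> slice -> scorer -> arbiter idea can be sketched as a best-path search over a graph whose nodes are cut-points and whose edges are candidate slices. This is a simplified illustration, not the paper's algorithm: the scorer is a hypothetical lookup table standing in for a trained classifier, and `best_sequence`, `toy_scorer`, `TOY_SCORES`, and `max_width` are made-up names.

```python
from typing import Callable, List, Tuple

# A scorer maps a slice [left, right) to (character, log-confidence).
Scorer = Callable[[int, int], Tuple[str, float]]

def best_sequence(cuts: List[int], scorer: Scorer, max_width: int) -> Tuple[str, float]:
    # Arbiter sketch: dynamic programming over cut-points, keeping the
    # highest-scoring character sequence that ends at each cut.
    neg_inf = float("-inf")
    best = [(neg_inf, "")] * len(cuts)
    best[0] = (0.0, "")
    for j in range(1, len(cuts)):
        for i in range(j):
            # Skip unreachable cuts and slices too wide to be one character.
            if best[i][0] == neg_inf or cuts[j] - cuts[i] > max_width:
                continue
            char, log_conf = scorer(cuts[i], cuts[j])
            total = best[i][0] + log_conf
            if total > best[j][0]:
                best[j] = (total, best[i][1] + char)
    total, text = best[-1]
    return text, total

# Hypothetical scores (assumption): a real scorer would be a trained classifier.
# The slice (20, 25) plays the role of a line, scored as an empty character.
TOY_SCORES = {
    (0, 10): ("a", -0.1),
    (10, 20): ("b", -0.2),
    (20, 25): ("", -0.1),
    (25, 35): ("c", -0.1),
    (0, 20): ("m", -1.5),
}

def toy_scorer(left: int, right: int) -> Tuple[str, float]:
    # Unknown slices get a low-confidence placeholder character.
    return TOY_SCORES.get((left, right), ("?", -5.0))

text, score = best_sequence([0, 10, 20, 25, 35], toy_scorer, max_width=20)
```

Fewer cut-points mean fewer edges to score, which is the point of the heuristic in the notes. Returning an empty character for a line slice lets the path pass through it without adding to the decoded text.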