Wednesday, January 29, 2014

Averaging Congress Correctly

Today I saw another interesting article from the Huffington Post that averaged the faces of all the members of Congress together. You can find it here.

Unfortunately, it's not a good average, for several reasons I'll explain below. So I created my own, shown here:

I noticed a couple of things about the Huffington Post image. It was a really cool idea, but the result was extremely blurry, so I investigated. It seems the website the creator got the images from has over 800 images for members of the House (I'll let you figure out what is wrong with that). It turns out the set had a bunch of duplicates and black-and-white images mixed in. Also, none of the images were aligned. When averaging images, especially faces, it is always important to align them first.
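To see why alignment matters, here is a toy sketch in plain Python. It is not the code used for the image above; the `edge_image` helper and the tiny 8-pixel-wide "faces" are made up for illustration. Each toy image is a single dark-to-bright edge, and the pixelwise mean stays sharp only when the edges line up.

```python
def mean_image(images):
    """Pixelwise mean of equally sized grayscale images (lists of rows)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[y][x] for img in images) / n for x in range(width)]
        for y in range(height)
    ]


def edge_image(edge_x, width=8, height=2):
    """Hypothetical toy image: dark left of edge_x, bright from edge_x on."""
    return [[255 if x >= edge_x else 0 for x in range(width)]
            for _ in range(height)]


# Three images whose edge sits at the same column: the mean keeps a hard edge.
aligned = mean_image([edge_image(4) for _ in range(3)])

# The same edge shifted by one pixel each way: the mean smears into a ramp.
misaligned = mean_image([edge_image(x) for x in (3, 4, 5)])

print(aligned[0])
print(misaligned[0])
```

The misaligned mean contains in-between gray values around the edge, which is the same effect that shows up as blur when hundreds of unaligned portraits are averaged.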

So I wrote a quick crawler in bash to get just the images from the Wikipedia page for Congress. Then I wrote some C++ code using OpenCV and my own modification of flandmark[1] to align the facial images. Once they were aligned, I averaged them.
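The alignment step can be illustrated with a small Python sketch. This is an assumption about one common approach, not a transcription of the C++ code: given two eye landmarks per face (the kind of points a landmark detector provides), it computes the rotation, uniform scale, and translation that move them onto fixed target positions. The function names and the target coordinates are hypothetical.

```python
def similarity_from_eyes(left_eye, right_eye, target_left, target_right):
    """Return (a, b) such that z -> a*z + b maps the two eye points onto the
    two target points. Points are (x, y) tuples treated as complex numbers."""
    p1, p2 = complex(*left_eye), complex(*right_eye)
    q1, q2 = complex(*target_left), complex(*target_right)
    if p1 == p2:
        raise ValueError("eye landmarks must be distinct")
    a = (q2 - q1) / (p2 - p1)  # rotation and uniform scale
    b = q1 - a * p1            # translation
    return a, b


def apply_similarity(transform, point):
    """Map an (x, y) point through the transform returned above."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Eyes found at (10, 20) and (30, 20); hypothetical targets (30, 40), (70, 40).
t = similarity_from_eyes((10, 20), (30, 20), (30, 40), (70, 40))
print(apply_similarity(t, (20, 30)))
```

With OpenCV, the same transform could be written as the 2x3 matrix `[[a.real, -a.imag, b.real], [a.imag, a.real, b.imag]]` and passed to `warpAffine` to resample each portrait before averaging.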

You can find the crawler code, the original and aligned images, and the alignment and averaging code on my GitHub. I used CMake, OpenCV, and Boost, in case you want to build and test it yourself.

Feel free to use this image as long as you cite this page.

I was asked who was closest to and furthest from the mean. The results are as follows:
Closest to the mean (closest first):

Stephen Fincher
John Shimkus
Xavier Becerra

Furthest from the mean (furthest first):

Terri A. Sewell
Corrine Brown
Betty McCollum
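
For anyone curious how such a ranking can be produced, here is a minimal Python sketch. The distance measure is an assumption (plain Euclidean distance between each aligned image and the mean image, both flattened to lists of pixel values), and the names and pixel values in the example are placeholders, not real data.

```python
import math


def rank_by_distance_to_mean(faces):
    """faces maps a name to a flat list of pixel values (all the same length).
    Returns the names sorted from closest to the mean to furthest from it."""
    vectors = list(faces.values())
    n = len(vectors)
    # Pixelwise mean across all faces.
    mean = [sum(column) / n for column in zip(*vectors)]
    return sorted(faces, key=lambda name: math.dist(faces[name], mean))


# Placeholder two-pixel "images" just to show the call.
example = {"face_a": [0, 0], "face_b": [10, 10], "face_c": [4, 4]}
ranking = rank_by_distance_to_mean(example)
print("closest:", ranking[0], "furthest:", ranking[-1])
```

The first few entries of the returned list are the faces closest to the mean, and the last few are the furthest.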

Consider donating to further my tinkering.

Places you can find me

[1] M. Uricar, V. Franc and V. Hlavac, Detector of Facial Landmarks Learned by the Structured Output SVM, VISAPP '12: Proceedings of the 7th International Conference on Computer Vision Theory and Applications, 2012.

Wednesday, January 22, 2014

Hacking Snapchat's people verification in less than 100 lines

I woke up this morning and saw this article detailing Snapchat's new people verification system.

(Credit: Screenshot by Lance Whitney/CNET)
You may have seen it, but if not, I will summarize it for you. They basically have you choose each image that contains the Snapchat ghost to prove you are a person. It is kind of like a less annoying CAPTCHA.

The problem with this is that the Snapchat ghost is very particular. You could even call it a template. For those of you familiar with template matching (what they are asking you to do to verify your humanity), it is one of the easier tasks in computer vision.

This is an incredibly bad way to verify someone is a person because it is such an easy problem for a computer to solve.

After I read this, I spent around 30 minutes writing some code to make a computer do this. There are many ways of solving this problem. HOG probably would have been best, or even color thresholding and PCA for blob recognition, but those would take more time and I'm lazy (read: efficient). I ended up using OpenCV and going with simple thresholding, SURF keypoints, and FLANN matching, plus a uniqueness test to check that many keypoints in one image weren't all being matched to a single keypoint in the other.
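As a rough idea of what "simple thresholding" means here, the sketch below builds a binary mask of pixels close to a target color. It is plain Python for illustration, not the OpenCV code; the target color, the tolerance, and the `color_mask` name are assumptions.

```python
def color_mask(image, target, tolerance=30):
    """image is a list of rows of (r, g, b) tuples. Returns rows of 0/1 where
    1 marks a pixel whose every channel is within tolerance of target."""
    return [
        [
            1 if all(abs(c - t) <= tolerance for c, t in zip(pixel, target)) else 0
            for pixel in row
        ]
        for row in image
    ]


# One row: pure white, slightly off-white, and yellow, thresholded against white.
row = [[(255, 255, 255), (250, 240, 245), (255, 252, 0)]]
print(color_mask(row, target=(255, 255, 255)))
```

Only the regions that survive the mask need to be considered when looking for keypoints, which cuts down on background clutter.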

First, I extract the individual images from the slide above, then I threshold them and the ghost template to find objects of the ghost's color. Next, I extract feature points and descriptors from the test image and the template using SURF and match them using FLANN. I keep only the "best" matches according to a distance metric, then check those matches for uniqueness to verify that one feature in the template isn't matching most of the test features. If the uniqueness is high enough and enough features are found, we call it a ghost.
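The final decision step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code from the repository: matches are given as `(test_index, template_index, distance)` tuples (as you could extract from a FLANN matcher's output), "best" means within a multiple of the smallest distance, and the three thresholds are made-up values.

```python
def looks_like_ghost(matches, distance_factor=2.0, min_matches=4,
                     min_uniqueness=0.5):
    """matches: list of (test_index, template_index, distance) tuples.
    Returns True if enough good matches exist and they are spread over
    enough distinct template keypoints."""
    if not matches:
        return False
    best = min(distance for _, _, distance in matches)
    # Keep matches close to the best one; the 0.02 floor avoids a zero cutoff.
    cutoff = max(distance_factor * best, 0.02)
    good = [m for m in matches if m[2] <= cutoff]
    # Uniqueness: share of good matches that hit distinct template keypoints.
    distinct_template = {template_index for _, template_index, _ in good}
    uniqueness = len(distinct_template) / len(good)
    return len(good) >= min_matches and uniqueness >= min_uniqueness


spread_out = [(0, 0, 0.10), (1, 1, 0.12), (2, 2, 0.15), (3, 3, 0.18), (4, 4, 0.90)]
collapsed = [(0, 7, 0.10), (1, 7, 0.11), (2, 7, 0.12), (3, 7, 0.13), (4, 7, 0.14)]
print(looks_like_ghost(spread_out), looks_like_ghost(collapsed))
```

The `collapsed` case is the failure mode the uniqueness test guards against: plenty of low-distance matches, but all of them pointing at the same template keypoint.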

With very little effort, my code was able to "find the ghost" in the above example with 100% accuracy. I'm not saying it is perfect; far from it. I'm just saying that if it takes someone less than an hour to train a computer to break an example of your human verification system, you are doing something wrong. There are a ton of ways to do this using computer vision, all of them quick and effective. It's a numbers game with computers, and Snapchat's verification system is losing.

Below is an example of the output of the code:
Found a Ghost!

And you can find the code on my GitHub here:

Note: This only relates to Snapchat's new CAPTCHA system. They implemented it because of past hacks by people who put way more time and thought into a harder problem. Those can be found here:
Graham Smith -
Adam Caudill -
Gibson Security -
