Google isn’t just about text anymore. The search giant is making great strides in understanding and indexing images. Google’s GoogLeNet project was one of the winning teams in the 2014 ImageNet large-scale visual recognition challenge (ILSVRC), an annual competition to measure improvements in machine visual technology.
Do you like this picture of the dog in a hat? You glance at it, and you understand what it is. From a computing perspective, “understanding” images comes in many guises, but at its simplest, it means detecting, locating and classifying objects in the image.
For example, the photo on the left contains two objects: a hat and a dog. And from the looks of things, he’s on holiday in South America.
What underpins the ability to index images are neural networks, crunching through huge amounts of data, looking for common elements and patterns. These networks are attempting to mimic the way our brains work: filtering out the meaningless noise in what we see, and focusing on the meaningful signals.
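To get a flavour of that pattern-hunting, here is a toy sketch of the basic operation inside convolutional networks like GoogLeNet: sliding a small kernel over an image and responding strongly wherever a particular pattern appears. The image and the hand-picked vertical-edge kernel below are illustrative stand-ins; in a real network the filters are learned from data, not written by hand.

```python
# Illustrative sketch: convolutional filters slide small kernels over
# an image, responding strongly where a pattern appears. Here the
# pattern is a hand-crafted vertical edge; real networks learn theirs.

def convolve2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation) on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(kernel[a][b] * image[i + a][j + b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 5x5 image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 9, 9] for _ in range(5)]

# A simple vertical-edge kernel: negative weights left, positive right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

response = convolve2d(image, kernel)
for row in response:
    print(row)  # each row is [0, 27, 27]: strongest near the edge
```

The filter's output is large only where dark meets bright, which is exactly the "meaningful signal versus noise" filtering described above, at its very smallest scale.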
The question is: with all the computing expertise in the world, how hard is it to figure out that this is a dog in a hat?
The answer: very hard.
It’s Time to Test Your Image Indexing Skills: What’s This?
As part of the image classification task, Andrej Karpathy created a website he calls the image labelling interface. It provides a benchmark, comparing how well humans index images against the accuracy of machine indexing. In short, it tests how well we humans do compared to the computers at recognising and labelling images.
I labelled the above image as bread, whilst GoogLeNet thought it was either bread, a thimble, or velvet. In the end, we were both wrong. This is a chest.
How Does Google Handle Images at the Moment?
Historically, Google has referred to the textual information surrounding an image to gain an understanding of its content. Optimising images involves signals including the keywords in the text, the ALT tags, the nature of the links connecting to the image, and technical considerations like image sitemaps.
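The image sitemap signal, for instance, is just an XML file that tells Google which images live on which pages. Here is a minimal sketch of building one entry with Python's standard library; the namespaces are the standard sitemap and Google image-sitemap ones, while the page and image URLs are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Minimal sketch: one image-sitemap entry. The URLs are placeholders;
# the namespaces are the sitemap protocol and Google's image extension.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)

urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = "https://example.com/dog-in-hat"
image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = "https://example.com/photos/dog-in-hat.jpg"

xml = ET.tostring(urlset, encoding="unicode")
print(xml)
```

The point is how indirect this all is: the sitemap, like the ALT tag and surrounding keywords, describes the image from the outside, without the machine ever "looking" at the pixels.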
Over the years, Google has experimented with improving the quality of its image search results with games like Google Image Labeler. Sadly no longer available, the game paired players up over the web and had them simultaneously suggest keywords to describe a random image.
Most recently, Google has made available Google Reverse Image Lookup, which uses image identification technology, rather than keywords, to find where images are being used on the web. This tool can be used to find similar images, and to discover how images have been modified. See also: TinEye
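The core idea behind this kind of content-based matching can be sketched with a perceptual hash: boil an image down to a compact fingerprint, then compare fingerprints by Hamming distance. The toy greyscale grids and the simple average-hash scheme below are my own illustration of the principle; production systems use far richer image features than this.

```python
# Illustrative sketch of content-based image matching: reduce an image
# to a perceptual "average hash" and compare hashes by Hamming distance.

def average_hash(pixels):
    """Hash a greyscale image (list of rows) to a tuple of bits:
    1 where a pixel is above the mean brightness, else 0."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(h1, h2):
    """Count the bit positions where two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))

original = [[10, 10, 200, 200],
            [10, 10, 200, 200],
            [10, 10, 200, 200],
            [10, 10, 200, 200]]

# The "same" image, slightly brightened: the hash barely changes.
brightened = [[p + 20 for p in row] for row in original]

# A very different (checkerboard) image.
other = [[200, 10, 200, 10],
         [10, 200, 10, 200],
         [200, 10, 200, 10],
         [10, 200, 10, 200]]

print(hamming(average_hash(original), average_hash(brightened)))  # 0
print(hamming(average_hash(original), average_hash(other)))       # 8
```

Because the hash survives modifications like brightening, matching hashes can find copies and near-copies of an image across the web, which is exactly the job reverse image lookup does.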
Of Course, It Isn’t Just Google
Why is the understanding and indexing of images important to companies like Google?
Obviously it will improve Google’s image search facility. Perhaps less obvious is how it will help Google to better understand the content contained within YouTube videos. And thinking a bit more left field, better image processing could be used in other Google applications, such as Google’s self driving car.
Thinking even further left field, Yahoo Labs have created an algorithm that can tell if a portrait is beautiful or not.
Understanding images is big business. Facebook has its Artificial Intelligence Research lab. Baidu, the Chinese search giant, has an image recognition system that they claim is better than Google’s, and close to the human level. And others are jumping on the image recognition bandwagon. Imagine how Twitter will make use of its acquisition of the image search and recognition startup, MadBits.
That all this research is taking place suggests that there is serious money to be made from intelligent image indexing.