"Sometimes a picture really is worth a thousand words" - Shailesh Nalawadi, Product Manager for Google Goggles.
Google’s new Goggles project allows users to gain access to information about an item or location simply by pointing their phone at it. The phone can connect to reviews of a restaurant, the history of a landmark, or price comparisons for a book – all without the user typing any text.
The technology works in conjunction with a mobile phone camera; the user takes a photograph of an object and the application scans it, comparing elements of that digital image against its database of images. When it finds a match, Google tells the user the name of what they’re looking at, and provides a list of results linking through to the relevant web pages and news stories.
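Google has not published the matching algorithm Goggles uses, but the idea of comparing a photo against a database of known images can be illustrated with a much simpler technique: a perceptual "average hash", where each bit records whether a pixel is brighter than the image's mean, and the closest database entry (by differing bits) wins. The tiny 3×3 "images" and database below are invented for demonstration.

```python
def average_hash(pixels):
    """Hash a grayscale image (list of 0-255 values): each bit records
    whether a pixel is brighter than the image's overall mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming_distance(h1, h2):
    """Count the differing bits between two hashes."""
    return sum(a != b for a, b in zip(h1, h2))

def best_match(photo, database, max_distance=2):
    """Return the database entry whose hash is closest to the photo's,
    or None if nothing is within max_distance bits."""
    photo_hash = average_hash(photo)
    best_name, best_dist = None, max_distance + 1
    for name, pixels in database.items():
        dist = hamming_distance(photo_hash, average_hash(pixels))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name

# Hypothetical 3x3 grayscale reference "images"
database = {
    "Mona Lisa": [200, 180, 40, 60, 220, 30, 90, 210, 50],
    "Eiffel Tower": [20, 240, 20, 30, 230, 25, 40, 250, 35],
}
# A noisy re-photograph of the first image: brightness shifts slightly,
# but the bright/dark pattern - and hence the hash - survives.
snapshot = [195, 175, 45, 55, 215, 35, 85, 205, 55]
print(best_match(snapshot, database))  # → Mona Lisa
```

Real systems use far more robust features than this, but the principle is the same: reduce each image to a compact fingerprint and search for the nearest one.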
The results can then be saved as a history, allowing the user to refer back to these links of interest. The results are tailored to each object: if the user takes a photo of an artwork, the results include the artist's biography, whereas for a landmark the phone provides historical background information.
Google Goggles also uses optical character recognition to identify text, allowing items such as business cards to be snapped and scanned so the user can place a call or add the details as a contact in their phone directory. Some results don’t even require a photo to be taken, thanks to the integration of GPS, augmented reality and digital compass technology. Simply pointing the phone at a location (a business or shop, for example) allows the app to place a button with the company name at the bottom of the screen. This can then be touched to load information from a web search.
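Once optical character recognition has turned a business-card photo into text, the remaining step is extracting fields such as the name and phone number so a call can be placed or a contact saved. The sketch below shows that parsing step only, with an invented card; real OCR output would come from an engine such as the one Goggles runs server-side, and the "name on the first line" rule is just a common heuristic, not Google's method.

```python
import re

def parse_card(ocr_text):
    """Extract a contact name and phone number from OCR'd card text.
    Assumes the name appears on the first non-empty line (a heuristic)."""
    lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
    name = lines[0] if lines else None
    # A loose phone pattern: optional +, then digits with spaces/dots/dashes.
    phone_match = re.search(r"\+?\d[\d\s().-]{7,}\d", ocr_text)
    phone = phone_match.group(0) if phone_match else None
    return {"name": name, "phone": phone}

# Invented card text standing in for real OCR output
card = """Shailesh Nalawadi
Product Manager
+1 650 555 0100
"""
print(parse_card(card))  # → {'name': 'Shailesh Nalawadi', 'phone': '+1 650 555 0100'}
```

The regex is deliberately permissive because OCR output is noisy; a production parser would normalise the number and handle multiple formats.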
Google Goggles demonstrates the potential for computer vision technology, but it is not at its full strength yet (hence it is being released by Google Labs). At the moment users will be able to look up things like CD, DVD and book covers, wines, barcodes, businesses, artworks, logos and landmarks with great success, but other objects will not work so well. Cars, animals and food still need further development before they can be reliably recognised from a photograph. Despite the immaturity of the technology, Google states that Goggles can recognise tens of millions of objects and places.
Google also claims that the technology has the potential for face recognition. So in theory a mobile phone could provide personal information on anyone in its viewfinder. Clearly this raises some pretty major privacy issues – and there are currently no plans to release this feature of Goggles. As Vic Gundotra, Google’s Vice-President of Engineering, has said, “We still want to work on the issues of user opt-in and control. We have the technology to do the underlying face recognition, but we decided to delay that until safeguards are in place.”
With this new technology comes exciting prospects for education. Visual search allows for a more interactive and creative form of learning; education can be taken outside the classroom without the need to carry textbooks for reference. And the fact that these searches can be stored in a history means this knowledge can be retained and referred back to later.
For example, a class could visit an art gallery on a school trip and simply take photos of the exhibits without having to make a note of the artist. This allows for a liberated experience not tied to pens and paper. Web links generated by these photos would allow a student to purchase a book (or e-book) about the artist before they have even left the gallery.
This mobile learning style could engender a sense of adventure and exploration while still linking learners to reference material. Classes could stroll around a new city, capturing images to discover the history of buildings and landmarks. Google Labs state in their accompanying video that they envisage Google Goggles being able to identify the species of a plant from a leaf. An added bonus is that students need not worry about spelling mistakes or the phrasing of search queries in order to get the results they require.
Neither the technology behind the application nor the concept is entirely new. Quick Response (QR) codes are two-dimensional barcodes which link to online content when the user takes a photo of one on their camera phone. A simple piece of software enables the phone to read the URL encoded within the QR code, and the user is taken directly to that site in the mobile browser.
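Decoding the barcode itself requires a dedicated QR library, but the hand-off that follows is straightforward: the phone checks that the decoded payload really is a URL before passing it to the mobile browser. The sketch below shows that validation step with Python's standard library; the payloads are invented examples.

```python
from urllib.parse import urlparse

def url_from_qr_payload(payload):
    """Return the URL if the decoded QR payload looks like one, else None.
    Only http/https links are handed to the browser in this sketch."""
    parsed = urlparse(payload.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return parsed.geturl()
    return None

# A QR code on a restaurant poster might decode to a link to its menu...
print(url_from_qr_payload("http://www.example.com/menu"))  # → http://www.example.com/menu
# ...while a code holding plain text would not open the browser.
print(url_from_qr_payload("Just some plain text"))  # → None
```

QR payloads can also carry plain text, contact details or phone numbers, which is why the check matters before launching the browser.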
Image-based searching isn't completely new either. Prior attempts at the technology include Nokia's Point and Find and Amazon’s image recognition search released in October. The most similar product on the market is an application called IQ Engines. But this has a much more commercial focus – connecting mobile users with reviews, prices and purchase links. It remains to be seen whether Google can bring the technology into the mainstream.