Innovative technology could revolutionize Web searches, management of archived digital
images
Scientists at Xerox Research Center Europe (XRCE) in Grenoble, France, have developed a powerful
new system for "recognizing" generic, everyday objects in digital images, such as a
photograph of a car, and categorizing them.
When used in document and content management systems, this breakthrough technology would
allow people to filter and search for images as well as text. It would enable efficient
storage and management of electronic images and could significantly extend Web searching
capabilities, which are currently based upon text only.
"Although there has been phenomenal growth in the use of digital cameras and images, the
use of technology to categorize image content is in its infancy," said Christopher Dance,
senior scientist and image processing manager at the Grenoble center. "It is currently only
used in applications such as face recognition in the security industry."
However, scientists at XRCE have developed a generic technique for the identification of
images, allowing the categorization of everyday image content types such as buildings,
animals, airplanes, books and faces. It is the first generic image categorization
technology that is robust, fast and simple to use.
The technology, which results from fundamental research at XRCE, melds the lab's expertise
in image processing, computer vision and machine learning. Image categorization is
analogous to text categorization, which looks at the content of a document to find
keywords. To categorize images, Xerox identifies the key features of an object, which
it calls "patches." The system works by training a computer to map the patches and to
classify sets of these patches. This classification in effect assigns an image to a
particular category or categories.
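The idea of mapping patches and classifying sets of them can be illustrated with a minimal sketch. This is not Xerox's implementation: the two-number patch descriptors, the CODEBOOK entries, the training data and the nearest-average classifier are all hypothetical stand-ins chosen to keep the example small.

```python
from collections import Counter
from math import dist

# Hypothetical codebook: each entry is a prototype patch descriptor.
# Real descriptors would be computed from image pixels; these are made up.
CODEBOOK = {
    "round_dark": (0.0, 0.0),
    "bright_spot": (1.0, 0.0),
    "flat_glassy": (0.0, 1.0),
    "textured": (1.0, 1.0),
}


def map_patch(descriptor):
    """Map one patch descriptor to the name of the nearest codebook entry."""
    return min(CODEBOOK, key=lambda name: dist(CODEBOOK[name], descriptor))


def patch_histogram(descriptors):
    """Summarize an image as normalized counts of its mapped patches."""
    if not descriptors:
        raise ValueError("an image needs at least one patch")
    counts = Counter(map_patch(d) for d in descriptors)
    total = sum(counts.values())
    return [counts[name] / total for name in CODEBOOK]


def train(labeled_images):
    """Average the patch histograms of each category's training images."""
    by_label = {}
    for label, descriptors in labeled_images:
        by_label.setdefault(label, []).append(patch_histogram(descriptors))
    return {
        label: [sum(column) / len(hists) for column in zip(*hists)]
        for label, hists in by_label.items()
    }


def classify(model, descriptors):
    """Assign the category whose average histogram is closest."""
    hist = patch_histogram(descriptors)
    return min(model, key=lambda label: dist(model[label], hist))


# Invented training images: (category, list of patch descriptors).
TRAINING = [
    ("car", [(0.1, 0.0), (0.9, 0.1), (0.1, 0.9), (0.0, 0.1)]),
    ("building", [(0.9, 0.9), (1.0, 1.0), (0.1, 1.0)]),
]

model = train(TRAINING)
print(classify(model, [(0.0, 0.1), (1.0, 0.1), (0.1, 0.8)]))
```

In this sketch an image is reduced to a histogram of patch types, much as a text categorizer reduces a document to counts of keywords, and the classifier works on that summary rather than on raw pixels.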
The scientists had to solve some knotty problems, Dance said. For example, early versions
of the system could confuse an image of a stack of tires with an image of a car, because
both contain some of the same patches. To overcome this, the program examines
key patches in the context of other areas of the picture. In this example, a stack of
tires would not be confused with a car because the machine would recognize that other key
patches, such as headlights or windows, were missing.
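The tires-versus-car case can be made concrete with a small sketch. It assumes, purely for illustration, that each category comes with a set of patch kinds expected to appear together, and it scores a picture by how well the patch kinds found in it match that set. The EXPECTED table and the overlap score are hypothetical and are not taken from the Xerox system.

```python
# Hypothetical patch kinds each category is expected to show together.
EXPECTED = {
    "car": {"wheel", "headlight", "window"},
    "tire_stack": {"wheel"},
}


def score(category, found):
    """Overlap between found and expected patch kinds (Jaccard index)."""
    expected = EXPECTED[category]
    return len(expected & found) / len(expected | found)


def categorize(found):
    """Pick the category whose expected patch kinds best match the picture."""
    return max(EXPECTED, key=lambda category: score(category, found))


# Wheels alone are a poor match for a car, since headlights and windows
# are missing, but a perfect match for a stack of tires.
print(categorize({"wheel"}))
print(categorize({"wheel", "headlight", "window"}))
```

Because the score penalizes expected patches that are absent, seeing wheel-like patches is not enough on its own to call a picture a car.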
"Images play a key role in most documents, but in the past document repositories have
only been able to search for and categorize text," Dance said. In addition to developing
this software for different applications, Xerox will continue to extend its categorizer
to handle more visual categories and to incorporate difficult cases where the object of
interest occupies only a small fraction of the field of view, he added.