Xerox Scientists Invent Software That Automatically Indexes, Categorizes, Routes Electronic Documents
Scientists at Xerox Corporation have invented powerful software that's clever enough
to "read" an electronic document, decide how it should be classified by subject, then route
it to the right person's e-mail address or online document management system, all
completely automatically.
The software, which is a categorizing tool, is intended to help businesses keep their
e-document collections orderly and easily accessible, and it is available for licensing
from Xerox.
"A misshelved book in a library might as well be lost. It's the same with documents that
haven't been properly categorized; the document itself may have to be recreated," said Eric
Gaussier, a research scientist at the Xerox Research Centre Europe in Grenoble, France.
"Our new software can help save time and money and increase productivity. It will ensure
that documents are properly classified for future retrieval and that the right information
gets into the right hands as quickly as possible."
Categorizing tools currently available in the market treat each subject category
independently of the others and are considered "flat." For example, although it might seem
obvious to humans that biochemistry and biophysics are related categories of information,
a flat categorization system wouldn't make the connection. But the Xerox system, based on
patented technologies, uses a hierarchical model that is able to understand the dependency
between those two categories and therefore make a more informed decision when classifying
a document.
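The difference between a flat and a hierarchical categorizer can be illustrated with a minimal Python sketch. It is not the Xerox implementation: the taxonomy, the keyword sets, the sample text and the simple keyword-count scoring are all hypothetical, chosen only to show how evidence for related categories can be pooled at a shared parent before a specific category is picked.

```python
from collections import Counter

# Hypothetical two-level taxonomy: parent -> child -> keywords (illustrative only).
TAXONOMY = {
    "life sciences": {
        "biochemistry": {"enzyme", "protein", "metabolism"},
        "biophysics": {"membrane", "force", "diffusion"},
    },
    "finance": {
        "accounting": {"ledger", "invoice", "audit"},
        "markets": {"stock", "bond", "yield"},
    },
}

DOC = ("Enzyme yield per gram: the enzyme raised metabolism, while "
       "membrane force cut yield; stock solution yield was logged.")


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,;:!?").lower() for w in text.split()]


def score(tokens, keywords):
    """Count how many tokens match a category's keywords."""
    counts = Counter(tokens)
    return sum(counts[k] for k in keywords)


def classify_flat(text):
    """Score every leaf category independently and pick the best one."""
    tokens = tokenize(text)
    leaves = {leaf: kw
              for children in TAXONOMY.values()
              for leaf, kw in children.items()}
    return max(leaves, key=lambda leaf: score(tokens, leaves[leaf]))


def classify_hierarchical(text):
    """Pick the parent using pooled child evidence, then pick a child."""
    tokens = tokenize(text)

    def parent_score(parent):
        return sum(score(tokens, kw) for kw in TAXONOMY[parent].values())

    parent = max(TAXONOMY, key=parent_score)
    children = TAXONOMY[parent]
    child = max(children, key=lambda c: score(tokens, children[c]))
    return parent, child


print(classify_flat(DOC))          # markets
print(classify_hierarchical(DOC))  # ('life sciences', 'biochemistry')
```

In this toy case the ambiguous words "yield" and "stock" pull the flat classifier toward a finance category, while the hierarchical version first combines the biochemistry and biophysics evidence and so stays within the life sciences branch.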
According to data gathered from a pilot test of the software, people found the right
documents more often and faster because the software understood relationships between
documents and categories.
Anne-Lise Veuthey, a senior researcher at the Swiss Institute of Bioinformatics, an
academic nonprofit foundation that researches and develops technology used in biology,
participated in the pilot program. "We've found it to be extremely accurate in
identifying documents containing the very specific information we need to conduct our
research on human genes," Veuthey said.
Technology Highlights
Three integrated functions make the Xerox categorization technology unique:
- The system can start right away. Using advanced machine-learning techniques, it quickly
learns by itself, from only a few examples, how to hierarchically classify documents
into existing categories.
- The technology is easy to use and helps people create a comprehensive way to turn
unorganized e-files into cleanly labeled document collections.
- The system can learn entirely new categories on its own. The categorization technology
detects new or emerging topics and dynamically suggests new categories to the people who are
using the system.
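As a rough illustration of the first and third functions above, the following Python sketch learns categories from a handful of labeled examples and flags a document as a possible new topic when it resembles none of them. The technique used here (word-count centroids with cosine similarity and a fixed threshold) is an assumption made for illustration, not the method Xerox describes, and the example data and threshold value are hypothetical.

```python
import math
from collections import Counter

# Hypothetical labeled examples: (category, text).
EXAMPLES = [
    ("invoices", "invoice payment due amount"),
    ("invoices", "payment received for invoice"),
    ("support", "printer error paper jam"),
    ("support", "error when scanning paper"),
]


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(w.strip(".,;:!?").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(examples):
    """Sum the word counts of each category's examples into one centroid."""
    centroids = {}
    for label, text in examples:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids


def categorize(text, centroids, threshold=0.2):
    """Return the closest category, or None to flag a possible new topic."""
    vec = vectorize(text)
    label, similarity = max(
        ((name, cosine(vec, centroid)) for name, centroid in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if similarity >= threshold else None


CENTROIDS = train(EXAMPLES)
print(categorize("second invoice payment reminder", CENTROIDS))       # invoices
print(categorize("quarterly sustainability report draft", CENTROIDS)) # None
```

A `None` result is where a real system might collect similar unmatched documents and suggest a new category to its users.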
The Right Routing
The Xerox categorizer system can handle documents written in up to 20 languages and can be
easily adapted for specific customer requirements. The software intelligently routes
documents to the right person based on a pre-set user profile.
"This can be used, for example, to route incoming mail to the person responsible for a
given topic and eliminate mail in your inbox you aren't interested in," said Gaussier.
"Imagine clients' complaints going directly to the person responsible for handling them
and your e-mail inbox containing only what you are interested in."
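Profile-based routing of this kind can be sketched in a few lines of Python. The profile format, the addresses and the fallback mailbox below are hypothetical; the release does not describe how the Xerox software stores user profiles.

```python
# Hypothetical pre-set user profiles: address -> topics the person handles.
PROFILES = {
    "alice@example.com": {"complaints"},
    "bob@example.com": {"billing", "complaints"},
}


def route(category, profiles, default="unsorted@example.com"):
    """Return the recipients whose profile lists the document's category."""
    recipients = sorted(
        address for address, topics in profiles.items() if category in topics
    )
    # Fall back to a shared mailbox when nobody has claimed the topic.
    return recipients or [default]


print(route("complaints", PROFILES))  # ['alice@example.com', 'bob@example.com']
print(route("marketing", PROFILES))   # ['unsorted@example.com']
```

The category passed to `route` would come from the categorizer, so a complaint reaches only the people whose profiles list that topic.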
The categorization technology was developed by XRCE researchers based on their deep
expertise in linguistic analysis and machine-learning techniques. The software is written
in Java and can be deployed on multiple platforms including UNIX, Linux and Windows. The
company expects the technology to be licensed by software vendors or corporations that
wish to incorporate it into document systems focused on areas such as customer relationship
management, information retrieval and data management.
Contact Us: for questions about Xerox research and innovation, patents or technology
licensing, scientific work and related inquiries, please email:
xigwebmaster@xerox.com
Outside Submissions: Xerox encourages and welcomes unsolicited ideas and suggestions. More information on submitting your ideas to Xerox for review can be found here.
If you have any questions, please don't hesitate to contact us by email at Outsidesubmissions@xerox.com.
For all other inquiries, please use the appropriate contacts listed at Contact Xerox.