Researchers at the University of Wisconsin-Madison have developed a computer system that performs as well as or better than humans at extracting data from scientific journals and entering it into a database. Red Orbit reported the story in its article, “Man Vs Machine: Computerized Scientific Indexing System Outperforms Humans.” The new machine reading and indexing system is known as PaleoDeepDive.
The researchers built on the DeepDive machine reading system developed at Stanford and the HTCondor open-source distributed job management framework to create PaleoDeepDive. The system was pitted against human scientists who had manually entered data into the Paleobiology Database, a repository of research data from paleontology studies funded by the National Science Foundation (NSF) and international agencies.
Computers often have difficulty deciphering even the simplest-sounding statements. But the ability to calculate the odds that a given interpretation is correct gives the system a major advantage over people.
A controlled vocabulary is needed to ensure that the machine-assisted or fully automated indexing is comprehensive, regardless of what you are indexing. Access Innovations is one of a very small number of companies able to help its clients generate ANSI/ISO/W3C-compliant taxonomies to make their information findable.
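To make the idea of a controlled vocabulary concrete, here is a minimal sketch in Python of how an indexer might map the varied wording found in a text to a fixed set of preferred terms. The vocabulary entries and the `index_text` function are hypothetical and chosen only for illustration; they do not represent how PaleoDeepDive or any commercial indexing product works.

```python
import re

# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "Paleontology": ["paleontology", "palaeontology", "fossil record"],
    "Machine reading": ["machine reading", "text mining"],
}


def index_text(text, vocabulary=VOCABULARY):
    """Return the sorted preferred terms whose variants appear in the text."""
    lowered = text.lower()
    matched = set()
    for preferred, variants in vocabulary.items():
        for variant in variants:
            # Word boundaries prevent matches inside longer words.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                matched.add(preferred)
                break  # one matching variant is enough for this term
    return sorted(matched)


print(index_text("Text mining of the fossil record and palaeontology journals."))
```

Because every variant resolves to one preferred term, a search for "Paleontology" would find this text whether the author wrote "palaeontology" or "fossil record."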
Melody K. Smith
Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.