Access Innovations, Inc. has announced that the Data Harmony Metadata Extractor is available as an extension of MAIstro™, the flagship thesaurus and indexing application in the company’s Data Harmony software line. Metadata Extractor is a managed Web-based service that reveals the hidden structure in an organization’s content by mining publication elements, so that document metadata tagging can be normalized and automated.

Data Harmony Version 3.9 software integrates a taxonomy (or thesaurus) with an existing content platform or publishing pipeline in a user-friendly way. Patented indexing algorithms generate terms that describe what documents are really about, and precise keywords are attached so that those content objects can be retrieved later. Among other benefits, deploying Data Harmony for subject tagging throughout a document collection creates a better search experience for users, because the results they get are more relevant and contain less extraneous material.

Leveraging a patented approach to text analysis for better keyword tagging is only one of the advantages to be gained from implementing the new Metadata Extractor Web service.

Quality Metadata Is Essential for Effective Content Management

To enhance the quality of metadata, this Data Harmony extension generates a complete bibliographic citation, creates an auto-summarized abstract of an article’s content, handles author parsing, and assigns subject keywords automatically. Metadata Extractor takes an unstructured or semi-structured article as input and returns an XML document with richer, more descriptive information captured in the metadata elements.
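As an illustration only, the kind of XML record described above could be assembled along the following lines. This is a minimal sketch, not the Metadata Extractor's actual output format: the function name build_metadata_record, the element names (record, title, authors, abstract, keywords), and the naive author-splitting rule are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_metadata_record(title, authors, abstract, keywords):
    """Assemble a hypothetical metadata record and return it as an XML string.

    Element names are illustrative assumptions, not the real output schema.
    """
    record = ET.Element("record")
    ET.SubElement(record, "title").text = title

    authors_el = ET.SubElement(record, "authors")
    for name in authors:
        # Naive author parsing (an assumption): everything before the
        # last space is the given name, the rest is the family name.
        given, _, family = name.strip().rpartition(" ")
        author_el = ET.SubElement(authors_el, "author")
        ET.SubElement(author_el, "given").text = given
        ET.SubElement(author_el, "family").text = family

    ET.SubElement(record, "abstract").text = abstract

    keywords_el = ET.SubElement(record, "keywords")
    for keyword in keywords:
        ET.SubElement(keywords_el, "keyword").text = keyword

    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(build_metadata_record(
        title="A Sample Article",
        authors=["Ada Lovelace"],
        abstract="A short summary of the article.",
        keywords=["Indexing", "Taxonomies"],
    ))
```

Real author parsing has to handle initials, suffixes, and family-name-first conventions, which a split on the last space does not attempt.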

The Metadata Extractor extension identifies descriptive information in a document, distilling and normalizing it using a method far more sophisticated than merely matching keywords in text. The extension attaches this enhanced metadata to boost the long-term value of the content object. It’s been shown that high-quality metadata, consistently applied, reduces a common source of user frustration: not finding the appropriate document at the right time in an oversized, disorganized file system.

Publishers Stand to Gain From Implementation

“Metadata Extractor is an essential addition to the Data Harmony software lineup for scholarly publishers, especially,” said Marjorie M. K. Hlava, President of Access Innovations, when asked to comment on its release. “Since every publication style sheet requires a targeted approach to leverage the most appropriate fields, Access Innovations provides customization supporting each new implementation. The result is a highly specialized output of accurate, consistent metadata for client documents, with subject keywords applied from their own unique vocabulary.”

M.A.I.™ Sets This Metadata Tool Apart from the Rest

“The extraction process uses element-based semantic algorithms mediated by M.A.I., the Machine Aided Indexer,” said Bob Kasenchak, Access Innovations’ Production Manager. “It draws on a set of Data Harmony programs that harness natural language processing (NLP) for targeted text analysis. During configuration, elements in the document schema are specified for metadata extraction, to reflect the structure of input articles. Then, whenever someone processes an article with Metadata Extractor, M.A.I. algorithms go to work surfacing crucial pieces of information to identify that document, and that document only.”
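The configuration step described in the quote, where elements of the document schema are designated for extraction, can be pictured with a small sketch. It assumes XML input and a hypothetical mapping (FIELD_PATHS) from metadata field names to element paths; the paths, field names, and the extract_fields helper are invented for illustration and do not represent the M.A.I. algorithms or the product's configuration format.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: metadata field -> element path in the input schema.
FIELD_PATHS = {
    "title": "front/article-title",
    "journal": "front/journal-title",
    "body": "body",
}


def extract_fields(article_xml, field_paths=FIELD_PATHS):
    """Return the text of each configured element that exists in the article."""
    root = ET.fromstring(article_xml)
    extracted = {}
    for field, path in field_paths.items():
        element = root.find(path)
        if element is not None:
            # Gather all nested text and collapse runs of whitespace.
            extracted[field] = " ".join("".join(element.itertext()).split())
    return extracted


if __name__ == "__main__":
    sample = (
        "<article><front><article-title>On  Indexing</article-title></front>"
        "<body><p>Text one.</p></body></article>"
    )
    print(extract_fields(sample))
```

Keeping the schema mapping in data rather than in code is one way to reflect the point that each publication style sheet needs its own targeted setup.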

The graphical user interfaces (GUIs) and input elements for the Metadata Extractor Web service are adjustable based on the nature of incoming data and user needs.

Data Harmony Extension Modules

Access Innovations offers an expanding selection of Web-based service extension modules that are opening up new avenues between content management platforms and the innovative Data Harmony core applications: Thesaurus Master® and M.A.I.™ (Machine Aided Indexer).

To supplement an organization’s publishing pipeline or document collection with great tools for knowledge discovery, the Data Harmony Web service extensions operate on the basis of rigorous taxonomy structures, creative data extraction methods, patented text analytics, and flexible implementation options. All Data Harmony software is designed for excellent cross-platform interoperability, offering convenient opportunities for integration in all kinds of computing environments and content management systems (CMSs).

Visit the Data Harmony Products page to explore the range of focused solutions that are presented by Data Harmony Version 3.9 extension modules.

About Access Innovations, Inc.

Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation, all developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.