Access Innovations, Inc. is pleased to announce the completion of an automatic semantic tagging rulebase for the Centers for Disease Control and Prevention (CDC) Mining Safety and Health Thesaurus (MSHT). The new, state-of-the-art system increases indexing accuracy to 93 percent and provides greater access to the mining safety and health database, thereby aiding researchers and statisticians in improving safety for the mining industry.

“Access Innovations is glad to offer our services to such an important research outlet,” says Access Innovations president Marjorie M.K. Hlava. “The Data Harmony M.A.I. system will help immensely with term disambiguation and will make data tagged against the Mining Safety and Health Thesaurus findable and discoverable to the full depth and breadth of the thesaurus.”

The MSHT contains terms related to safety techniques and devices, workplace hazards, and miner ailments and conditions associated with different types and methods of mining. The interconnected terms and concepts required precise disambiguation by the Access Innovations team to further enable semantic search.

“The mining industry, like many other areas, is full of industry jargon, which made this project very interesting,” remarks project manager Bob Kasenchak. “Many natural language words have different context and meaning in mining terminology. For example, ‘damp’ refers to ‘gases’ or ‘fumes’ instead of ‘slightly wet.’”

Research into mining industry resources and CDC content guided the construction of the rulebase. After creating a foundation of just 300 rules, the team used the Test M.A.I. function of Data Harmony’s patented MAIstro platform to identify and refine the concepts to produce accurate indexing.

“It’s important for each thesaurus to reflect the content that will be applied to it,” Kasenchak says. “The rulebases, too, should be constructed with great care to reflect the linguistic usage patterns found in mining research and data so that content can be easily found accurately and consistently.”

The Mining Safety and Health Thesaurus is maintained by the Centers for Disease Control and Prevention via Data Harmony’s MAIstro software platform. Access Innovations will provide ongoing software support and looks forward to a continued partnership with the CDC.


About Access Innovations –
Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. The Access Innovations Data Harmony software includes automatic indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.


About the Office of Mine Safety and Health Research –
Founded in 1910 by Congressional act as the U.S. Bureau of Mines, the Office of Mine Safety and Health Research (OMSHR) has a mission to eliminate fatalities, injuries, and illnesses through research and prevention. OMSHR is a division of the National Institute for Occupational Safety and Health (NIOSH), which is part of the Centers for Disease Control and Prevention in the U.S. Department of Health and Human Services.