Changes and trends in information technology are perhaps best discovered by examining what users of IT say about how they’re using it. One such opportunity is coming up soon. Every year, Access Innovations hosts a meeting for users of the company’s Data Harmony software products. The theme of this year’s Data Harmony Users Group (DHUG) meeting is “Then and Now: Addressing the Changing Needs of Semantic Enrichment.” This raises some questions: Just what are the new challenges that semantic enrichment needs to meet? And why are the needs changing? Some preliminary answers may be found in the DHUG meeting agenda.
On Monday, February 10, Jay Ven Eman (Access Innovations’ CEO) will give a presentation on “Leveraging Your Semantic Enrichment Investment.” He asks (and presumably will answer, at least in part), “How far can you push semantic enrichment? … What processes and tools are needed?” It is easy to imagine numerous organizations asking the same questions in one form or another. Those who have some familiarity with semantic enrichment can sense the scope of possibilities. They are asking semantic enrichment to accomplish more for their information assets, and they want to push the technologies further. These expectations put demands on the related processes and tools, which in turn are given expanded capabilities. And with new possibilities, the cycle continues.
On Tuesday, February 11, Marcie Zaharee of The MITRE Corporation will present a case study, “Taxonomies as a Tool to Increase Discovery of Intelligence Community Data Assets.” Access Innovations’ taxonomy/thesaurus software, Thesaurus Master, was used as a taxonomy management tool. The taxonomy that MITRE built was used as an aid in populating metadata fields, with the immediate goal of increasing discovery of data assets. The bigger goal was to promote data visibility, accessibility, and understandability for Intelligence, Surveillance, and Reconnaissance (ISR) data. Accomplishing this, in a very complex modern world, calls for taxonomy development capabilities that provide detailed hierarchical relationships, software recognition of the many ways in which each concept and each piece of equipment can be described, and the ability to specify associations among related concepts.
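To make those three capabilities concrete, here is a minimal sketch of how a thesaurus might store hierarchy, variant names, and associations. It is an illustration only, not the way Thesaurus Master or the MITRE taxonomy is actually implemented; the `Term` and `Thesaurus` classes and all of the sample terms are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Term:
    """One preferred term in a hypothetical thesaurus."""
    label: str
    broader: Optional[str] = None                    # hierarchical relationship
    synonyms: Set[str] = field(default_factory=set)  # other ways to name the concept
    related: Set[str] = field(default_factory=set)   # associative relationships


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}
        self._lookup: Dict[str, str] = {}  # lowercased variant -> preferred label

    def add(self, label, broader=None, synonyms=(), related=()) -> Term:
        term = Term(label, broader, set(synonyms), set(related))
        self.terms[label] = term
        # Index the preferred label and every synonym for lookup.
        for variant in (label, *synonyms):
            self._lookup[variant.lower()] = label
        return term

    def resolve(self, text: str) -> Optional[str]:
        """Map any known variant to its preferred label, or None if unknown."""
        return self._lookup.get(text.strip().lower())

    def ancestors(self, label: str) -> List[str]:
        """Walk the broader-term chain from the nearest parent to the top."""
        path = []
        current = self.terms[label].broader
        while current is not None:
            path.append(current)
            current = self.terms[current].broader
        return path


# Made-up sample data, chosen only to show the three relationship types.
thesaurus = Thesaurus()
thesaurus.add("Platforms")
thesaurus.add("Aircraft", broader="Platforms")
thesaurus.add(
    "Unmanned aerial vehicles",
    broader="Aircraft",
    synonyms={"UAV", "drone"},
    related={"Ground control stations"},
)

print(thesaurus.resolve("drone"))
print(thesaurus.ancestors("Unmanned aerial vehicles"))
```

With a structure like this, a metadata field can be populated with the preferred term no matter which variant appears in the source text, and a search on a broader term can be expanded to include everything beneath it.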
That presentation will be followed by another case study, “Public Library of Science Thesaurus: Year One.” Rachel Drysdale of the Public Library of Science (PLOS) will discuss the process of building the thesaurus, and of integrating it into the PLOS journals workflow and publication platform. These activities are representative of those of the various online publication and research platforms that have been springing up since the establishment of the Internet. The growing volume of material in digital journals and databases requires excellent search capabilities, which in turn require the integration of semantic enrichment tools into publication platforms.
Large volumes of publishing data call for analysis and interpretation, in order to manage that data and the underlying platform. Kevin Boyack of SciTech Strategies will show how a comprehensive map of the scientific literature was used to visualize the PLOS thesaurus. The visualization, based on semantic enrichment in the form of the metadata associated with each article, can be used to show coverage and trends for various entities such as journals, institutions, and even individual authors. This is a new way of using the data made available by semantic enrichment.
On Wednesday, February 12, John Kuranz and Kathryn Brown of Access Integrity will discuss the automated use of taxonomy-based metadata for medical coding and verification purposes. This new technology combines a rule-based semantic enrichment taxonomy tool with comprehensive medical coding datasets. The IntegraCoder concept extractor software analyzes text in electronic medical records and recommends relevant codes, significantly increasing coder productivity and efficiency. (For further information, see http://integracoder.com/, or view a YouTube video.) The development of this technology answers needs caused by the increasing complexity of the coding systems required for billing and statistical purposes. In turn, this increasing complexity has been a result of the rapid advances in medical diagnosis and treatment in recent decades.
Next, Bob Kasenchak, Access Innovations’ production coordinator, will give a presentation on “Leveraging Semantic Fingerprinting for Building Author Networks.” As online publishing platforms and their content assets have grown, and as researchers have proliferated, so have the problems resulting from multiple authors with similar or identical names, and from the many individual authors who have gone by various names. Similarly, specifying author affiliations has become difficult, because any given institution may have gone by a variety of names over its history. Semantic enrichment techniques that identify authors and institutions now need to rise to the challenge. Bob points out that “With the rise of ORCID and other universal databases of researchers and institutions, it is increasingly crucial for publishers to sort out their own data containing named entities.” His talk will explain Access Innovations’ approach to clarifying author identities so that a solid foundation exists for author networks.
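The agenda does not spell out how Access Innovations builds its semantic fingerprints, so the following is only a generic sketch of the underlying idea, not the company's method. It assumes that each paper has already been indexed with subject terms from a thesaurus, treats an author's fingerprint as the count of those terms across the author's papers, and compares two fingerprints with cosine similarity. The function names, the sample terms, and the 0.5 threshold are all hypothetical.

```python
from collections import Counter
from math import sqrt


def fingerprint(papers):
    """Count how often each subject term appears across an author's papers.

    `papers` is a list of term lists, one list per paper.
    """
    counts = Counter()
    for terms in papers:
        counts.update(terms)
    return counts


def cosine(a, b):
    """Cosine similarity between two term-count fingerprints (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def likely_same_author(a, b, threshold=0.5):
    """Hypothetical decision rule: similar enough subject profiles."""
    return cosine(a, b) >= threshold


# Three records that all carry the same author name (made-up data).
record_a = fingerprint([["Genomics", "Gene expression"], ["Genomics", "Bioinformatics"]])
record_b = fingerprint([["Genomics", "Bioinformatics"]])
record_c = fingerprint([["Seismology", "Plate tectonics"]])

print(likely_same_author(record_a, record_b))  # overlapping subject areas
print(likely_same_author(record_a, record_c))  # no subject terms in common
```

In practice a subject profile would be only one signal among several, alongside co-authors, affiliations, and identifiers such as ORCID iDs, but it shows why consistent subject indexing helps separate authors who share a name.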
Kirk Sanders, an editorial services project manager at Access Innovations, will present a case study on “Data Harmony Custom Features as Implemented for Triumph Learning.” The Triumph Learning project, which required representation of the new Common Core educational standards and associated concepts in a thesaurus, called for complex mapping techniques to establish connections among the various metadata values. The nature of the metadata required expanded display options, which required a custom export format. The new semantic enrichment techniques represented in this project can be seen as meeting the challenges represented by advances in educational standards and philosophies.
A common thread in these presentations is that our world is changing. It is becoming more complex and, in many ways, more advanced. New techniques for semantic enrichment are a response to the needs created by these changes.
Barbara Gilles, Taxonomist