When you use a thesaurus for indexing content that covers multiple disciplines, the need to disambiguate terms increases. This fact of thesaurus life was well illustrated in a presentation at this year’s DHUG (Data Harmony Users Group) meeting. The presentation, by Rachel Drysdale, Taxonomy Manager of the Public Library of Science (PLOS), was titled “The PLOS Thesaurus: the first year.”
While Rachel discussed a variety of aspects of thesaurus implementation and maintenance, what caught my interest and sympathy as a fellow taxonomist was her description of what she called “taxonomy funnies.” Anyone who has been a taxonomist for any length of time has run into such funnies: problems that are chuckle-worthy but still need to be dealt with.
In the talk, Rachel discussed the refinement of indexing rules. PLOS maintains its thesaurus in a Data Harmony software application, MAIstro, which integrates a taxonomy management tool, Thesaurus Master, with M.A.I., an indexing application in which a “rule base” of indexing rules is maintained. In MAIstro, when a term is added to a thesaurus, a simple identity rule is automatically created in the associated rule base. So when the Animals branch was being developed, the addition of “Pumas” caused the creation of a rule that looked like this:
Text to Match [in the text being read and parsed by M.A.I.]: pumas
USE [Indexing term] Pumas
M.A.I. also recognizes singular and plural variants. In the absence of any rule or condition to the contrary, the rule above would cause the automatic assignment, or suggestion to a human editor, of the indexing term “Pumas” when coming across the text string “puma”.
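As a rough illustration of what such an identity rule does, here is a minimal Python sketch. It is not M.A.I. syntax or its actual matching logic; the function names and the naive plural handling (stripping a trailing "s") are assumptions made for the example.

```python
import re

# The identity rule created when "Pumas" was added to the thesaurus.
RULE = {"text_to_match": "pumas", "use": "Pumas"}

def identity_rule_matches(text_to_match: str, text: str) -> bool:
    """Return True if the rule text, or its naive singular form, occurs in text.

    Assumption: plural handling is just "strip a trailing s"; a real indexer
    would use proper singular and plural variants.
    """
    target = text_to_match.lower()
    variants = {target}
    if target.endswith("s"):
        variants.add(target[:-1])
    # Case-insensitive, whole-word comparison.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return any(word in variants for word in words)

def suggest_terms(text: str) -> list[str]:
    """Suggest the indexing term whenever the identity rule fires."""
    if identity_rule_matches(RULE["text_to_match"], text):
        return [RULE["use"]]
    return []
```

With no conditions attached, any whole-word occurrence of "puma" or "pumas" in the text, in any letter case, is enough to suggest the term.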
PLOS content has good coverage of zoological topics, but is also especially heavy on molecular biology, particularly genetics. The PLOS wordsmiths were mystified when they found that multitudes of genetics articles were being indexed with the term “Pumas”. True, there might have been a sprinkling of articles about wild feline genetics, but this would not account for the number of articles that boasted the “Pumas” descriptor.
The taxonomists at PLOS looked at the articles in question and found the culprit. “PUMA” was appearing in those articles, as an acronym for a gene whose full name is “p53 upregulated modulator of apoptosis.” (I can’t blame the geneticists for using an acronym for that one. The full name isn’t very conversation friendly.) And it’s not specific to pumas; humans have it, and so do such diverse creatures as fish and frogs. So the PLOS taxonomists modified the indexing rule, adding conditions that required at least one other word or phrase having to do with the world of wild feline creatures to be present before “Pumas” could be assigned or suggested. The addition of a few synonyms and quasi-synonyms for pumas made the rule richer and better able to disambiguate pumas from PUMAs. The rule ended up looking like this:
Text to Match: pumas
IF (MENTIONS “feline*” OR MENTIONS “jaguar*” … OR MENTIONS “panther*” OR MENTIONS “cougar*” OR MENTIONS “catamount*” …)
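The effect of the added conditions can be sketched in Python. This is an illustration only, not M.A.I. rule syntax: the `mentions` helper, the reading of a trailing `*` as a prefix wildcard, and the shortened context list (the entries elided in the rule above are left out, not guessed) are all assumptions.

```python
import re

# Context words taken from the rule above; "*" is assumed to be a prefix wildcard.
# Entries elided in the original rule are not reproduced here.
CONTEXT_PATTERNS = ["feline*", "jaguar*", "panther*", "cougar*", "catamount*"]

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def mentions(pattern: str, tokens: list[str]) -> bool:
    """Hypothetical stand-in for a MENTIONS condition."""
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return any(tok.startswith(prefix) for tok in tokens)
    return pattern in tokens

def suggest_pumas(text: str) -> list[str]:
    """Suggest "Pumas" only when the match is backed by a wild-feline context word."""
    tokens = tokenize(text)
    matched = any(tok in ("puma", "pumas") for tok in tokens)
    has_context = any(mentions(p, tokens) for p in CONTEXT_PATTERNS)
    return ["Pumas"] if matched and has_context else []
```

A genetics abstract that mentions only the PUMA gene no longer triggers the term, while a text that also mentions felines still does.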
The next indexing run was much better. Alas, there were still some articles inappropriately indexed with “Pumas”. What was wrong? The PLOS editors did some more detective work.
It turned out that some of the problem articles were about the toxoplasma parasite, which has many variant strains and is found in a wide variety of organisms, including people, frogs, and cats. One of those variant strains is known as COUGAR. A conceptual relationship with actual cougar critters does exist; the variant was first discovered in a group of Canadian cougars. That’s rather tangential, though. The toxoplasma articles in question aren’t really about cougars. The problem was that as far as animals (and the PLOS rule base) were concerned, “Cougars” is a synonym of “Pumas”. So when the indexing system read “COUGAR” in the text, “Pumas” got popped onto the list of subject terms for each of those toxoplasma articles.
The next critter slithering amok through the PLOS records was the snail. What would make snails unruly? The real culprit is once again a gene in disguise, in this case SNAI1, naturally referred to frequently as SNAIL. Once such a culprit is properly identified, it’s a straightforward matter to modify the rule so that it prevents the wrong term from being suggested or assigned, by considering likely contexts and reflecting those in the rule conditions. One bonus of the situation is that the same rule can be further modified to enable indexing of the formerly problematic document with a more appropriate term.
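A rule of that kind might look something like the following Python sketch, which is again not actual M.A.I. syntax. The gene-context word list and the replacement term "Transcription factors" are hypothetical, chosen only to show the shape of the logic; the source does not say which contexts or terms PLOS used.

```python
import re

# Hypothetical context words suggesting the text is about the gene, not the mollusc.
GENE_CONTEXT = {"gene", "genes", "expression", "transcription", "snai1", "protein"}

ANIMAL_TERM = "Snails"
ALTERNATIVE_TERM = "Transcription factors"  # hypothetical replacement term

def suggest_for_snail(text: str) -> list[str]:
    """Block the animal term in a gene context and suggest an alternative instead."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not any(tok in ("snail", "snails") for tok in tokens):
        return []                      # the rule does not fire at all
    if GENE_CONTEXT.intersection(tokens):
        return [ALTERNATIVE_TERM]      # redirect to a more appropriate term
    return [ANIMAL_TERM]               # plain zoological usage
```

The same match text drives both outcomes; only the surrounding context decides which term is suggested.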
There’s no reason to be afraid of the wild animals in your thesaurus, as long as you stay alert for them. You can tame the mighty mountain lion and the slithery snail.
Barbara Gilles, Taxonomist
“Criteria for inclusion vary, but all companies have things in common. Access Innovations, Inc. has proven to define the spirit of practical innovation by blending sparkling technology with a deep, fundamental commitment to customer success,” says Hugh McKellar, KMWorld editor-in-Chief.
Marjorie M. K. Hlava, president of Access Innovations, says she is honored by her company’s accolade. “Access Innovations prides itself on pushing the edges of technology to meet the needs of the next generation of knowledge management,” she says. “It’s challenging and rewarding to be at the cutting edge of knowledge management, and it’s delightful to be recognized as a leader in the field, making content findable for our customers and their users.”
The Top 100 Companies That Matter list is compiled annually by editorial colleagues, analysts, theorists and practitioners. Unlike many other trade lists, inclusion is not purchased and is at the sole discretion of KMWorld’s editors.
For a full list of the Top 100 Companies That Matter in Knowledge Management, pick up the March issue of KMWorld, which is available on newsstands now, or view the online article.
About Access Innovations, Inc.
Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes automatic indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.
The leading information provider serving the knowledge, document, and content management systems market, KMWorld informs more than 45,000 subscribers about the components and processes—and subsequent success stories—that together offer solutions for improving business performance. KMWorld is a publishing unit of Information Today, Inc.
This past week, Access Innovations hosted the annual Data Harmony Users Group (DHUG) meeting, which has included a variety of presentations. We presented a case study titled “Proven Technology in a New Market: the Data Harmony Suite of Products — A Game Changer in the Medical Claims Compliance Space.” Here’s a quick summary:
In 2014, the following challenges (which healthcare information professionals in the United States will recognize as major new regulatory requirements) will face all medical providers and payers:
• Meaningful Use Stage 2 requirements
• Revised form 1500 from the Centers for Medicare & Medicaid Services (CMS)
• 2014 changes to Current Procedural Terminology (CPT)
• Billing changes for Non-Physician Practitioner (NPP)
• New edits for Correct Coding Initiatives (CCI)
The approaching tsunami in the medical compliance industry has resulted in a perfect fit for Data Harmony’s semantically enriched, rule-based concept extractor software solution. Our first order of business was to investigate at what point in the medical transaction process Data Harmony’s suite of time-tested software products would fit best and deliver the most cost-effective results. Research indicated that our technology would be a game changer in the following areas:
- Reduce the risk of compliance audit
- Expedite claim submission
- Accelerate cash flow… fewer denials and rejections
- Improve coder productivity
- Allow real-time analysis of electronic medical records (EMRs)
The technology transferred from electronic publishing to medical claims compliance has instituted a disruptive innovation within a new industry. The rule-based approach provides very high accuracy and consistent, productive, efficient, and cost-effective EMR analysis throughout the medical billing and transaction sphere.
About the authors/presenters:
John Kuranz brings more than 30 years of experience in electronic publishing and internet applications in research and content creation to the Access Integrity team. Most recently, Kuranz spent three years as managing director of Dallas, Texas-based Chemical Information Services, a B2B subscription-based chemical database firm. Prior to joining Access Integrity, Kuranz was executive vice president of global business strategy at Herndon, Va.-based Apex CoVantage, an outsourcing company with more than 3,500 employees worldwide. In this role, Kuranz was instrumental in positioning the company as a leader in providing knowledge-based solutions to the information and publishing industry. Kuranz has a bachelor of science degree in biochemistry from Texas A&M University and did his MBA studies at Northern Illinois University. Kuranz and his wife, Karen, reside in Santa Fe, N.M.
Kathryn (Kathy) Brown is an editor and taxonomist with Access Innovations. In addition to taxonomy and thesaurus creation and maintenance, her job responsibilities include indexing, rule building, and creating and maintaining databases. She has diplomas in nursing and in paralegal studies. A former nurse and paralegal, Kathy brings her medical and legal knowledge to healthcare-related and legal projects. Her outside interests include poker and mandolin playing.
After critical business information has been identified at a high level and a focal point has been assigned, best practices from complementary disciplines can be incorporated.
Identify the main subjects for a business-specific controlled vocabulary
Each company or organization develops its own language for talking about what it does. Like all languages, organizational languages are based on a common way of seeing and thinking. A technology or farm machinery company may use alphanumeric designations to identify thousands of products. An entertainment firm may use cryptic acronyms in discussing thousands of events or programs. An agricultural organization may talk about plans and events as they relate to “the harvest.” Even within one company, the language varies between departments. A finance department is likely to have a language that is different from the language used in research or operations departments.
Controlled vocabularies provide the key for translating organizational language between departments, between new and experienced employees, and between internal and external stakeholders. Controlled vocabularies also provide the basis for consistent analysis, visualization, and reporting, as well as effective search, retrieval, and distribution. Maintaining effective, business-specific controlled vocabularies provides a competitive advantage. They can also provide operational advantages by supporting the translation of business concepts into rapidly evolving IT technical concepts.
Creating and maintaining controlled vocabularies, including relationships and cross-references, has been a best practice in library science, information science, and records management for a very long time. Over time, effective principles, practices, and standards have been developed for them, but currently marketed tools do not always use them.
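To make the idea of relationships and cross-references concrete, here is a minimal Python sketch of a controlled vocabulary that maps departmental variants to a preferred term. The class names and the sample terms are invented for illustration and do not come from any particular standard or tool.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Term:
    preferred: str
    used_for: set[str] = field(default_factory=set)  # non-preferred variants
    related: set[str] = field(default_factory=set)   # cross-references

class ControlledVocabulary:
    def __init__(self) -> None:
        self._terms: dict[str, Term] = {}
        self._lookup: dict[str, str] = {}  # lowercased label -> preferred term

    def add(self, term: Term) -> None:
        """Register a term and index every label that should resolve to it."""
        self._terms[term.preferred] = term
        for label in {term.preferred, *term.used_for}:
            self._lookup[label.lower()] = term.preferred

    def preferred_for(self, label: str) -> Optional[str]:
        """Translate any known variant into the preferred term."""
        return self._lookup.get(label.lower())

    def related_to(self, label: str) -> set[str]:
        """Return the cross-references of the term a label resolves to."""
        preferred = self.preferred_for(label)
        return set(self._terms[preferred].related) if preferred else set()

# Invented sample entry: finance says "PO", operations says "buy orders".
vocab = ControlledVocabulary()
vocab.add(Term("Purchase orders", used_for={"PO", "buy orders"}, related={"Invoices"}))
```

Whichever department's wording a user starts from, the lookup lands on the same preferred term, which is what makes consistent search and reporting possible.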
At the beginning, it is important to identify the main business subject areas that might benefit from a controlled vocabulary. They may be specific to one or more industries, to a discipline, or to a technique. Existing vocabularies and standards can then be identified as building blocks or goals for future cooperative efforts. Industry and subject vocabularies can usually be found through associations, through research, or through vocabulary lists such as Access Innovations’ TaxoBank. General standards for creating and maintaining vocabularies can be found through ANSI and ISO and apply more generally than technology-specific standards.
Once pertinent subjects, vocabularies and standards are identified, basic policies should be established regarding their use and upkeep. Like all languages, organizational languages change and evolve with use. Because cultural and business environments are rapidly evolving, vocabulary policies need to support rapid innovation and creativity.
Define the general types of needed metadata
The metadata needed to identify and track business-critical information and data is specific to an organization and is a corporate asset. Defining and maintaining it in a consistent, reliable, useful form is essential, even when a tool provides “OOB (Out Of the Box)” implementations or automated discovery. At a minimum, tools must be configured and business vocabularies, codes, users, and processes retrofitted to the tool and then maintained. Ongoing tool success and return on investment require significant effort and investment in the definition of policies, standards, vocabularies, processes and procedures. Usually this involves changes in work, roles, and responsibilities for which planning and ongoing management are essential and need to be added to tool costs.
As with vocabulary creation and maintenance, many effective principles, practices, and standards have been developed over time for metadata definition and maintenance, but tools do not always use them.
At the beginning, it is important to identify the main areas of business concern and vulnerability, such as regulatory compliance, product liability, cross-departmental standardization and communication, fulfillment of marketing strategies, or a variety of business specific customer and product-related issues. Each of these areas requires specific tracking techniques and processes that dictate specific metadata. ANSI, ISO, and technology specific standards, such as those designed for the Internet, may be applicable to a business. Determining which standards are applicable will require research.
Governance Level Understanding of Information and Data Needs
Developing a governance-level understanding of information and data needs, which consists of the four steps outlined in this and a previous blog posting in this series, can be handled as a set of time-bounded projects. This high-level understanding will be invaluable in providing a business-oriented basis for prioritizing and managing additional work, scoping and justifying the creation of an information and data governance program, and evaluating and efficiently implementing cost-effective new technologies.
Watch future blog postings for more on this subject.
Judith Gerber (guest blogger), JGG Enterprises
Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.
Today is the first day of a week full of training, information sharing, and networking at Access Innovations’ 10th Annual Data Harmony Users Group (DHUG) meeting. A full-day overview of the Data Harmony software suite takes place today to provide a solid foundation for deeper understanding of the core meeting sessions on Tuesday and Wednesday.
Thursday, February 13 features hands-on training in the Data Harmony software. The morning session will focus on Thesaurus Master, and the afternoon session will focus on the M.A.I. rule-building module.
On Friday, February 14, attendees will take part in a “sandbox” session for more hands-on training in Thesaurus Master and M.A.I. This session will take an individual, hands-on approach. Activities include manipulation of a dummy taxonomy and rule-building practice. Our taxonomists will assist attendees with specific concerns, including taxonomy review and troubleshooting of indexing rule bases.
This week promises to be the best DHUG meeting yet, with client success stories from the Public Library of Science (PLOS), the American Society of Civil Engineers, and SciTech Strategies.
Melody K. Smith
Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.
Marjorie M.K. Hlava Selected to Receive Prestigious Miles Conrad Award from the National Federation of Advanced Information Services (NFAIS™)
Marjorie (Margie) M.K. Hlava, president of Albuquerque-based Access Innovations, Inc., has been selected to receive the prestigious Miles Conrad Award from the National Federation of Advanced Information Services (NFAIS). The award will be presented at the upcoming 2014 NFAIS Annual Conference in Philadelphia, PA, February 23-25, 2014. In keeping with longstanding tradition, Marjorie will present a 45-minute lecture with her perspective on the information industry during the NFAIS Annual Conference.
“The objective of the Miles Conrad Memorial Lecture, established in 1965 in commemoration of NFAIS founder G. Miles Conrad, is to recognize and honor those members of the information community who have made significant contributions to the field of information science and to NFAIS itself,” said Marcie Granahan, Executive Director of NFAIS. “The lecture is presented every year at the organization’s annual conference by an outstanding person on a suitable topic in the field of abstracting and indexing, but above the level of any individual service.”
“Margie Hlava is a well-known and well-respected information industry pioneer,” said NFAIS President Suzanne BeDell. “She has worked behind the scenes for most of the major information organizations, including many NFAIS member organizations. Margie believes that you learn as much as you receive by being active in professional organizations, and she has been intimately involved in the standards process for much of her career. She served for seven years on the NISO board and was personally involved in the development of many NISO standards. She chaired the Special Libraries Association’s (SLA) Standards Committee for nine years, has chaired the NFAIS Standards Committee since 2001, and is currently a member of the NISO Content and Collection Management Topic Committee. Margie is one of my predecessors, having been NFAIS President from 2003 to 2004, as well as President of other organizations such as the American Society for Information Science and Technology and Documentation Abstracts. She has served on the Board of SLA twice and currently serves on several boards, including those for the ASIS&T Bulletin of which she is Chair, Information Systems and Use, Places and Spaces, University of North Carolina SILS, and the SLA Taxonomy Division, of which she is the founding chair. Margie also is a volunteer outside the information industry, serving on the boards of the New Mexico Information Commons, the Hubbell House Alliance, New Mexico Data Stream, and the Hubbell Family Historical Society. The NFAIS Board is delighted to confer our organization’s highest honor upon her.”
“Previous recipients are people I have long admired and looked up to as luminaries in our field,” said Marjorie Hlava when she was notified of the award. “I am truly honored to be among them.”
Margie was educated as a botanist and trained by NASA as an information engineer, a position she worked in for five years. She was a beta tester on the NASA Recon, Dialog, and other early online host systems such as BRS and SDC. She was also the Information Director for the Department of Energy National Energy Information Center and its affiliate NEICA. She rose to the position of Information Director before taking her team private as Access Innovations, Inc. in 1978.
Margie’s abiding research interests center on speeding the human processes in knowledge management through productivity enhancements. She has developed the Data Harmony software suite specifically to increase accuracy and consistency while streamlining the clerical aspects in editorial and indexing tasks. The most recent innovation is applying those systems to medical records for medical claims compliance in a new division, Access Integrity.
Margie’s work has been acknowledged through numerous awards throughout her career, including ASIS&T’s Watson Davis award, and recognition both as an SLA Fellow and as a Woman of Influence for Technology. She is the author of two books and over 200 articles. She holds two U.S. patents encompassing 21 patent claims. She has no intention of resting on her laurels, but plans to continue her adventures in information science and explore the boundaries of new technology and methodologies. A complete list of prior Miles Conrad Award winners can be found on the NFAIS website.
About Access Innovations, Inc. – www.accessinn.com, www.dataharmony.com, www.taxodiary.com Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes automatic indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.
About NFAIS – www.nfais.org Founded in 1958, NFAIS is a membership organization of more than 55 of the world’s leading producers of databases and related information services, information technology, and library services in the sciences, engineering, social sciences, business, and the arts and humanities. For more information on NFAIS and its member organizations, on NFAIS Annual Conferences and meetings or the Miles Conrad Memorial Lecture series, contact Jill O’Neill, Director of Communication and Planning (firstname.lastname@example.org or 215-893-1561), or visit the NFAIS website.
Changes and trends in information technology are perhaps best discovered by examining what users of IT say about how they’re using it. One of those opportunities is happening soon. Every year, Access Innovations hosts a meeting for users of the company’s Data Harmony software products. The theme of this year’s Data Harmony Users Group meeting is “Then and Now: Addressing the Changing Needs of Semantic Enrichment.” This raises some questions: Just what are the new challenges that semantic enrichment needs to meet? And why are the needs changing? Some preliminary answers may be found in the DHUG meeting agenda.
On Monday, February 10, Jay Ven Eman (Access Innovations’ CEO) will give a presentation on “Leveraging Your Semantic Enrichment Investment.” He asks (and presumably will answer, at least in part), “How far can you push semantic enrichment? … What processes and tools are needed?” It is easy to imagine numerous organizations asking the same questions in one form or another. Those who have some familiarity with semantic enrichment can sense the scope of possibilities. They are asking semantic enrichment to accomplish more for their information assets. They want to push semantic enrichment technologies. These expectations put demands on the related processes and tools, which in turn are given expanded capabilities. And with new possibilities, the cycle continues.
On Tuesday, February 11, Marcie Zaharee of The MITRE Corporation will present a case study, “Taxonomies as a Tool to Increase Discovery of Intelligence Community Data Assets.” Access Innovations’ taxonomy/thesaurus software, Thesaurus Master, was used as a taxonomy management tool. The taxonomy that MITRE built was used as an aid in populating metadata fields, with the immediate goal of increasing discovery of data assets. The bigger goal was to promote data visibility, accessibility, and understandability for Intelligence, Surveillance, and Reconnaissance (ISR) data. Accomplishing this, in a very complex modern world, calls for taxonomy development capabilities that provide detailed hierarchical relationships, software recognition of the many ways in which each concept and each piece of equipment can be described, and the ability to specify associations among related concepts.
That presentation will be followed by another case study, “Public Library of Science Thesaurus: Year One.” Rachel Drysdale of the Public Library of Science (PLOS) will discuss the process of building the thesaurus, and of integrating it into the PLOS journals workflow and publication platform. These activities are representative of those of the various online publication and research platforms that have been springing up since the establishment of the Internet. The growing volume of digital materials in digital journals and databases requires excellent search capabilities, which require the integration of semantic enrichment tools into publication platforms.
Large volumes of publishing data beg for analysis and interpretation, in order to manage that data and the underlying platform. Kevin Boyack of SciTech Strategies will show how a comprehensive map of the scientific literature was used to visualize the PLOS thesaurus. The visualization, based on semantic enrichment in the form of the metadata associated with each article, can be used to show coverage and trends for various entities such as journals, institutions, and even individual authors. This is a new way of utilizing the data made available by semantic enrichment.
On Wednesday, February 12, John Kuranz and Kathryn Brown of Access Integrity will discuss the automated use of taxonomy-based metadata for medical coding and verification purposes. This new technology combines a semantic enrichment, rule-based taxonomy tool with comprehensive medical coding datasets. The IntegraCoder concept extractor software analyzes text in electronic medical records and recommends relevant codes, significantly increasing coder productivity and efficiency. (For further information, see http://integracoder.com/, or view a Youtube video.) The development of this technology answers needs caused by the increasing complexity of the coding systems required for billing and statistical purposes. In turn, this increasing complexity has been a result of the rapid advances in medical diagnosis and treatment in recent decades.
Next, Bob Kasenchak, Access Innovations’ production coordinator, will give a presentation on “Leveraging Semantic Fingerprinting for Building Author Networks.” As online publishing platforms and their content assets have grown, and as researchers have proliferated, so have the problems resulting from multiple authors with similar or identical names, and from the many individual authors who have gone by various names. Similarly, specification of author affiliations has become difficult, because of the variety of names that any given institution may have gone through by this point in history. Semantic enrichment techniques that identify authors and institutions now need to rise to the challenge. Bob points out that “With the rise of ORCID and other universal databases of researchers and institutions, it is increasingly crucial for publishers to sort out their own data containing named entities.” His talk will explain Access Innovations’ approach to clarifying author identities so that a solid foundation exists for author networks.
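The source does not describe how Access Innovations computes a semantic fingerprint. As one generic way to picture the idea, the Python sketch below treats a fingerprint as the set of subject terms assigned to the papers published under one author name, and compares two fingerprints with Jaccard similarity. The function names, the sample terms, and the 0.5 threshold are all assumptions for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of subject terms two fingerprints have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def likely_same_author(fp1: set[str], fp2: set[str], threshold: float = 0.5) -> bool:
    """Treat two name variants as one author when their subject terms overlap enough.

    The threshold is an arbitrary illustrative value, not a recommended setting.
    """
    return jaccard(fp1, fp2) >= threshold

# Invented fingerprints for three author-name variants.
j_smith = {"Apoptosis", "Gene expression", "Tumor suppressors"}
john_smith = {"Apoptosis", "Gene expression", "Cell death"}
j_smith_other = {"Soil erosion", "Hydrology"}
```

In this toy data, "J. Smith" and "John Smith" share two of four distinct terms and would be grouped together, while the soil scientist with the same initials would not.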
Kirk Sanders, an editorial services project manager at Access Innovations, will present a case study on “Data Harmony Custom Features as Implemented for Triumph Learning.” The Triumph Learning project, which required representation of the new Common Core educational standards and associated concepts in a thesaurus, called for complex mapping techniques to establish connections among the various metadata values. The nature of the metadata required expanded display options, which required a custom export format. The new semantic enrichment techniques represented in this project can be seen as meeting the challenges represented by advances in educational standards and philosophies.
A common thread in these presentations is that our world is changing. It is becoming more complex and, in many ways, more advanced. New techniques for semantic enrichment are a response to the needs created by these changes.
Barbara Gilles, Taxonomist
The link between business and information technology is the data, information, and process assets that are stored and automated through technical tools. This blog suggests the first steps toward governing and managing these important assets before tool implementation, helping to avoid the too common “graveyards” of expensive, underused tools.
Identify business critical information and data
In order to get past the confusion of rapidly evolving types, formats, risks, and tools, first identify the most important information and data assets for your organization and start treating them like assets. These assets may already be known but not documented, or identifying them may require chartering and funding a project. Critical information and data assets vary widely across organizations and departments. They need to be based on the core products, expertise, and risks of an organization, which also may need to be identified. For example, data from production machinery and its interpretation could be an unrecognized competitive asset. In other cases, information and data may not yet be regarded as critical assets, but regulatory scrutiny may be about to change that perception.
The identification and listing of important information and data assets should include brief descriptions, the most recent owner, and a relative value. This high level overview is intended to enable discussions about assets, prioritization of work and investments, and the creation of general policies. It should not be confused with the detailed, time-consuming asset management inventories for which records managers and librarians are trained. It is, however, the first step toward governance and “thoughtful localization and organization,” proven techniques which can later be the basis for advanced management techniques such as developing and using metadata, taxonomies, and controlled vocabularies. The overview can employ simple, existing tools such as a spreadsheet or database that can aid in analysis and produce reports.
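As one possible starting point, the overview could be kept as a small table and exported for use in a spreadsheet. The Python sketch below assumes a 1-to-5 relative value scale; the field names and sample entries are invented for illustration.

```python
import csv
import io
from dataclasses import dataclass, asdict

FIELDS = ["name", "description", "most_recent_owner", "relative_value"]

@dataclass
class Asset:
    name: str
    description: str
    most_recent_owner: str
    relative_value: int  # assumed scale: 1 (low) to 5 (high)

def to_csv(assets: list[Asset]) -> str:
    """Return the asset list as CSV text, highest relative value first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for asset in sorted(assets, key=lambda a: a.relative_value, reverse=True):
        writer.writerow(asdict(asset))
    return buffer.getvalue()

# Invented sample entries.
ASSETS = [
    Asset("Production machine logs", "Sensor data from the plant floor", "Operations", 4),
    Asset("Customer contracts", "Signed agreements and amendments", "Legal", 5),
]
```

Sorting by relative value puts the assets most worth discussing at the top of the report, which suits the prioritization conversations the overview is meant to support.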
The first goal is to initiate discussions about how information and data assets support organizational strategies and to determine what governance and management programs are needed. Governance is the exercise of control over multiple operations through accountability frameworks and priorities. It may take some time to build out all the needed policies and measurements regarding decision rights, alignment, and communication, but the discussions will get the work started. Management, which is the exercise of control over day-to-day operations, decisions, work, people, or things, will come later and will comply with governance policies.
Assign an Information and Data Governance Focal Point
Responsibility for information and data governance needs to be assigned if progress is expected, even if the organization is not ready to fund a full-scale program. A part-time person can be responsible for the information and data asset list, act as an authenticating gatekeeper for changes, and make sure that it is discussed at appropriate high-level meetings. With a little additional time, the assignee could set up and publicize a mailbox or shared site for collecting issues, ideas, and needs, compile them, and recommend projects that are worthy of investment.
The steps above are the beginning and will help to determine where effort, investment, and tools can be justified and what should be accomplished. Much additional work is needed to realize more significant competitive advantages, provide complete functional requirements for tools, and meet regulatory requirements.
Keeping in mind the principle that information is best understood and used by its primary users, governance, standardization, normalization, and coordination may be needed across departments to achieve strategic quality and integrity goals. In addition, specific, detailed, ongoing programs and organizations may be needed for information and data management, funded to evolve, grow, and change as uses, formats, and values fluctuate with business and regulatory changes.
Incrementally, over time, or as the result of concentrated, planned projects, a deeper understanding of needs can be achieved, and more advanced management techniques can be justified, funded, and adopted. Examples include more advanced techniques for asset valuation, cooperative metadata and vocabulary adoption and use, development of competitive information and data techniques, and strategic asset-based service level agreements with vendors and operating level agreements with internal groups.
In most cases, over time, it will be beneficial to incorporate principles, standards, and best practices from a variety of complementary disciplines and bodies that have found successful ways to deal with these issues – records management, information science, library science, ISO, ANSI, related industries, project management, organizational change, and the COBIT and ITIL frameworks for IT governance and management.
Watch future blog postings for more details on this subject.
Judith Gerber (guest blogger), JGG Enterprises
Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.
Heather Kotula, a long-time employee of Access Innovations, Inc., has recently been promoted to the position of DHUG (Data Harmony Users Group) Meeting and Marketing Coordinator. Heather is one of many faces of fresh, young talent at Access Innovations, and her promotion can only mean good things for the company.
Ms. Kotula started with Access Innovations in 1995, and has since filled many positions within the company and seen it grow over many years. She has worked in finance, as a project manager, office manager, and Vice President of Operations. Her versatility in her early years at Access Innovations gave her a strong background and knowledge of the company, and these have translated into new marketing ideas and skills that will propel the company forward.
Ms. Kotula has coordinated the past three DHUG meetings, and her new position within the company puts her at the forefront of new ideas for the annual DHUG gatherings. These meetings include two days of case studies and presentations and three days of software training. Access Innovations’ clients from around the world attend to present case studies, get training in the use of the Data Harmony software, network with other users, and become acquainted with the team at Access Innovations. In addition to organizing these meetings, she oversees the marketing efforts at Access Innovations. She works to keep the company up to date and to find ways of communicating the company’s products and services to the world.
“The changes in the world of information over the past 20 years are astounding,” commented Ms. Kotula recently. “Since the founding of the company in 1978, Access Innovations has been preparing and waiting for the ‘Information Age’ to arrive. We have the best tools and an unparalleled breadth of experience in taking information from archive to actionable asset.”
Heather received her Bachelor’s degree in Distributed Foreign Languages from the University of New Mexico in 1991. Before she received her degree, Heather also took German language classes at the Goethe Institut in Munich and attended the Scuola Dante Alighieri in Florence, Italy, where she received a Certificate of Fluency in the Italian language. Heather received her Master’s degree in Business Administration from New Mexico State University in 1995.
Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. The Access Innovations Data Harmony software includes automatic indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.
As an IT process and governance consultant, I see a large number of software tools for managing knowledge, information, and data. The choices between vendors and in-house development options seem endless. Evaluation can be challenging because descriptions of purpose, use, and comparative approach are often not clear, standardized, or based on research.
Recently I heard a well-known software vendor describe one of its popular and respected products as being able to deliver far beyond its designed capabilities of content tracking and retrieval across multiple platforms. It was described as also able to “accelerate, automate, and maintain compliance with core business processes.” There was no mention of the significant management work required to make this happen. Most organizations are not even close to the needed level of defined processes, policies, measurements, and organizational roles.
I also recently heard a talented technical manager describe how he developed a taxonomy for his specialty, but he was unaware of existing taxonomy tools, standards, or related taxonomies with which his proprietary tool needs to interoperate for long-term success.
I am part of the IT community where management and historical perspectives are not always adequately evaluated before a technical solution is considered. Consequently, I have seen far too many graveyards of expensive tools that never met their potential and were discarded.
My heart goes out to those who feel overwhelmed by a need to do “something” with increasing numbers of content types, formats, devices, and security or litigation risks. Growing amounts of content, vaguely written regulations and laws, and nebulous but formidable concepts like “Big Data,” “The Cloud,” and “Dark Data” add further complexity. It is understandable why an easy one-tool solution is attractive. Nevertheless, it is neither productive nor necessary to keep buying more expensive tools that will only be discarded. There are better ways of addressing the problems.
Establish Governance and Management Processes First
Establishing basic governance and management processes for knowledge, information, and data is essential for informed decisions about tool purchase, configuration, coordination, and ongoing content viability and validity. The seemingly simpler choice of following what a tool vendor suggests for processes usually complicates the enterprise business environment. Designers of general and industry tools have no way of knowing specific business details or organization strengths, which are often competitive advantages.
Use What is Already Known
Having “too much information” is not new. Humans have survived by processing large amounts of information in the subconscious brain while concentrating the conscious mind on the most pressing external business. By around 2500 BC, libraries had been established as shared repositories, marking the beginnings of thought about managing “too much information.”
Marjorie M.K. Hlava, President, Access Innovations, states in a May 27, 2013, TaxoDiary blog post, “We librarians and information specialists get to view anarchy in the universe more often than other people do. And we are the ones who have the job of putting the universe into some sort of order. With a thousand points of knowledge…”
What has been learned by facing “a thousand points of knowledge” head-on applies to the terabytes, exabytes, zettabytes, and yottabytes we now face. They all seemed boundless when first encountered, but can be bounded with thoughtful localization and organization.
During my earlier career as a corporate librarian and records manager, I learned information science models that combined thousands of years of thought with current research and technology for addressing “too much information.” A few examples paired with initial action steps follow:
- Information is best understood and used by its primary users
Define core organizational products, areas of expertise, and risks. Focus and fund knowledge, information, and data work in these important areas.
- Information is a definable, manageable asset
Define the knowledge, information, and data assets needed to produce core products and maintain expertise, dividing the assets into manageable, coordinated groups.
- Managed metadata can describe information so that it can be found and/or linked
- Controlled, agreed vocabularies in areas of specialty can greatly enhance retrieval
Define and assign organizational entities and roles to make and maintain the asset definitions, decision rights, priorities, growth needs, agreed metadata schemas and vocabularies, measurements, and reports.
- Information does not follow the rules of thermodynamics – it grows when it is used
Plan for growth, change, coordination and interoperability by using existing standards and making use of what is
Watch future blog postings for more details on this subject.
Judith Gerber (guest blogger)