Groundhog Day: Names and Recursions

February 1, 2016  
Posted in Access Insights, Featured, Taxonomy

I’m sure you’re all just like me and waiting anxiously to hear the results from Punxsutawney, Pennsylvania, whence this very day we will find out from Punxsy Phil whether spring will come early this year or we have to wait six more weeks (pro tip: In the Northern Hemisphere, it’s always going to fall on March 20th or 21st).  As ridiculous as the holiday might seem to some of us, though, there are things about groundhogs and Groundhog Day that are pretty interesting.


Photo: Aaron Silvers / CC BY-SA 2.0

Firstly, nobody seems capable of agreeing on what the rodent is called. The holiday would suggest that groundhog is the accepted term, but growing up, I always knew them as woodchucks. And there’s the well-known tongue twister (“How much wood would a woodchuck chuck if a woodchuck could chuck wood?”), which lends credence to its status as the accepted term. But depending on where one resides, the critter is also known as land-beaver, land-squirrel, rock chuck, pasture pig, and my personal favorite: whistle-pig. Some also call it a marmot, but that’s really a broader classification of the genus to which the groundhog belongs (Latin name: Marmota monax). All groundhogs are marmots, but not all marmots are groundhogs, which is plain old Taxonomy 101.

Kingdom: Animalia
Phylum: Chordata
Class: Mammalia
Order: Rodentia
Family: Sciuridae
Genus: Marmota
Species: M. monax
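
For the programmers in the audience, the "all groundhogs are marmots, but not all marmots are groundhogs" rule can be sketched in a few lines of Python. This is only an illustration: the `BROADER` mapping and the `ancestors` and `is_a` helpers are names made up for this example, and the ranks are simply the ones listed above.

```python
from typing import Dict, List

# Each taxon points to the next broader rank, as in the list above.
# The mapping and the helper names are illustrative, not from any library.
BROADER: Dict[str, str] = {
    "Marmota monax": "Marmota",
    "Marmota": "Sciuridae",
    "Sciuridae": "Rodentia",
    "Rodentia": "Mammalia",
    "Mammalia": "Chordata",
    "Chordata": "Animalia",
}


def ancestors(taxon: str) -> List[str]:
    """Walk upward from a taxon and collect every broader rank."""
    chain: List[str] = []
    while taxon in BROADER:
        taxon = BROADER[taxon]
        chain.append(taxon)
    return chain


def is_a(taxon: str, broader: str) -> bool:
    """True if `broader` appears somewhere above `taxon` in the hierarchy."""
    return broader in ancestors(taxon)


if __name__ == "__main__":
    print(is_a("Marmota monax", "Marmota"))  # the groundhog is a marmot
    print(is_a("Marmota", "Marmota monax"))  # but a marmot need not be a groundhog
```

The relationship only runs one way, which is the whole point of a hierarchy.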

While there are plenty of names for the animal writ large, there are also more celebrity groundhogs than you may be aware of; although Punxsy Phil is the most prominent, plenty of states have them. Georgia boasts General Beauregard Lee; Ohio, Buckeye Chuck; North Carolina celebrates Groundhog Day with Sir Walter Wally; and Alabama holds Smith Lake Jake to be the true authority on winter’s end. Montana has three: Warren Whitefish, Dayton Dennis, and Moose City Moses. Wiarton, Ontario has a whole festival surrounding the albino groundhog Wiarton Willie, which even features a hockey tournament.

There’s even a song about it, “Oh, Murmeltier” (sung to the tune of “Oh, Tannenbaum”) for which professor and marmot scholar K.B. Armitage of the University of Kansas has written English lyrics:

“Oh Whistlepig, oh Whistlepig,

We celebrate your famous day.

Oh Whistlepig, to you we pray

That winter soon will go away.

We like the sun and daffodils.

We’ve had too much of winter’s chills.

Oh, marmot friend, we’re warning you,

If winter stays, you’ll be rockchuck stew!”

…which is just plain weird.

Then, we have “Groundhog Day,” one of the most enduring comedy films of recent decades. In it, a meteorologist named Phil Connors (played by Bill Murray) travels to Punxsutawney to cover the Groundhog Day event. While there, he gets stuck in a recursive feedback loop, in which February 2nd is replayed over and over, while he tries to break the loop and move on to February 3rd (and get the heck out of Punxsutawney).


All comedy hijinks aside, movies are ripe for classification. Genres, while easily arguable, are the broadest way by which we classify them. In the case of “Groundhog Day,” it’s a comedy, but we also have drama, horror, etc. Sometimes, such as in this case, the classification is fairly obvious, but some films rightly belong to multiple genres, such as horror-comedies, or dramedies (a term that I personally despise, but it’s out there in common use).

Then, for some movies, we sub-classify by the film’s content or style. Film noir, for instance, isn’t a genre of its own; they’re dramas, but they’re particular kinds of dramas with a specific tone and stylistic touch. If somebody wants to watch something of that nature, it’s much smarter to search for “film noir” than to try wading through the thousands of “dramas” that have been released in the century-plus of cinema—and would thereby be returned in an online search.

But we classify movies in ways other than genre, as well. The MPAA rating system is designed to tell consumers whether the movie is suitable for their age group or comfort level. Sometimes, we classify by their overarching plot, such as the biopic, the road movie, or the coming-of-age film, independent of genre. One can classify them by country of origin, or level of the movie’s budget, or really any way at all.

But let’s go back to “Groundhog Day” and the recursive feedback loop in which the main character gets stuck. It’s funny when it happens to Bill Murray, but it can be devastating to taxonomy. Say, for instance, you have a taxonomy with a top term of Business. A sensible narrower term under this could be Risk. That could be used for any number of kinds of risk, but in this case, the taxonomist adds a narrower term of Risk Management. Under that, one could place Insurance, which easily falls under Risk Management. So far, everything looks just right.



Business > Risk > Risk Management > Insurance

Then, somebody comes along to screw around with the taxonomy, and looks at Insurance without looking at the broader terms first. If one wasn’t paying attention, it’s easily arguable that Risk Management could go under Insurance, and under that, of course, could go Risk, a primary topic of Risk Management.


When that happens, the hierarchy circles back on itself: Risk > Risk Management > Insurance > Risk Management > Risk, and around again.


Recursions of this kind are the taxonomic equivalent of what happens in “Groundhog Day,” and it’s not good, or even funny. You’ll go on forever in this loop, getting nowhere and draining system resources at an increasing pace.
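
For anyone who wants to catch this sort of loop before it ships, here is a minimal sketch of a check in Python. It assumes the taxonomy is stored as a plain dictionary that maps each term to its narrower terms; the `find_cycle` function and the sample data are invented for this illustration and are not how any particular taxonomy tool does it.

```python
from typing import Dict, List, Optional

# Hypothetical storage format: term -> list of narrower terms.
Taxonomy = Dict[str, List[str]]


def find_cycle(taxonomy: Taxonomy, top_term: str) -> Optional[List[str]]:
    """Return the first chain of terms that loops back on itself, or None."""
    path: List[str] = []   # terms on the branch currently being walked
    on_path = set()        # the same terms, for quick membership checks
    finished = set()       # terms whose narrower terms are known to be loop-free

    def visit(term: str) -> Optional[List[str]]:
        if term in on_path:
            # We have met a term that is already above us: that is the loop.
            return path[path.index(term):] + [term]
        if term in finished:
            return None
        path.append(term)
        on_path.add(term)
        for narrower in taxonomy.get(term, []):
            cycle = visit(narrower)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.remove(term)
        finished.add(term)
        return None

    return visit(top_term)


good = {
    "Business": ["Risk"],
    "Risk": ["Risk Management"],
    "Risk Management": ["Insurance"],
    "Insurance": [],
}
# Somebody adds Risk Management under Insurance without checking broader terms.
looped = {**good, "Insurance": ["Risk Management"]}

if __name__ == "__main__":
    print(find_cycle(good, "Business"))
    print(find_cycle(looped, "Business"))
```

The check walks down from the top term and remembers which terms are on the current branch; the moment a narrower term turns out to be one of its own broader terms, it reports the offending chain instead of going around forever.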

So today, we can all have a laugh at a movie, watch some hockey, and gather around to see a groundhog (or whatever you want to call it) leave its burrow, all because of Groundhog Day. But stay warm, because (spoiler alert) there are absolutely six more weeks of winter to come.

Daryl Loomis
Access Innovations

Originally posted February 2, 2015.

Conference Early Bird Discount Ends Friday

January 28, 2016  
Posted in Access Insights, News, semantic

The Enterprise Data World Conference in San Diego, CA on April 17-22, 2016 is just around the corner, and if you haven’t made the decision to attend yet, maybe the list of speakers and topics will help you make up your mind. The speakers can change the way you make decisions in our data-driven world and hopefully inspire you to use your data skills in new ways.

Among those on the impressive list of topics are Data Privacy through the Information Governance & Quality Lens with Daragh O Brien and Katherine O’Keefe of Castlebridge Associates; Developing a Modern Enterprise Data Strategy with Edd Dumbill and Colette Glaeser of Silicon Valley Data Science; and Access Innovations’ own Bob Kasenchak leading Investing in Semantics: Making the Business Case (An ROI-based Argument).

Super Early Bird discounts end January 29, 2016, so don’t wait.

Melody K. Smith

Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.

Looking Forward To DHUG: Part 2

January 25, 2016  
Posted in Access Insights, Featured


Wednesday, February 10 is the second official day of DHUG, and it kicks off with a bang.

Our keynote speaker, Dr. Moriba Jah, who hates being called “the garbage man of space,” opens the session. Dr. Jah is an astrodynamicist and one of the world’s premier authorities on space debris, which consists of anything from a chip of paint floating in the ether to a defunct, broken-down satellite. His work in monitoring, tracking, and cataloging this orbital detritus is fascinating and I can’t wait to hear what he has to say on this.

Dr. Jah is a tough act to follow, but if anybody can, it’s long-time DHUG attendee, Charlotte McNaughton, Director of Publishing Technologies at the American Society of Civil Engineers (ASCE). Charlotte spoke at last year’s DHUG about how ASCE uses Data Harmony’s XML Intranet System (XIS) as a repository for their metadata records as the back end for their Civil Engineering Knowledge Environment (CEKE). I don’t yet know what she’ll be speaking about this time around, but if last year is any indication, it’s going to be quite interesting.

Following Charlotte, we come back in house with Access Innovations’ president, Margie Hlava, who will be joined by crack programmer Dan Vasicek, to discuss an exciting development for scholarly publishers.


Unfortunately, there are issues with algorithmically generated articles that are being approved for publication. It’s not a lot of them, and it’s telling about what kind of pressure publishers are under, but it’s a problem. Fighting algorithms with algorithms, Dan has created a method for detecting these bogus submissions before they’re published. This is some really cool technology and, after Margie discusses the problem, Dan will explain how it works. I think the attendees (many of whom are in scholarly publishing, after all) are going to really like this one.



Up next is Justin Francom, a programmer with Innoventrum, whose subsidiary, Find-A-Code, we have partnered with to create IntegraCoder, a unique medical coding software application that enriches clinical documentation and helps medical practices get paid more efficiently. Justin, who I work closely with, will discuss the innovative ways that he integrated Find-A-Code’s medical coding data sets with the Data Harmony software to create what is truly a fantastic tool.



As the day begins to wind down, we’ll welcome Cindy Acton of Cargill to the stage. Cindy attended DHUG last year, and has graciously agreed to explain to the audience how Cargill implemented Data Harmony in their content management tool, setting her on a journey to overhaul a taxonomy that contains over 9,000 terms. Just hearing how she keeps all that straight will be extremely interesting to me, and learning how she works with Data Harmony should be very enlightening for all the attendees.



Our final “official” speaker at DHUG will be our very own Vice President of Communications and Marketing, Heather Kotula. Heather will discuss an important and misunderstood aspect of what we do at Access Innovations: the return on investment. As I wrote last week, it’s pretty clear to most people who are using our software how it saves time and, in turn, money. It’s those who dole out that money who sometimes have difficulty understanding the ROI. It’s easy to understand why that’s the case, as information infrastructure is a much more esoteric concept than the bottom line and, if the value of that initial investment isn’t readily apparent, they’re not going to spend the money. Heather’s talk will give attendees strategies to cogently explain the concept and how, by implementing information infrastructure, a business will soon recoup that investment.


But after Heather’s talk, Wednesday’s not quite done. To close the show, Margie will return to the stage to answer questions from the audience (I’m sure there will be tons), address feedback both on the meeting and the software, and discuss the directions that Access Innovations will be going in the future. This is always interesting to hear about, since many of the changes that happen with Data Harmony during the year come directly out of this aspect of DHUG and we really can’t appreciate user feedback enough.


Thursday will be a little less busy, as we will say farewell to most of the attendees after Margie’s final address, but a few will stick around for a day of individual meetings and training. The personal nature of these dedicated sessions is great for education, of course, but it also helps us to build relationships with our users.

If you’d like to join us in sunny Albuquerque for a couple of days of highly interesting discussion, there’s still space, so sign up for DHUG and find out the myriad ways that our software is implemented. I can’t wait.

Daryl Loomis, Business Development
Access Innovations

Looking Forward To DHUG: Part 1

January 18, 2016  
Posted in Access Insights, Featured

It’s that special time of year again, the snowcapped mountains and chilly air a reminder that spring is still a few weeks away.  It’s also a reminder that something else is just around the corner, as well: the 2016 Data Harmony Users Group (DHUG) meeting. It’s the most exciting time of the year here at Access Innovations and 2016 is going to be the best year yet.

As usual, DHUG will bring our fantastic users together in one place for a few days of networking, training, and discovery, but we’re doing things a little bit differently this year. First, we’ve consolidated the meeting into a two-day core conference, with a day each of pre-conference and post-conference events for our most hardcore users.

It all begins Monday, February 8, with our pre-conference training sessions. These sessions are designed to introduce users to taxonomic principles and get them familiar with the Data Harmony software. As someone who was brand new to the whole world of taxonomies at last year’s DHUG, I can personally attest to how phenomenally informative these sessions are. They’re not just for newbies, though; these sessions are great for anyone in need of a refresher.


Moreover, it’s a great primer for Tuesday, when we officially kick off DHUG. Margie Hlava, founder and president of Access Innovations, opens the show with an update on the new features in Data Harmony. Most of these features come directly from needs and desires of our clients, many of whom are right in the room with us. It’s great to see the satisfaction that comes with having one’s ideas put into action; it satisfies their needs and it makes our software better.



Next up (at least currently, as times and days are subject to change), from the International Society for Optics and Photonics (SPIE), we present Tim Lamkins, who will discuss how SPIE implemented a taxonomy to install a tagging and feedback loop into their submission system. Their goal, to attain business intelligence for use in conference proceedings, further highlights the analytic capability of our software and I’m really excited to see how it works.



Following Mr. Lamkins, we have Dee Magnoni, the Research Library Director at the Los Alamos National Laboratory. She’s going to talk about one of the biggest issues currently facing research libraries throughout the world: access to content. This will be a very interesting talk, as both libraries and publishers (well-represented groups at DHUG) have a stake in this game.



Our own esteemed Director of Business Development, Bob Kasenchak, is up next to talk about data visualization using Tableau (a data visualization application), which ingests taxonomies and, using content metadata, delivers highly useful and interactive graphs for what is emerging as something of a theme in data analysis.


Bob will be followed by Jabin White from ITHAKA/JSTOR with a presentation on making optimal use of one’s taxonomy investment. The ability to show the return on investment of a taxonomy is an extremely important subject because, while that might be perfectly clear to information workers, the people with hands on the checkbook might not be so quick to get it.



Next, we have long-time DHUG attendee Xi Van Fleet from the American Society of Civil Engineers (ASCE), who is presenting on how ASCE has worked closely with Access Innovations to adapt their thesaurus into an author expertise taxonomy. This will be especially interesting because of the unique work that has gone into fulfilling ASCE’s needs.


Following that, we present Jeffrey Gordon from dataCloud, who will present his dataStream web platform to the audience. Designed to ingest massive amounts of data, it works with live or fast data to deliver solutions to big data problems quickly for complex analysis (not to mention it looks super cool).


Our final two presentations of the day both feature Rachel Drysdale from the Public Library of Science (PLOS). In the first, she will be joined by her colleague, Helen Atkins, and they will discuss the PLOS subject area database, which uses Data Harmony to index their gigantic amount of content. Rachel will describe the stages of the project, while Helen will describe the use cases for the database. After that, Helen will leave the stage and Bob Kasenchak will sneak back on to discuss, alongside Rachel, the linked data proof of concept (POC) to add DBpedia links to the PLOS thesaurus.

Just because the presentations are done for the day, it doesn’t mean the day is done. We cap off Tuesday’s session with our networking dinner, which takes place at a different venue every year. This time is most exciting, as it will be at the Unser Racing Museum, a multi-dimensional, interactive museum experience about the world of racing. I’ve never been, so I’m especially looking forward to it (plus, green chile chicken cordon bleu…yum!).

Wednesday’s session is just as packed, but I’ll discuss this next week.

Daryl Loomis, Business Development
Access Innovations

Linked Data is a Hot Topic

January 14, 2016  
Posted in Access Insights, metadata, News

Everyone can use a little help now and again. Bob Kasenchak from Access Innovations will be facilitating a free webinar on January 22, 2016 at 3:00 p.m. EST that is titled, “Make Your Thesaurus Smart! with Linked Data.”

Why is everyone talking about Linked Data? What is it — and what does it have to do with my taxonomy? Bob Kasenchak will answer that and discuss how Linked Open Data concepts can be used in conjunction with your thesaurus to create dynamic web pages, offer relevant content from other sources, add value to research portals, and transform the Internet into a giant database for satisfying queries.

This talk offers a short introduction to the concepts in play (and the conundrums they introduce), continues with a brief technical explanation of the processes involved, and concludes with examples of the cool stuff made possible by this technology. 

Register here today.

Melody K. Smith

Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.

Access Innovations and IntegraCoder Release New Evaluation and Management Software Application

January 11, 2016  
Posted in Access Insights, Featured, indexing, metadata

Access Innovations, under subsidiary Access Integrity, and Find-A-Code are pleased to announce that they are ready to launch eMDoc, an all-new evaluation and management (E&M) medical coding tool through IntegraCoder, their combined medical record analysis system. Providers using IntegraCoder in one of their integrated electronic health record (EHR) companies will automatically have access to eMDoc, their robust evaluation and management solution.

eMDoc leverages Access Innovations’ patented Data Harmony software and Find-A-Code’s exhaustive medical coding datasets further than ever before, giving providers the ability to record evaluation and management data into an EHR at any time during the patient encounter. Like IntegraCoder, eMDoc automatically reads the EHR for context regarding evaluation and management, then returns a scorecard with E&M coding suggestions. As providers proceed through encounters, they will continuously receive an ongoing account of their level of service to the patient.

“We’ve been hearing the pain in the industry about E&M for a long time,” says John Kuranz, CEO of Access Integrity. “E&M calculation is frustrating for even the most experienced medical billers and coders and I’m thrilled with the hard work put in by our teams to allow IntegraCoder the ability to tackle this problem.”

“For too long, developers have tried and failed to solve the E&M problem, but the combined effort to bring eMDoc to the table has been outstanding,” explains Find-A-Code CEO LaMont Leavitt. “Adding the capability to score an encounter while the encounter is still happening is a genuine breakthrough in the billing and coding space. An E&M product designed specifically for providers fills a much needed hole in the industry.”

IntegraCoder is a web-based solution combining the technologies of Access Integrity and Find-A-Code. Access Integrity’s engine analyzes content in EMRs and provides highly relevant diagnosis and procedure suggestions, while Find-A-Code provides exhaustive coding and revenue cycle/denial management resources. IntegraCoder’s indexing system recognizes key concepts within an EMR and delivers suggested codes for users to select from, which increases accuracy in clinical documentation.

For more information about eMDoc or to schedule a demo of the software, visit the IntegraCoder website at


About Access Innovations, Inc.

Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.


About Access Integrity
Access Integrity provides a patented technology for complete and compliant EMR analysis. Access Integrity plays an important role in medical transaction processing by extracting rule based relevant data (concept extractor) from medical records, increasing coding accuracy, clinical decision support, and overall understanding of a patient encounter. Access Integrity is the first company to employ Data Harmony’s semantic enrichment and rule-based concept extraction technology in the healthcare industry. The award-winning and world-renowned Data Harmony software suite has been used in the content management and information technology industries for more than 15 years.


About Find-A-Code, LLC

Find-A-Code, LLC is dedicated to providing the most complete medical coding and billing resource library available anywhere. Find-A-Code’s online libraries include extensive information for all major code sets (ICD-9, CPT®, HCPCS, DRG, APC, NDC, ICD-10 and more) along with a wealth of supplemental information such as newsletters and manuals (AHA Coding Clinic®, AMA CPT Assistant, DH Newsletters, Medicare Manuals). All code information and newsletter databases are indexed, searchable and organized for quick access and extensively cross-referenced. Find-A-Code also provides tools for code set translation (such as ICD-9 to ICD-10), code validation (edits) and claim scrubbing.

Register Today for the Data Harmony Users Group Meeting

January 4, 2016  
Posted in Access Insights, Featured

It is not too late to register for the twelfth annual Data Harmony Users Group (DHUG) meeting, scheduled for February 9-10, 2016 at the Access Innovations, Inc. offices located at 4725 Indian School Road NE, in Albuquerque, New Mexico.

Access Innovations, the company behind the Data Harmony product line, hosts the annual meeting. Case studies presented by Data Harmony software users make up most of the program, and most attendees indicate that these are the primary reason they attend DHUG.

“Based on feedback from the 2015 meeting, we have altered the schedule for DHUG 2016,” said Heather Kotula, Director of Communications at Access Innovations. “We are a client-driven organization and our users’ needs reflect the new arrangements, the same as updates and new features in the Data Harmony software.”

Monday, February 8, attendees will meet at the Access Innovations office, where introductory training topics will be covered. “Having the training at our office makes it possible for everyone attending to have a hands-on experience with the software,” remarked Win Hansen, Director of Production Services. “Whether they are experienced users, new users, or still considering Data Harmony software for their organization, this approach is the most beneficial.”

Tuesday and Wednesday will feature a combination of case studies presented by users and presentations by Access Innovations and Data Harmony staff. The Welcome and Features Update by Margie Hlava, president of Access Innovations, normally a three-hour presentation on the second day of the meeting, will now be split between the two days. The Welcome portion will be presented in the morning on Tuesday. The Features Update will be combined with the wrap-up on Wednesday afternoon. Sessions on these two days will be held in the Mega Classroom at NMSU, adjacent to the Access Innovations office.

Thursday, February 11, is a full day of in-depth training, also held in the Access Innovations office. “The segments on this day are geared toward users who will work with the admin module, importing and exporting data, and performing batch processes,” explained Jack Bruce, Senior Taxonomist. “We will also cover information about XIS, our XML database system.”

Meeting registration includes a networking reception at the nearby Hampton Inn on Monday evening and dinner on Tuesday evening. The Tuesday evening dinner will be held at the Unser Racing Museum, which uses modern technologies to educate and immerse visitors in the exciting world of racing.

“By hosting the meeting at the Access Innovations home office, we are able to make the entire staff available to discuss technical and tactical issues,” remarked Bob Kasenchak, Director of Product Development. “It also fosters better communication among the various parties, and we are able to identify and solve issues much faster.”

To register for the meeting please visit the DHUG registration page.

Data Harmony software users are encouraged to submit case study abstracts to present at the meeting. The submission form uses the Data Harmony Smart Submit software module.

For information about planning a trip to Albuquerque for the meeting, go to

Our Top Information Industry Trends of 2015

  1. The Demise of the “Physical Book” has been exaggerated
  2. The Information Industry has no single central meeting
  3. Taxonomies are used in websites but not in search
  4. Analytics are booming
  5. Changing information landscape
  6. Google Scholar
  7. Semantics, Linked Data, and the cloud
  8. Boomers are not retiring

The weather in Albuquerque is cold and overcast with a chance of snow for Christmas day: great weather to reflect on the trends we saw in 2015 and what’s ahead for 2016. This review focuses narrowly on things that affect Access Innovations’ view of the world and where they fit within the information industry we enjoy.

  1. The Demise of the “Physical Book” has been exaggerated: print is on the rise 

At Techonomy 2010, Nicholas Negroponte famously announced that “The physical book is dead.” In fact, five years later, the e-book is declining, while print books have witnessed an incredible double-digit increase this year. Some surprise sleepers leading the sales include adult coloring books (I just got a couple myself!).

  2. The Information Industry has no single central meeting or gathering point

Associations are important parts of our professional engagement, and are often learned publishers as well, which makes them our customers. The world of professional associations is under many different kinds of pressure. Attendance is down at library meetings, but booming in technical fields. Formerly crucial conferences, such as International Online (dead) and SLA (withering), no longer provide a gathering of all members of the industry in a single spot.

As this market has fragmented, we find ourselves attending more, not fewer, meetings: an alphabet soup of smaller meetings such as STM, ALPSP, PSP, SSP, CES, CESSE, NFAIS, SIIA, etc. InfoToday meetings are split into a confusing array of verticals such that you don’t even know who else is there. Other conferences like DataVersity, Predictive Analytics, and SemTechBiz seem to change their names yearly; it’s hard to tell whether it is the same group of attendees.

  3. Taxonomies are used in websites…but not in search

Metadata needs to be included in the fundamental design for search to work really well.

Search has been king…but is back to being a pauper. Precision and recall gave way to “relevance” (defined as “my guess that this result is what you want returned!” or, more cynically, “This is what we think is best for you”). Those who operate on the search kernel fundamentally cannot believe that words should have controls added to disambiguate them. Search should be smart enough to incorporate word differences, but without taxonomies and other vocabulary control options it is not.

The Dialog search system was optimized for metadata from the ground up, but implementations of, for example, Lucene usually add a taxonomy after the fact, when the implementation is already done. The taxonomy, the inclusion of which was likely insisted upon by the librarian or webmaster, is just an annoyance to be dealt with.

Web interfaces are doing an increasingly good job of mitigating the need for training to effectively use their underlying systems. Often, parametric or fielded search options are included in the facets right in the web interface. Most of the well-fielded search implementations are on SQL, Oracle, or Endeca platforms. Search is still broken; it still needs metadata.

Search is still lost in general. It is incredibly difficult to get those who generate the search kernel to believe that vocabulary control is needed, and in fact produces measurably better search results (it goes against the core training they get in computer science school), so the work-around ends up being to embrace the taxonomy on the web site.

A few scholarly publishing/research sites actually use taxonomy terms in the search (for both type-ahead and for the first inverted file search) on the technical side. Those sites are stickier: clients love them, as they deliver precise results without having to go to Google Scholar to find the papers they want. Of course, Google Scholar itself does not have much of a taxonomy: just 260 terms to cover the world of content on their site, and it is amazingly well hidden from view.
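
To make the idea concrete, here is a rough sketch in Python of what vocabulary control adds to a search box. Everything in it is hypothetical: the tiny `SYNONYMS` table stands in for a real thesaurus, and `preferred_term` and `type_ahead` are names invented for this illustration, not functions from Data Harmony, Lucene, or any other product.

```python
from typing import Dict, List, Optional

# Hypothetical thesaurus fragment: entry term (as typed) -> preferred term.
SYNONYMS: Dict[str, str] = {
    "groundhog": "Groundhogs",
    "woodchuck": "Groundhogs",
    "whistle-pig": "Groundhogs",
    "marmot": "Marmots",
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(text.lower().split())


def preferred_term(query: str) -> Optional[str]:
    """Map whatever the user typed onto the controlled vocabulary, if possible."""
    return SYNONYMS.get(normalize(query))


def type_ahead(prefix: str, limit: int = 5) -> List[str]:
    """Suggest preferred terms whose entry terms start with the typed prefix."""
    wanted = normalize(prefix)
    if not wanted:
        return []
    hits = {pref for entry, pref in SYNONYMS.items() if entry.startswith(wanted)}
    return sorted(hits)[:limit]


if __name__ == "__main__":
    print(preferred_term("  Whistle-Pig "))
    print(type_ahead("w"))
```

Whether the searcher types woodchuck or whistle-pig, the query lands on the same preferred term, so the first inverted file lookup can be run against one controlled term instead of guessing at word variants.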

An increasing number of scholarly and research web sites are leveraging taxonomies to excellent effect. Some of our clients’ sites include:

The JSTOR Labs Sustainability Portal

The Public Library of Science (PLOS) Topic Browser

The Centers for Disease Control and Prevention’s National Institute for Occupational Safety and Health (NIOSH) Mining Portal

American Institute of Physics’ Scitation Platform

  4. Analytics are booming

Predictive text and search analytics are rapidly moving fields. Analytics help corral the onslaught of incredible amounts of information (“big data” in some cases) and translate the results into pictures or visualizations of the data in ways that are easy to understand by putting together trends and diverse data sets into meaningful, actionable data. End users are using linking and dashboard tools to uncover new trends, and content providers use them for market research and management.

Included in the area of analytics is data mining, which brings into focus large bodies of information but also raises the often conflicting issues of privacy, security, and freedom of expression; for example, balancing the detection of signs of terrorism with common sense. Everything is taken to its logical extreme; moderation and goodwill are lost.

  5. Changing information landscape

Libraries are disappearing, information storehouses and archives are appearing, corporate information is still hard to find, and knowledge walks out the door when employees leave. In the meantime, we can look up anything during a dinner conversation (just ask Siri, Cortana, Alexa, or Google!). The need for printed references is declining, as Siri (and her companions) can find things on the web and point you to valid links.

During the 1970-1995 time period we attended many meetings that focused on looking for ways to find the “Elusive End User”. Now everyone is a consumer of digital information, and the demand/need for information is increasing rapidly. Barriers to information access come and go—which creates demand.

For example, access to medical services and first-hand medical information was restricted by health care providers, so the Internet has stepped in to meet that need; some of that information is of high quality, but much is misleading. In October of this year, the U.S. adopted the ICD-10 for coding diseases on, among other forms, medical insurance claims forms, years after the rest of the world. The CMS (Centers for Medicare and Medicaid Services) has also mandated that more information be made available to patients by providing relevant content at the time of service. The flip side is heavy fines and slow cash flow for providers who do not accurately code using the new system.

  1. Google Scholar comes of age

At first, Google Scholar terrified scholarly publishers. Now, however, they strive to ensure that their articles are all indexed by the Scholar crawlers for maximum exposure. While Impact Factors for authors and publishers are still important, the trend seems to be to strive to get your publications among the top results in Google. The new game is to use Google as a springboard to more content: surface the data in Google, and lead the user to a deep dive on the publisher’s site.

  1. Semantics, Linked Data, and the Cloud 

“In the beginning was the word…” Without the words there is no way to express a thought. Words express meaning. Words are the semantics. In Pygmalion, Professor Higgins notes that “the moment an Englishman opens his mouth another Englishman despises him” for his manner of expression and his accent. On top of that, we now have incredible limits on expression applied by political correctness and the thought police. So individual expression becomes guarded and coded, and moves underground.

Taxonomies make semantic inferences much more reliable. Disambiguation of terms and gathering of synonyms has to be done within context; once the context is established, inferences can be reliably drawn. The sentence “George lives in London” makes no sense without knowing which London and which George is meant.
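The point about context can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate concepts, the context words, and the `disambiguate` function are all hypothetical, and a production indexing system would use far richer rules than simple word overlap.

```python
import re

# Hypothetical mini-vocabulary: one ambiguous label with several candidate
# concepts, each carrying context words that point toward it.
CANDIDATES = {
    "london": [
        {"concept": "London (England)", "context": {"england", "thames", "uk", "britain"}},
        {"concept": "London (Ontario)", "context": {"ontario", "canada"}},
    ],
}


def disambiguate(label, text):
    """Return the candidate concept whose context words overlap the text most.

    Returns None when no context word is present, meaning the label
    stays ambiguous and no inference should be drawn.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_score = None, 0
    for candidate in CANDIDATES.get(label.lower(), []):
        score = len(candidate["context"] & words)
        if score > best_score:
            best, best_score = candidate["concept"], score
    return best


# Without context, the sentence cannot be resolved.
print(disambiguate("London", "George lives in London"))
# With context, one candidate wins.
print(disambiguate("London", "George lives in London, Ontario"))
```

The first call has nothing to go on and returns no concept; the second picks the Ontario sense because the surrounding text supplies the context.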

Linked Data (and Linked Open Data) are becoming much more prevalent because of the context and underlying data structures and definitions they offer. Open vocabularies and datasets, and the links between them (which clarify the terms used), are increasingly available.

All of this interlinking of data is enabled by the ability to link things via the universally available Internet. Whether a closed corporate or government system, open web, or some combination, entire systems are being moved to the “Cloud,” that place where any information object can be reached with a URL (and perhaps appropriate access and permissions). We have seen a massive migration to Cloud access from installed systems over the last year, more than double what we saw in 2014. I believe this trend will accelerate.

  1. Boomers are not retiring.

It isn’t just that they are getting older and might not have enough money; they really don’t want to give up and transition out of the workforce. They like what they are doing, feel like they have a positive contribution to make, and would like to continue to do so.

All of the rules for ADA compliance that governments and organizations have been implementing over the last few years have enabled people with age-related disabilities to continue to work.

In the meantime, some 92 million “Millennials” are entering the workforce with different mindsets, work approaches, and information-gathering methods.

While the 77 million in the Baby Boomer class maintain some of the old approaches to information, the new, born-digital set has different expectations…and both have the same informational needs.

The workplace is changing. Seniority is being replaced by ability. Increasingly, capability rather than age determines advancement. While politicians campaign on income equality, the workplace is adjusting to a whole new playbook based on what people of any age contribute to the products and services of the organization. We are moving to a sharing economy, immediate information access, and constant social interactions. This means paywalls are avoided, ads are tolerated, and privacy is not a concern, but identity theft is. In these times of global strife, continuing economic uncertainty, and technological change, the workforce may well become more like the Millennials: fast-adapting, flexible, and innovative.

2016 is a whole new world! We look forward to it.

Marjorie Hlava, President
Jay Ven Eman, CEO
Access Innovations, Inc.

The Ghosts of Concepts Future

December 21, 2015  
Posted in Access Insights, Featured, Taxonomy

The bell struck twelve.


The Phantom slowly, gravely, silently, approached. … It was shrouded in a deep black garment, which concealed its head, its face, its form, and left nothing of it visible save one outstretched hand. But for this it would have been difficult to detach its figure from the night, and separate it from the darkness by which it was surrounded.

(From A Christmas Carol in Prose; Being a Ghost Story of Christmas by Charles Dickens.)

And so it is with emerging concepts, those concepts whose forms we can but vaguely discern at the present point in time, whose true reality lurks in the future.

As taxonomists, we have a responsibility to discern those future concepts, although they may still be invisible to most. When expressions of those concepts turn up in search logs, we can keep them from being rejected as vocabulary candidates simply because they still appear infrequently. In a taxonomy or thesaurus, we can provide labels that will consolidate the indexing for a concept for which researchers have not yet settled on a name. In some cases, especially with widely used vocabularies, we can perhaps determine the name by which a concept will be known on a standard basis.

This role in itself is one of the emerging responsibilities for taxonomists, thanks to the rapid advances in science and technology. In “What Next, Taxonomy?” (posted on The Taxonomy Blog on November 4, 2011), taxonomist Marlene Rockmore concludes that taxonomists need to deal with emerging technologies in a variety of ways, including collection of relevant content:

“So what next, taxonomy? What is nice to hear is that more taxonomists are surviving because their organizations understand their core roles. What’s the emerging topics and challenges –  how to distribute and decentralize (localize)  while having authority and control, how to collect new content on emerging, current topics, visualization, how to be more agile, how to fit in with new technologies like social media, mobile, and big data. Phew! That’s a challenge. Taxonomists have a chance to build relationships not only between terms, but with stakeholders on the way to a compelling, visualized, multidimensional content strategy. Good luck.”

This challenge has been growing in step with the rapid advances in science and technology. One example among the many advances in science is the ability of biologists to recognize new and emerging species, as well as life forms that have existed for a while but were formerly overlooked. The Live Science page Newfound Species observes:

“Science has identified some 2 million species of plants, animals and microbes on Earth, but scientists estimated there are millions more left to discover, and new species are constantly discovered and described. The most commonly discovered new species are typically insects, a type of animal with a high degree of biodiversity. Newly discovered mammal species are rare, but they do occur, typically in remote places that haven’t been well-studied previously. Some animals are found to be new species only when scientists peer at their genetic code, because they look outwardly similar to another species — these are called cryptic species. Some newfound species come from museum collections that haven’t been previously combed through and, of course, from fossils.”

Even the humble hosta has its own emergings, due in part to technological and social advances in communication.


A Rookie’s Guide to Hostas, Hostas, Hostas observes:

“In past centuries, we used to talk about people “discovering” new species of plants. What this usually meant was that European, English or American plant explorers traveled to remote parts of the world and found plants that were new to them. Now, of course, we know that local people in those other parts of the world were often quite familiar with these plants all along. Many of the so-called new plants, including hostas, have been found in local paintings and documents produced long before the Westerners started poking around. In more recent times, however, with better communications, we more universally share the knowledge of different horticultural communities.”

As far as actually emerging species are concerned, evolutionary biologist Rob DeSalle of the American Museum of Natural History has indicated the continuing nature of species emergence:

“Identifying a new species as it emerges is the holy grail of evolutionary biology. … Species must be emerging someplace on earth. The best places to look would be places with lots of species, like rain forests, and islands, because isolation opens new niches.”  (In “Q & A; Emerging Species” by C. Claiborne Ray, published June 17, 2003 in The New York Times)

The ScienceDaily website has a webpage dedicated to news about “new” species of plants and animals. While most of these will escape public awareness, Time Magazine has sifted through the barrage of information to identify the “Top 10 New Species” of 2014. According to author Bryan Walsh, “The collection includes a dragon tree, a skeleton shrimp, a gecko and a microbe that likes to hang out in the clean rooms where spacecraft are assembled.”

Speaking of top things of 2014, and moving on to emerging technologies, the Massachusetts Institute of Technology’s online Technology Review has published a list of “10 Breakthrough Technologies 2014.” The list includes such things as brain mapping, genome editing, and agile robots.


The Wikipedia article “Emerging technologies” emphasizes the role of technology convergence in the emergence of new technologies. The article mentions an acronym of particular interest to those in the information technology world:

“NBIC, an acronym for Nanotechnology, Biotechnology, Information technology and Cognitive science, is currently the most popular term for emerging and converging technologies, and was introduced into public discourse through the publication of Converging Technologies for Improving Human Performance, a report sponsored in part by the U.S. National Science Foundation.”

Wikipedia also has a “List of emerging technologies” containing brief descriptions of “some of the most prominent ongoing developments, advances, and innovations in various fields of modern technology.” More than two hundred emerging technologies are listed.

There are and will continue to be many new and emerging concepts in science, technology, and other fields. Taxonomies can help define the terminology for those concepts. This is perhaps most readily evident for genus-species-subspecies-etc. names, whose designation is the territory of the biological taxonomist, or the biologist temporarily acting as taxonomist. Elsewhere, taxonomists can identify predominant labels and the occasionally used synonyms, and then use that information to add appropriate preferred terms and non-preferred synonyms to a vocabulary. They can also add definitions and scope notes. The skills of the taxonomist can bring clarity to formerly mysterious concepts and nomenclature. 

No fog, no mist; clear, bright, jovial, stirring, cold; cold, piping for the blood to dance to; Golden sunlight; Heavenly sky; sweet fresh air; merry bells. Oh, glorious! Glorious!


So don’t be scared of the ghosts of future concepts. Think of them as true spirits of the future, taking flight with the benefit of well-chosen terms and synonyms in a taxonomy or thesaurus.

Every time a new term rings true, an emerging concept gets its wings.



Barbara Gilles

Originally posted December 30, 2013.

Using Taxonomies to Organize Associations

December 15, 2015  
Posted in Access Insights, Featured, Taxonomy

as·so·ci·a·tion  (ə-sō′sē-ā′shən)

We use taxonomies and ontologies to organize document collections. But controlled vocabularies (of a sort) are also used to organize the aisles in a grocery or hardware store, clothes in the closet, and your kitchen. In the last two cases, the subject matter expert (SME) is definitely you!


If you are more comfortable shopping in a Target instead of a Walmart, it is probably due to the way in which they’ve organized their collection of merchandise. Barnes & Noble and Borders bookstores had significantly different ways to organize their books and other offerings. I loved one, but could not easily find my way around the other.

We frequently organize document collections for associations (organizations of learned and scholarly publishers). Occasionally, they ask us to organize the governance layer of the society as well. What they want us to do, of course, is take all the committees, special interest groups, divisions, chapters, and communities of interest or practice and add them to the taxonomy for use in navigation on their website. To do so, we look at the content to be indexed that’s relevant to that section of the society to ensure proper tagging.


Our philosophical bent at Access Innovations is to build a term record for every term in the taxonomy (or thesaurus or ontology). That means a small (usually) database of terms; their broader, narrower, and related terms; aliases (synonyms); and perhaps definitions, scope notes, or other links. The terms are used to tag the content, whether they are HTML pages, articles, book chapters, memorabilia, meetings, minutes, etc. We are often asked to provide a “full path” export showing exactly where in the taxonomic hierarchy the term itself resides, and we do. But we know that searchers do not ask for the full path; they ask for the term in that tiny little search box. Thus, we tag at the term level—and each term needs to stand on its own as a potential search term. The meaning of the term should not be inferred from its place in the hierarchy, since the searcher (1) usually has no idea where it resides taxonomically, and (2) doesn’t really care; what they want is the appropriate content.
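As a rough illustration of what a term record and a “full path” export might look like, here is a minimal Python sketch. It is not how any particular thesaurus software stores its data; the `TermRecord` fields, the `full_path` and `find_term` helpers, and the sample terms are all assumptions made for the example. It also assumes each term has a single broader term when building the path.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TermRecord:
    """One hypothetical term record: the term plus its relationships."""
    label: str
    broader: List[str] = field(default_factory=list)
    narrower: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)
    aliases: List[str] = field(default_factory=list)  # non-preferred synonyms
    scope_note: str = ""


def full_path(label: str, records: Dict[str, TermRecord], sep: str = " > ") -> str:
    """Walk up the first broader term of each record to build the full path."""
    path = [label]
    while True:
        parents = records[path[0]].broader
        if not parents or parents[0] in path:  # top reached, or a loop guard
            break
        path.insert(0, parents[0])
    return sep.join(path)


def find_term(query: str, records: Dict[str, TermRecord]) -> Optional[TermRecord]:
    """Resolve what a searcher typed to a record, by label or alias,
    without any knowledge of where the term sits in the hierarchy."""
    wanted = query.strip().lower()
    for record in records.values():
        names = [record.label] + record.aliases
        if wanted in (name.lower() for name in names):
            return record
    return None


# Illustrative sample data only.
RECORDS = {
    "Rodents": TermRecord("Rodents", narrower=["Marmots"]),
    "Marmots": TermRecord("Marmots", broader=["Rodents"], narrower=["Groundhogs"]),
    "Groundhogs": TermRecord(
        "Groundhogs",
        broader=["Marmots"],
        aliases=["Woodchucks", "Whistle-pigs"],
        scope_note="Use for Marmota monax.",
    ),
}

print(full_path("Groundhogs", RECORDS))
print(find_term("whistle-pigs", RECORDS).label)
```

The lookup works on the term and its aliases alone, which mirrors the point above: the full path is available for export, but the searcher never needs it.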

Over the course of organizing content for many associations and societies, we are often able to offer an interesting sidelight: we learn the organization well, but only from its content. We are not experts in the field, nor are we active members of the organization. We can read the history of how it started, why it is different from other organizations, and what is so special about it that made many people come together to form the society in the first place. However, by building the taxonomy we get a snapshot in time: we see the content and organization as it is today. This interesting perspective has led us to see where the society is and what it has become, not what it was. It gives a fresh perspective on how the organization is really organized, what it actually covers, and, based on recent activity, in which direction the association is going. This provides a solid foundation for future scopes and long-range planning for organizations.

Visualization of the data provides the present communities of interest and links to the other communities within the organization. Add the time and date of the publication of each piece of content and it also shows the trending directions for each topical area.
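The trend idea can be sketched by counting tagged items per term per year, which is the table a chart would be drawn from. The item structure (`published`, `terms`) and the sample data are assumptions for illustration only.

```python
from collections import Counter
from datetime import date


def counts_by_term_and_year(items):
    """Count how many items carry each taxonomy term in each publication year."""
    counts = Counter()
    for item in items:
        year = item["published"].year
        for term in set(item["terms"]):  # count a term once per item
            counts[(term, year)] += 1
    return counts


# Hypothetical tagged content.
ITEMS = [
    {"title": "A", "published": date(2014, 3, 1), "terms": ["Linked data"]},
    {"title": "B", "published": date(2015, 6, 9), "terms": ["Linked data", "Analytics"]},
    {"title": "C", "published": date(2015, 11, 2), "terms": ["Linked data"]},
]

counts = counts_by_term_and_year(ITEMS)
for (term, year), n in sorted(counts.items()):
    print(term, year, n)
```

Comparing a term's counts across years gives a simple view of whether a topical area is growing or fading.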

As we build out the governance layer (how the organization fits together) we depend on people who know the organization and the published guides about how it works. If we did not, we might organize it in a very different way based on what they actually do today, which would be an uncomfortable experience for those who know and are active in the association. Just like going into a bookstore which is arranged differently than you think it should be, the arrangement of the taxonomy for an organization needs to reflect how the organization thinks of itself. The other way of looking at it (solely from the content data) often does not reflect how the organization wants to be seen; it could, however, be an excellent strategic planning asset to use the taxonomy for this purpose. Sometimes a taxonomy is used exactly that way for a look at the future. If you are a member of an association, how would you go about building a taxonomy for the organization, and then applying it to the governance layer in order to secure a bright future?


Marjorie Hlava, President
Access Innovations
