Choosing Terms for a Taxonomy or Other Controlled Vocabulary

September 21, 2015  
Posted in Access Insights, Featured, Term lists

Original source: First Edition of the Oxford English Dictionary. Image redrawn by User:DavidPKendal after the diagram by James Murray, first editor of the OED.


Recently, we’ve looked at choosing controlled vocabulary terms. More specifically, we’ve considered choosing related terms, choosing non-preferred terms, and choosing broader and narrower terms. In this final installment of our “choosing terms” series, let’s broaden our scope to the task of choosing terms for inclusion in a vocabulary. And once again, let’s consult our usual trusty guide. That would, of course, be the Z39.19 standard (ANSI/NISO Z39.19-2005 (R2010)), “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies”.

Yes, there it is. Section 6.1 of Z39.19 covers “Choice of Terms”. But wait. It’s only two inches long. There must be more to choosing terms than that. In fact, the first paragraph states, “Many issues need to be considered in selecting terms for a controlled vocabulary.” While that’s absolutely true, could this be a cop-out by our trusty guide?

And then, there they are — the cross-references:

  • The information space or domain to which the vocabulary will be applied (section 11.1.1)
  • Literary, user, and organizational warrant (section 5.3.5)
  • Specificity or granularity of the terms (section 11.1.7)
  • Relationship with other, related controlled vocabularies (section 10.9)

Let’s identify and examine the relevant passages from each of these sections.

The first cross-reference, regarding “The information space or domain to which the vocabulary will be applied”, is to section 11.1.1. Strangely enough, that section is headed “Avoid Duplicating Existing Vocabularies”. There doesn’t seem to be anything in that section that’s directly relevant to our topic. Anyway, most of us know (or can guess) that a controlled vocabulary for a particular subject area domain should include the terminology of that domain, and perhaps some of the terminology of peripheral subject areas, and not go too far astray from the core subject areas.

Looking elsewhere, we find that section 6.6.1 (Usage) has some advice relevant to vocabulary domains vis-à-vis term selection: “Terms should reflect the usage of people familiar with the domain of the controlled vocabulary as reflected in literary, organizational, and user warrant (see section 5.3.5).”

Coincidentally and conveniently, the next cross-reference listed above also tells us to see section 5.3.5 regarding “literary, user, and organizational warrant”. There, we find some important advice that’s directly relevant to our topic:

“The process of selecting terms for inclusion in controlled vocabularies involves consulting various sources of words and phrases as well as criteria based on:

  • the natural language used to describe content objects (literary warrant),
  • the natural language of users, (user warrant), and
  • the needs and priorities of the organization (organizational warrant).”

The subsequent subsections go into a bit more detail. Additionally, going back to section 6.6.1, we find that much of the discussion on usage has to do with literary, user, and organizational warrant.

The next cross-reference, to section 11.1.7 (Levels of Specificity), has to do with “specificity or granularity of the terms”. The main piece of advice there is as follows: “The addition of highly specific terms is usually restricted to the core area of the subject field covered by a controlled vocabulary because the proliferation of such terms in fringe areas is likely to lead to a controlled vocabulary that is difficult to manage.” There are other considerations that are not mentioned there, such as the degree of specificity needed to properly index and search for content that is associated with the vocabulary.

The final cross-reference, to section 10.9, is regarding “Relationship with other, related controlled vocabularies”. The title for section 10.9 is “Storage and Maintenance of Relationships among Terms in Multiple Vocabularies”. Much of this section has to do with mapping between vocabularies. “This option requires designating one controlled vocabulary as the master with others as subsidiaries. The goal is to map the terminologies of the various controlled vocabularies to be included against a common classification scheme.” I think that where the term choice element comes into play here is making sure that the “common classification scheme” is complete enough to encompass the concepts represented by all the terms in all the vocabularies.

Mapping does not necessarily involve a subsidiary vocabulary per se. It could involve selected portions of a vocabulary. And the completeness aspect could involve considerations for making linked data effective. Here’s a tiny case study illustrating the need for adding a term to accommodate both:

“An example of the editorial work needed to create truly linked data is the process of mapping the implied conceptual links to actual links. For instance, the nationality/culture controlled list within ULAN should now map to terms in the AAT. While much mapping could be done through algorithms, comparing the ULAN nationality term to AAT terms, it had to be vetted by the editorial staff. Where “East German” was a historical nationality in the ULAN list, it did not exist in the AAT; the term was added to the AAT so that the link could be made.” (Patricia Harpring, “Linked Open Data in the Cultural Heritage World: Issues for Information Creators and Users.” Post on the Council on Library and Information Resources website, March 20, 2014.)
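The algorithmic half of that mapping job can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term lists are tiny hypothetical stand-ins rather than actual ULAN or AAT data, and `find_unmapped` is a name made up for this sketch.

```python
# Hypothetical miniature term lists; the real vocabularies are far larger.
source_terms = ["French", "East German", "Japanese"]
target_terms = {"French", "Japanese", "German"}


def find_unmapped(source, target):
    """Return source terms with no exact match in the target vocabulary."""
    return [term for term in source if term not in target]


# Anything listed here goes to an editor, who either maps it to an
# existing target term or adds a new term to the target vocabulary.
unmapped = find_unmapped(source_terms, target_terms)
print(unmapped)
```

An exact string match is the crudest possible comparison. The point is simply that whatever the algorithm cannot match still needs a human decision, as it did with “East German”.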

There are many other factors to be considered, such as clarity and lack of ambiguity. Those factors, plus the factors mentioned above, along with a goodly amount of common sense, should provide a good foundation for choosing vocabulary terms well.

Barbara Gilles, Communicator

Bob Kasenchak to present as Part of FEDLINK’s Building Web Taxonomies Program

September 14, 2015  
Posted in Access Insights, Featured, Taxonomy

Bob Kasenchak, Director of Business Development for Access Innovations, Inc., will be presenting at the Building Web Taxonomies program, sponsored by the Federal Library & Information Network (FEDLINK). Bob will discuss smart thesauri, how to build and implement them into an existing workflow, and what they can do for analytics and linked data.

FEDLINK conducts an extensive series of programs, workshops, and hands-on training that covers the policy and management of the information industry, as well as resource sharing and technology. The Building Web Taxonomies program will take place on Monday, September 21, from 11:00am to 1:30pm ET at the Madison Building of the Library of Congress. Registration is required for this free program. More information is available by calling (202) 707-4813 (TTY (202) 707-4995).

“Information science is quickly trending toward smart thesauri and ontologies,” Bob remarks. “My presentation will be detailed but not overly technical. I’m excited to be giving this talk to the FEDLINK audience and I hope they come with lots of questions.”

Bob is one of three presenters at the event. The other two will be Lee Lipscomb, Assistant Librarian at the Federal Judicial Center (FJC) in Washington, DC, and Keisha Fourniller, also from FJC. Designed to inform at all skill levels, from the new librarian to the experienced taxonomist, this program will review web taxonomy development basics and ways to improve current taxonomies.  The program will focus on taxonomy development strategy, structure, and designing a smart thesaurus. Following the panel discussion, participants will join a question and answer session.

Bob’s interest in information science began while working at Schwann Publications in the late 1990s.  Publishing a quarterly phone-book-sized classical music catalog featuring carefully controlled synonymic records and standardization of terms suggested the necessity for hierarchical data structures in the service of organizing information about composers and musical works. After a decade studying and teaching music, Bob joined Access Innovations in Albuquerque, New Mexico, as a taxonomist in 2012. Most recently his duties have included experimental business development, data analysis, and product development.


About Access Innovations, Inc.

Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.

Choosing Non-Preferred (and Preferred) Terms for a Thesaurus

September 7, 2015  
Posted in Access Insights, Featured, Taxonomy, Term lists


Squirrels Making Wishes. Photo © Kellyplz | Dreamstime.com

Synonyms and other non-preferred terms are largely what make a taxonomy (or other controlled vocabulary) a thesaurus. They can enrich a vocabulary in a variety of ways. Searches on a vocabulary can take advantage of non-preferred synonyms to direct the search from those words or phrases to the preferred term. More significantly, each non-preferred term can be used as a basis for one or more indexing rules for retrieving information from a database.

For the most part, non-preferred terms are conceptually equivalent (more or less) to their preferred term pairings. The difference, of course, is that the “preferred term” is the one that represents the concept in a thesaurus hierarchy and therefore in the accepted set of words and phrases for use in indexing using that particular thesaurus. (I must confess: I’ve always been a bit bothered by the widespread practice of referring to all the regular terms in a thesaurus as “preferred terms”, whether or not they all have non-preferred pairings of some sort. What exactly is it that the non-paired terms are preferred to? Ah, well.)

The relationship between a regular thesaurus term and any of its non-preferred pairings, or vice versa, is known as an equivalence relationship. The discussion of equivalence relationships in section 8.2 of ANSI/NISO Z39.19-2005 (R2010) (“Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies”) explains: “The relationship between preferred and non-preferred terms is an equivalence relationship in which each term is regarded as referring to the same concept. The preferred term in effect substitutes for other terms expressing equivalent or nearly equivalent concepts.”

So how do we choose those non-preferred terms? Let’s back up slightly. Z39.19’s section 8.2 starts out with this statement: “When the same concept can be expressed by two or more terms, one of these is selected as the preferred term.” Actually, that’s something of an oversimplification, although it’s a plausible scenario. It suggests that we’re starting out by choosing a preferred term from a little collection of candidates; the non-preferred terms must be the ones that are left over after you choose a winner. (This seems more like a matter of rejecting the less fortunate terms than choosing them, doesn’t it?) This could very well be the case if you’re using a bottom-up approach to thesaurus construction, starting with a large assortment of possible terms and then piecing them together into sub-hierarchies.

However, most thesauri are probably constructed with a combination of bottom-up and top-down approaches. On the top-down side of things, you might be crafting the hierarchical structure in a more conceptually oriented way, adding the terms that first come to mind or that are available from your collection of possible terms for the vocabulary’s overall subject area(s). Often, those terms are the ones that best represent the concept for the users of the thesaurus. In that case, choosing non-preferred terms becomes a matter of thinking of or discovering possible synonyms.

Those synonyms should only be ones that might be used in searching the thesaurus or any associated databases; otherwise, you’re cluttering the vocabulary with deadwood. On the other hand, you should cast a fairly wide net, and try to discover or think of as many ways of searching for the occurrence of a concept as practicable.

There is a danger of casting too wide a net. The main danger, perhaps, is that of choosing non-preferred terms that have a wider concept than the preferred term, or that go outside the boundaries of the concept in some other way. This might not be much of a problem in searches within the thesaurus. However, it can lead to inappropriate indexing. For instance, if your main term is Dogs, and you use Canines as one of its non-preferred terms, content that discusses wolves, foxes, jackals, or coyotes is likely to be indexed with the term Dogs, even when there isn’t a hint of a single dog hair in the entire article or whatever.
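To make the Dogs / Canines problem concrete, here is a minimal Python sketch. It assumes a deliberately naive indexer that assigns a preferred term whenever one of its non-preferred terms appears in the text; the lookup table and the `index_text` function are hypothetical, and real indexing rules are usually more discriminating than this.

```python
# Hypothetical equivalence table: non-preferred term -> preferred term.
non_preferred = {
    "canines": "Dogs",        # too broad: also covers wolves, foxes, jackals
    "domestic dogs": "Dogs",  # a true synonym for the concept
}


def index_text(text, equivalents):
    """Return the preferred terms triggered by simple phrase matching."""
    lowered = text.lower()
    return sorted({pref for phrase, pref in equivalents.items() if phrase in lowered})


# An article with no dogs in it at all still gets indexed with Dogs.
article = "Wild canines such as wolves and coyotes hunt in packs."
print(index_text(article, non_preferred))
```

Removing the overly broad entry from the table is enough to stop the wolf article from being tagged.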

Choosing a non-preferred term with a narrower concept than the regular term is perfectly fine, though, as long as that non-preferred term doesn’t match better with a different regular term in the thesaurus. You might want to check the regular term’s narrower terms to see if there’s a better fit somewhere else. Another possibility is to add the narrower concept to the hierarchy as a regular term. One factor in deciding on adding the term is the degree of granularity that you want the thesaurus to have. How detailed should it be for the vocabulary’s eventual users? How many levels deep should it be?

It’s worth the thought and care that it takes to have well-chosen non-preferred terms in your thesaurus. These terms help to make the thesaurus a powerful tool in indexing, information retrieval, and knowledge domain representation.

Barbara Gilles, Communicator

Choosing Broader and Narrower Terms

August 31, 2015  
Posted in Access Insights, Featured, Taxonomy


Photo by Fanghong / CC BY-SA 3.0

As most readers of this blog know, taxonomies are controlled vocabularies in which the terms are arranged hierarchically. Terms representing the broadest concepts are at the top level, and terms representing more specific concepts within those concepts are placed at a deeper level. The result is a vocabulary structure containing increasingly narrower terms, with “narrower” referring to more specific concepts. And where you have narrower terms, you inevitably have broader terms. It’s partly a matter of perspective: Going deeper into a taxonomy, the terms get ever narrower. Turn around and go back up towards the top, and they get ever broader.

These “hierarchical relationships” are discussed in section 8.3 of ANSI/NISO Z39.19-2005 (R2010) (“Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies”), which starts with this statement: “The use of hierarchical relationships is the primary feature that distinguishes a taxonomy or thesaurus from other, simple forms of controlled vocabularies such as lists and synonym rings.” (Regarding the implication that all thesauri are hierarchical, remember that practically all modern thesauri are hierarchical. However, there are other things, such as equivalence relationships and associative relationships, that distinguish them from other taxonomies and from other controlled vocabularies in general.)

Logical hierarchical structure is essential to enable taxonomy users to browse and navigate the taxonomy. In addition to the eventual users of the “finished” taxonomy, these users include taxonomists creating, developing, and maintaining the taxonomy, as well as the indexers who hunt for and apply the most appropriate taxonomy terms for the content being indexed. There are also technical considerations, such as “rolling up” of the narrower terms to their respective broader terms for such purposes as customized RSS feeds. Creating a usable hierarchical structure is partly a matter of choosing good terms for the top levels, to serve (ironically) as a foundation for the structure. After that, it’s largely a matter of choosing good narrower terms for the broader terms, and of choosing good broader terms for the narrower terms.

ANSI/NISO Z39.19 discusses three types of hierarchical relationships: the generic relationship; the instance relationship; and the whole-part relationship. These are simply different aspects of the more general – more specific pairings that are the one overriding principle of hierarchical taxonomy structure.

The instance relationship is straightforward: the narrower term is a specific instance (“instance” being a one-of-a-kind kind of thing) of the broader term. Borrowing from the Z39.19 example:

Mountain regions

. . Alps

. . Himalayas

With whole-part relationship, the name is perhaps self-explanatory. If not, some examples adapted from Z39.19 should be.

Nervous system

. . Central nervous system

. . . . Brain

. . . . Spinal cord


Canada

. . Ontario

. . . . Ottawa

. . . . Toronto

In the generic relationship, the narrower term “is a type of” whatever the broader term represents. For instance, a parakeet is a type of bird. Using the customary visual principle (which, confusingly, not all taxonomies follow) of going from left to right as the concepts get more specific:


Birds

. . Parakeets

The classic test for the appropriateness of a generic relationship is the “all-and-some” test. In the example above, in going from broader to narrower, are some birds parakeets? The answer is yes. Now, going in the other direction, from narrower to broader, are all parakeets birds? Again, the answer is yes. Our example passes the all-and-some test.

Now, let’s mess things up (for illustrative purposes only, of course).


Pets

. . Parakeets

Again, going from broader to narrower, are some pets parakeets? The answer is yes. So far, so good. Heading back up, are all parakeets pets? Not yet. There are still flocks of parakeets out in the wild. (No, “a lot of them are pets” doesn’t count.) The example above fails the all-and-some test.
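For readers who like to see the logic spelled out, the all-and-some test amounts to two set checks. The Python sketch below assumes each concept can be represented by a set of example individuals; the sample sets and the function name `all_and_some` are hypothetical and serve only to illustrate the Birds and Pets examples.

```python
def all_and_some(broader_members, narrower_members):
    """Check a generic relationship using sets of example instances.

    'some': at least one member of the broader class is in the narrower class.
    'all':  every member of the narrower class is in the broader class.
    """
    some = bool(broader_members & narrower_members)
    all_ = narrower_members <= broader_members
    return some and all_


# Hypothetical sample individuals, labeled only for illustration.
parakeets = {"pet parakeet", "wild parakeet"}
birds = {"pet parakeet", "wild parakeet", "sparrow"}
pets = {"pet parakeet", "house cat"}

print(all_and_some(birds, parakeets))  # Birds > Parakeets
print(all_and_some(pets, parakeets))   # Pets > Parakeets
```

The wild parakeet is what sinks the second pairing: it belongs to the narrower set but not to the broader one.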

Most of the mistakes I’ve seen in taxonomies and thesauri have to do with hierarchical relationships that would fail the all-and-some test. It might seem like an academic thing, but what it comes down to is checking the logic and predictability of the hierarchical relationship pathways. A term in the wrong place is likely to be overlooked, and it may cause inappropriate or missed search suggestions, as well as problems with RSS feed content.

With many scholarly and scientific thesauri, the terms aren’t well described by the types mentioned above. With disciplines of study, it’s more a matter of nesting sub-disciplines within disciplines. The all-and-some test may still be useful, but you might need to mentally preface each term under consideration with “the discipline of” or “the study of”.

One more thing: Polyhierarchy is good. If you pass up an opportunity to pair a term with an additional logical broader term only because it already has one, you’re limiting the pathways by which the term can be discovered. And you’re limiting the subject area overview and insight that a more complete set of narrower terms would provide.
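Here is a small Python sketch of what polyhierarchy buys you. It assumes the hierarchy is stored as a simple mapping from each term to its broader terms; the fragment itself, the term “Pet birds”, and the `ancestors` function are all hypothetical.

```python
# Hypothetical fragment: each term maps to its list of broader terms.
broader = {
    "Parakeets": ["Birds"],
    "Pet birds": ["Birds", "Pets"],  # polyhierarchy: two broader terms
    "Birds": ["Animals"],
    "Pets": ["Animals"],
    "Animals": [],
}


def ancestors(term, broader_map):
    """Collect every term reachable by following broader-term links."""
    found = set()
    stack = list(broader_map.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(broader_map.get(current, []))
    return found


print(sorted(ancestors("Pet birds", broader)))
```

With both broader terms in place, content indexed with “Pet birds” can roll up to Birds as well as to Pets. Drop either link and one of those pathways disappears.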

And now a quiz for you: What’s the opposite of logical hierarchy? The answer: Taxonomic anarchy!

Barbara Gilles, Communicator

Data Harmony® v.3.10 Named 2015 Trend Setting Product by KMWorld

Access Innovations, Inc., the industry leader in data organization and innovators of the Data Harmony® software suite, is pleased to announce that KMWorld has named Data Harmony Version 3.10 on their list of Trend Setting Products for 2015.

“It is vital to stay at the forefront of knowledge management and, with Data Harmony v.3.10, we have delivered the most integrated, flexible, productive, streamlined, and user-friendly semantic enrichment software on the market,” notes Marjorie Hlava, president of Access Innovations, Inc. “We will continue developing new and innovative ways to analyze, enhance, and access data to increase findability and distribution options for our customers.”

The proven, patented Data Harmony software is the knowledge management solution to index information resources and, with Version 3.10, has pushed the envelope further, with a more modern graphical look, increased search functionality, auto-complete, color-coding for easier readability, and much more. With these improvements in place, Data Harmony offers a richer, more advanced, and friendlier customer experience.

The Trend Setting Product awards from KMWorld began in 2003. Speaking on behalf of the judging panel, KMWorld Editor-in-Chief Hugh McKellar says, “In each and every case, the thoughtfulness and elegance of the software certainly warrants deep examination. Depending on customer needs, the products on the list can dramatically boost organizational performance.”

McKellar adds, “The panel, which consists of editorial colleagues, market and technology analysts, KM theoreticians, practitioners, customers and a select few savvy users (in a variety of disciplines), reviewed more than 200 vendors, whose combined product lineups include more than 1,000 separate offerings. The products identified fulfill the ultimate goal of knowledge management—delivering the right information to the right people at the right time.”

Data Harmony v.3.10 is available through the cloud, a hosted SaaS version, or an enterprise version hosted on a client’s server. More information about Data Harmony and its 14 software modules is available at


About Access Innovations, Inc.
Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.


About KMWorld
KMWorld is the leading information provider serving the Knowledge Management systems market and covers the latest in Content, Document and Knowledge Management, informing more than 40,000 subscribers about the components and processes – and subsequent success stories – that together offer solutions for improving business performance.

KMWorld is a publishing unit of Information Today, Inc.

Choosing Related Terms

August 17, 2015  
Posted in Access Insights, Featured, Term lists

Source: Dreamstime

Which quiche? You’ll have to read to the end to find out.

As many of our readers are aware, hierarchical thesauri are distinguished from other taxonomies by (among other things) the inclusion of non-hierarchical relationships among terms. One main type of non-hierarchical relationship is the equivalence relationship, which is usually expressed as a preferred term – non-preferred “term” (generally a synonym or quasi-synonym) pairing. The other main type of non-hierarchical relationship is the associative relationship, in which regular thesaurus terms are paired. (Note: I’m using “regular terms” here to refer to what are commonly called “preferred terms”, meaning terms that aren’t non-preferred terms.)

The paired terms are called “related terms”; while terms in a thesaurus can be “related” in various ways, “related terms” are those that carry a reciprocal associative relationship. If Term A has Term B as a related term, then Term B will likewise have Term A as a related term. Good taxonomy management software will be responsive to a taxonomist’s addition of a related term to a term record, and will automatically add the reciprocal relationship to the other term’s term record.
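The reciprocal behavior is easy to picture in code. The Python sketch below is a hypothetical in-memory stand-in for what thesaurus software does with term records; it is not how any particular product works, and the `add_related` function is invented for illustration. The sample pairing is one of the Z39.19 examples.

```python
from collections import defaultdict

# Term -> set of related terms; a hypothetical stand-in for term records.
related = defaultdict(set)


def add_related(term_a, term_b, table=related):
    """Record an associative relationship in both directions."""
    if term_a == term_b:
        raise ValueError("a term cannot be related to itself")
    table[term_a].add(term_b)
    table[term_b].add(term_a)


# The taxonomist adds the relationship once, from one side only.
add_related("Weaving", "Cloth")
print(related["Cloth"])
```

Because both directions are written in one step, the two term records cannot drift out of sync.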

The Z39.19 standard (ANSI/NISO Z39.19-2005 (R2010), “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies”) offers a somewhat vague definition of the associative relationship: “A relationship between or among terms in a controlled vocabulary that leads from one term to other terms that are related to or associated with it.” (Subsection 4.1) As the standard comments later (8.4), “the associative relationship is the most difficult one to define.”

Subsection 8.4 does capture the essence of the associative relationship: “This relationship covers associations between terms that are neither equivalent nor hierarchical, yet the terms are semantically or conceptually associated to such an extent that the link between them should be made explicit in the controlled vocabulary, on the grounds that it may suggest additional terms for use in indexing or retrieval.”

Z39.19 discusses associative relationships between sibling terms, as well as the more common (and perhaps more valuable) associative relationships between terms in different hierarchies within a thesaurus. With the sibling relationships, the conceptual relationship should be stronger than simply being part of the same broader concept; otherwise, there is no point in adding the associative relationship. (I should point out here that there are those who advocate for always adding associative relationships among sibling terms.) Even so, when one browses or navigates a thesaurus, siblings are readily visible. The value of associated siblings is mainly in search.

Associative relationships across hierarchies are another matter. They call attention to terms (and content indexed with those terms) that the thesaurus user or searcher should perhaps be aware of, and otherwise might miss. Some of the example pairings in Z39.19 are weaving and cloth; pathogens and infections; surface tension and liquids; and ducks and rubber ducks. These examples are from Z39.19’s Subsection 8.4.2, which illustrates the various types of associative relationships (such as Process / Agent) listed in a table in section 8.1. While it’s instructive to be acquainted with these types, there’s no need to memorize them or refer to them, unless one is looking for inspiration for adding more “related terms” to a term.

Here are some of the problems I’ve seen with related terms in thesauri:

  • Few or no related terms; not taking advantage of the ability to add related terms, and missing out on the benefits that they can provide.
  • Too many related terms, with the ones that could be valuable getting lost in the mix, and the network of relationships becoming cumbersome.
  • Very vague related terms, with no real value. This often goes hand in hand with too many related terms.
  • Related terms that might better serve as broader or narrower terms, in a hierarchical relationship.
  • And last but not least, terms that would be great as related terms, but that are in an inappropriate hierarchical relationship.

One simple example of the last problem has to do with quiche. As mentioned elsewhere in the TaxoDiary blog: “In a food thesaurus, “Quiche” or “Quiches” would not be appropriate as a narrower term under “Vegetarian foods,” because some quiches contain bacon or ham.”

However, Quiche would make a fine related term for Vegetarian foods, because many quiches are suitable for vegetarian consumption.

So choose wisely, and Bon appetit!

Barbara Gilles, Communicator
Access Innovations, Inc.

Let’s Talk Semantics

August 12, 2015  
Posted in Access Insights, News, semantic

The rapid growth of digital information has been discussed here many times. It has spawned many interesting and creative approaches to information analysis, indexing, retrieval, and findability. Access Innovations has been acutely aware of these needs for quite some time and works to provide comprehensive solutions for its clients. This interesting topic came from Research Information in their article, “From documents to data.”

The world we live in, where millions and billions of documents are searched in a matter of minutes with a simple keyword search, has produced many approaches to managing the data. One approach that has gained a lot of interest is semantic enrichment, the enrichment of content with additional semantic metadata to enable greater understanding of content by machines.

The author of this article reached out to our own Bob Kasenchak, head of product development at Access Innovations, for more information about semantic enrichment. The article outlines the challenges facing data professionals seeking to manage digital resources and is worth your time to read. The situation is becoming more complex and, in many ways, more advanced. New techniques for semantic enrichment are a response to the needs created by these changes.

Melody K. Smith

Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.

Classifying Exoplanets: Where does Earth 2.0 fall?

August 10, 2015  
Posted in Access Insights, Featured, Taxonomy

On the 9th of January, 1992, astronomers around the world rejoiced. For the first time ever, they had definitive proof of a planet orbiting another star. These early observations were extremely limited, mainly focused on noticing the wobble of the parent star, but they opened the proverbial floodgate for exoplanet discovery. A little over two decades later, the list of known exoplanets continues to grow every day, with the number of verified planets well above a thousand, and the number of candidates exponentially more than that. All of these new planets have highlighted the rather humbling fact that, before now, we really knew nothing about the sheer number and diversity of exoplanets within the galaxy.

For a hundred years, it was assumed that other planets would roughly reflect what we see in our own solar system: A number of small rocky worlds close to the star, with a number of large gas giants circling farther out beyond the Goldilocks Zone (where liquid water can exist). The reality is quite different.

We now know, for instance, that planetary migration is common. Large gas planets, which form far away from their parent star (a requirement to keep their gas from becoming heated and stripped away during the planetary formation process), often tend to fall closer to their star over time. These “Hot Jupiters,” as they are called, often orbit extremely close to their star. This tight orbit opens up a world of geologic possibilities for the planet (and its moons). Imagine a planet like Jupiter, which has vast amounts of frozen water. As such a planet drifts closer to its star, the water ice melts and the planet develops oceans deeper than the diameter of Earth. Water worlds like this have often been referenced in science fiction, but now they have become known as scientific fact.

How do we go about classifying these exoplanets when each one illustrates how little we actually know? Pluto was only recently kicked out of the planetary club, with its eviction predicated on our defining some of the most basic aspects of a “planet” (size, gravitational impact, etc.). How do we go about sub-classifying the many exoplanets we’re now beginning to find when we can barely agree on what constitutes a planet in the first place?

Should planets be classified on material composition in accordance with historical precedent (rocky worlds vs. gas giants)? This seemed to be effective at classifying the planets in our solar system, but when we know that gas giants can migrate to extremely close orbits, and their frozen gas compositions can change drastically, does this standard still hold up? The reverse of this is also true as we find an increasingly large number of “Super Earths”, planets that are rocky like Earth and our inner solar system neighbors, but closer in size to the gas giants.

What about the type of star that a planet orbits? One would think that perhaps that could be the common denominator. But now we know that some planets orbit large hot stars, others orbit old cool stars, and some orbit two stars, while some have been flung out of their orbits altogether to float lonely out in space forever. Large planets evicted from their solar systems in this way aren’t even considered planets but are instead considered “sub-brown dwarfs”, somewhere in the gray zone between planets and stars.

As we learned about exoplanets, we quickly realized that we had known practically nothing about them. What we now know has shattered our old classification system. Whatever eventually replaces it will need to be far more sophisticated and take into account the vast diversity we now know exists. The Star Trek dream of discovering Class M planets simultaneously seems further away and closer than ever before, and I for one am eager to see what happens.

Win Hansen, Production Manager
Access Innovations, Inc.

Registration Now Open for the 12th Annual Data Harmony Users Group Meeting

August 3, 2015  
Posted in Access Insights, Featured

Registration is now open for the twelfth annual Data Harmony Users Group (DHUG) meeting, scheduled for February 9-10, 2016 at the Access Innovations, Inc. offices located at 4725 Indian School Road NE, in Albuquerque, New Mexico.

Access Innovations, the company behind the Data Harmony product line, hosts the annual meeting. Case studies presented by Data Harmony software users make up most of the program, and most attendees indicate that these are the primary reason they attend DHUG.

“Based on feedback from the 2015 meeting, we have altered the schedule for DHUG 2016,” said Heather Kotula, Director of Communications at Access Innovations. “We are a client-driven organization and our users’ needs reflect the new arrangements, the same as updates and new features in the Data Harmony software.”

On Monday, February 8, attendees will meet at the Access Innovations office, where introductory training topics will be covered. “Having the training at our office makes it possible for everyone attending to have a hands-on experience with the software,” remarked Win Hansen, Director of Production Services. “Whether they are experienced users, new users, or still considering Data Harmony software for their organization, this approach is the most beneficial.”

Tuesday and Wednesday will feature a combination of case studies presented by users and presentations by Access Innovations and Data Harmony staff. The Welcome and Features Update by Margie Hlava, president of Access Innovations, normally a three-hour presentation on the second day of the meeting, will now be split between the two days. The Welcome portion will be presented on Tuesday morning. The Features Update will be combined with the wrap-up on Wednesday afternoon. Sessions on these two days will be held in the Mega Classroom at NMSU, adjacent to the Access Innovations office.

Thursday, February 11, is a full day of in-depth training, also held in the Access Innovations office. “The segments on this day are geared toward users who will work with the admin module, importing and exporting data, and performing batch processes,” explained Jack Bruce, Senior Taxonomist. “We will also cover information about XIS, our XML database system.”

Meeting registration includes a networking reception at the nearby Hampton Inn on Monday evening and dinner on Tuesday evening. The Tuesday evening dinner will be held at the Unser Racing Museum, which uses modern technologies to educate and immerse visitors in the exciting world of racing.

“By hosting the meeting at the Access Innovations home office, we are able to make the entire staff available to discuss technical and tactical issues,” remarked Bob Kasenchak, Director of Product Development. “It also fosters better communication among the various parties, and we are able to identify and solve issues much faster.”

To register for the meeting, please visit the DHUG registration page.

Data Harmony software users are encouraged to submit case study abstracts to present at the meeting. The submission form uses the Data Harmony Smart Submit software module.

For information about planning a trip to Albuquerque for the meeting, go to


About Access Innovations, Inc.

Founded in 1978, Access Innovations has extensive experience with Internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs. Data Harmony is used by publishers, governments, and corporate clients throughout the world.

Everything Goes Every Place It Fits

July 27, 2015  
Posted in Access Insights, Featured, Taxonomy

The traditional taxonomy is monohierarchical – there is one and only one place for every term. This is only sensible for many purposes; no one wants to look up a book in a library catalog only to find out that it is either in Section A or Section F, and no biologist would ever identify a new species as being both a fungus and an animal. “A place for everything and everything in its place” is the guiding principle of the monohierarchical taxonomy.

But there are times when that guiding principle just isn’t appropriate.

Take synthetic biology, for instance. To some biologists, DNA is just a biopolymer – a biologically created compound made up of distinct smaller compounds. A synthetic biologist wouldn’t be likely to disagree with that characterization; DNA is a biopolymer, so the traditional biological taxonomy still stands. It is also a type of information storage. The traditional biologist wouldn’t disagree with that characterization either, but might question its relevance in building a biology taxonomy. When the synthetic biologist is designing devices that modify DNA strands to store data and other devices to read that data, though, it becomes relevant. DNA is a data medium as well as a biopolymer. It belongs with magnetic tape and optical discs as much as it does with cellulose and starches. In the same way, genetically engineered spider silk is both an animal product and an artificial fiber in equal measure – pigeonholing it into a specific location in the taxonomy keeps it out of an equally good location.

What about household electronics? By 2008, Sony’s Blu-Ray discs had overtaken Toshiba’s HD-DVDs and were clearly going to be the next-generation video format of choice. But Blu-Ray players were still relatively rare and expensive. If you knew where to look, or even to look at all, one of the most economical options for a new Blu-Ray player was the Playstation 3. Unfortunately, even online retailers weren’t marketing the game console as a Blu-Ray player. A monohierarchical taxonomy at the retailer would obviously classify the systems as game consoles – that’s exactly what they were. But they were also functional and affordable Blu-Ray players. There’s no way of knowing for certain, of course, but it is entirely possible that online retailers could have sold even more Playstation 3 consoles if their customers had seen the consoles as an option when searching for Blu-Ray players.

These aren’t isolated situations. Synthetic biology isn’t the only multidisciplinary field. Modern science includes chemical physics, astrophysics, neuroeconomics, and a whole host of other fields that draw from two or more distinct disciplines. The whole point of these multidisciplinary sciences is to study the places where the parent disciplines converge. Technological convergence is a real and growing trend. Microwave televisions may not be the wave of the future, but a smart oven that can bake a pie based on a recipe it downloaded off the internet and then call you on your cell phone when it’s done may be just around the corner; if your next oven comes from a computer manufacturer, where will it be listed in the online catalog?

Luckily, modern taxonomy has a tool to deal with that problem: polyhierarchy. In a polyhierarchical thesaurus, you might still find DNA in the traditional biological place, as a child term of biopolymers, but you might also find it as a child term of storage media. Likewise, you could find that Playstation 3 either by browsing the list of game consoles or by browsing the list of Blu-Ray players. It isn’t suitable for every situation, but it makes for a more flexible thesaurus that provides added value in many circumstances. An article on spider silk is tagged in a way that lets both the researcher interested in animal products and the one interested in artificial fibers know that it may be relevant. An online store search returns the microwave television in a search for either microwaves or televisions. The polyhierarchical thesaurus replaces “A place for everything and everything in its place” with “Everything goes every place it fits.”
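
For readers who think in code, here is a minimal sketch of what a polyhierarchy looks like as a data structure. It is only an illustration, not a description of how Data Harmony or any other thesaurus tool stores terms; the class name Thesaurus and its methods add, browse, and places are hypothetical names chosen for this example. The one idea it shows is that each term keeps a set of broader terms rather than a single parent.

```python
from collections import defaultdict


class Thesaurus:
    """Toy polyhierarchical vocabulary: a term may have several broader terms."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its broader (parent) terms
        self.narrower = defaultdict(set)  # term -> its narrower (child) terms

    def add(self, term, *broader_terms):
        """File a term under every broader term where it fits."""
        for parent in broader_terms:
            self.broader[term].add(parent)
            self.narrower[parent].add(term)

    def browse(self, term):
        """Return the narrower terms listed directly under a term, sorted."""
        return sorted(self.narrower.get(term, ()))

    def places(self, term):
        """Return every broader term under which a term is filed, sorted."""
        return sorted(self.broader.get(term, ()))


# Hypothetical sample data taken from the examples in this post.
thesaurus = Thesaurus()
thesaurus.add("DNA", "Biopolymers", "Storage media")
thesaurus.add("Cellulose", "Biopolymers")
thesaurus.add("Magnetic tape", "Storage media")
thesaurus.add("Playstation 3", "Game consoles", "Blu-Ray players")

print(thesaurus.places("DNA"))              # both places DNA fits
print(thesaurus.browse("Blu-Ray players"))  # the console shows up here too
```

A monohierarchical version would store a single parent per term; swapping that single value for a set is the whole difference, and it is what lets the console turn up under both game consoles and Blu-Ray players.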

Tim Soholt, Webmaster
Access Innovations, Inc.
