Taxonomists like to view a vocabulary as a literary work, which is more artistic when the style is consistent and cohesive. Consistency – which leads to predictability when searching or browsing – also makes it easier to avoid unintentional inclusion of multiple preferred terms for a single concept.

Nouns, Nouns, Nouns

In modern taxonomies and thesauri, each term reflects a single concept or, especially at the top level, a conceptual area that might need to be expressed as two or three closely overlapping concepts.

These single concepts are best expressed in noun form, so use nouns and noun phrases for your terms. Do not use adjectives or adverbs in isolation; “Very” by itself is not a good search term. For verbs, avoid infinitives and participles. Gerunds, such as “Running”, “Reading”, and “Machine learning”, function as nouns (at least in English) and are completely acceptable as taxonomy and thesaurus terms. Otherwise, use the noun form of a verb: “Communication” instead of “Communicate”, and “Administration”, not “Administer”.

Avoid initial articles: use theater, not the theater; state, not the state. However, if an article is part of a proper name, keep it. Examples: Le Mans, El Salvador, The Hague.
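If you clean up candidate terms in bulk, the initial-article rule can be roughed out in a few lines of Python. This is only a sketch: the function name and the exception list are my own illustrative assumptions, and a real vocabulary would need its own hand-maintained list of proper names that keep their articles.

```python
# Hypothetical helper, not part of any taxonomy tool.
INITIAL_ARTICLES = ("the ", "a ", "an ")

# Assumed exception list: proper names whose article is part of the name.
PROPER_NAMES_WITH_ARTICLES = {"The Hague", "Le Mans", "El Salvador"}


def strip_initial_article(term: str) -> str:
    """Drop a leading English article unless the term is a listed proper name."""
    cleaned = term.strip()
    if cleaned in PROPER_NAMES_WITH_ARTICLES:
        return cleaned
    lowered = cleaned.lower()
    for article in INITIAL_ARTICLES:
        # The trailing space in each article keeps "Theater" from matching "the".
        if lowered.startswith(article):
            return cleaned[len(article):].lstrip()
    return cleaned
```

A call such as strip_initial_article("the theater") gives "theater", while "The Hague" passes through unchanged because it is on the exception list.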

Singular versus Plural

In general, unless you are referencing unique items (such as those known by proper names), use the plural form. For “count nouns”, the names of items that you can count (how many telephones, how many desks, how many whatevers), use the plural. For a non-count noun, one for which you ask how much rather than how many (think of “Cash”), use the singular. In short, count nouns are plural and non-count nouns are singular. It is mostly a matter of common sense and what “sounds” correct, but if you are not sure, try asking yourself whether you would say how many or how much of the term in question.

Do not be overly rigorous about applying plural forms wherever possible. Again, use common sense and your knowledge of common usage. Do not change “Water” and “Money” into “Waters” and “Monies.” Abstract terms, such as the “-tions” and “-ities,” should normally stay singular. Be careful with words whose meanings change between singular and plural (art/arts, novelty/novelties, quality/qualities, security/securities, speech/speeches).

There are some other exceptions to the plural rule. In taxonomies and thesauri, terms for parts of the body, including body systems and organs, are customarily given in singular form rather than the usual plural. For example, in an anatomy branch, you might have Ear, with one of its narrower terms being Middle ear. This seems more natural, probably because of the more generic nature of the concepts being represented. “Middle ears” just seems strange.

Unique entries are generally shown in the singular: Big Ben, Fisherman’s Wharf. You will usually know when to use the plural form, especially for specific groups such as Ice Capades and World Famous Lipizzaner Stallions, which are always referred to in the plural.
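The singular-versus-plural rules in this section boil down to a small decision table, which can be sketched in Python as follows. Software cannot reliably decide whether a noun is a count noun, so this sketch assumes the editor has already labeled each candidate and supplied both forms. The class and function names are hypothetical, and proper-named unique entries are left out because they are simply entered as named.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    COUNT = "count"          # "how many?" nouns
    NON_COUNT = "non-count"  # "how much?" nouns and abstract terms
    BODY_PART = "body part"  # customary exception: singular


@dataclass(frozen=True)
class Candidate:
    singular: str
    plural: str
    kind: Kind  # assigned by the editor, not inferred by the code


def preferred_form(candidate: Candidate) -> str:
    """Count nouns take the plural; everything else modeled here stays singular."""
    if candidate.kind is Kind.COUNT:
        return candidate.plural
    return candidate.singular
```

For instance, a candidate labeled Kind.COUNT with the forms “Telephone” and “Telephones” resolves to “Telephones”, while “Middle ear” labeled Kind.BODY_PART stays singular.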

Capitalization

For a proper name, use capitals as appropriate. For other terms, I recommend (and many taxonomists practice) capitalizing just the first letter of the first word (unless there’s a contrary spelling such as with “pH”). If the term is a two-word phrase, like Electrical engineering, capitalize only the first letter of “Electrical”. Yes, ANSI/NISO Z39.19 suggests (but doesn’t insist on) lowercasing all terms. In practice, however, first-letter capitalization is more common and reads more naturally.

Taxonomist Heather Hedden has commented on this issue:

The choice of initial capitalization for a thesaurus … would not be incorrect, and is probably becoming more common, just as initial capitalization is becoming more common in main entries in back-of-the-book indexes.

A “taxonomy” implies a hierarchical classification or categorization of concepts. When we think of categories we think of labels or headings with subcategories. Headings in general tend to have initial capitalization or title capitalization. Thus, if it’s a strictly hierarchical taxonomy, where all terms are interconnected into a single hierarchy or a limited number of hierarchies, then it will more likely have initial capitalization or title capitalization. Such capitalization is particularly common on the relatively smaller/less detailed taxonomies that are proliferating on websites, intranets, and content management systems. It fits in with the web design style of capitalization on headings and categories.

From “Capitalization in Taxonomies,” The Accidental Taxonomist (blog), http://accidental-taxonomist.blogspot.com/search/label/Editorial%20style

Some taxonomies and thesauri use SOLID CAPITALIZATION throughout. However, terms that are solid capitalized are difficult to read, and are downright forbidding to skim through quickly when browsing or navigating a vocabulary. I strongly recommend that you avoid that style.
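For anyone normalizing a term list, the first-letter capitalization recommended earlier in this section might be sketched in Python like this. The function name and the protected-spellings set are assumptions for illustration. The sketch treats any all-capitals word as an acronym and leaves it alone, so a vocabulary in solid capitalization would still need manual review, and proper names inside a term would have to be added to the protected set by hand.

```python
# Assumed list of spellings that must never be recased.
PROTECTED_SPELLINGS = frozenset({"pH"})


def sentence_case(term: str, protected=PROTECTED_SPELLINGS) -> str:
    """Capitalize only the first letter of the first word of a term."""
    result = []
    for position, word in enumerate(term.split()):
        if word in protected or (len(word) > 1 and word.isupper()):
            # Keep protected spellings and acronyms exactly as written.
            result.append(word)
        elif position == 0:
            result.append(word[:1].upper() + word[1:].lower())
        else:
            result.append(word.lower())
    return " ".join(result)
```

With these assumptions, sentence_case("Electrical Engineering") gives "Electrical engineering", and "pH" comes back untouched.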

Acronyms

Acronyms will sometimes be preferred terms, but only when they are well known. Always include the expanded name in one form or another; it could be a non-preferred term in a thesaurus. Laser (Light amplification by stimulated emission of radiation) is a familiar example: almost everyone uses the acronym rather than the expanded name. A similar term is LASIK (Laser-assisted in situ keratomileusis). The preferred term is the term in common usage; we include the expanded name as a synonym so that a search on either form finds the concept. For an acronym, we maintain the capitalization.

Spelling

Aim for consistency in the use of terms that have variant forms. If your vocabulary is in English, decide whether you will use American English or British English spellings for the preferred terms. Include the other spellings as non-preferred terms.

Examples: aluminum/aluminium; fiber optics/fibre optics; call centers/call centres

In other cases in which there are two or more alternate spellings, use the spelling that is most widely recognized, or that is most likely to be favored by users of the vocabulary.
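One way to picture how variant spellings lead back to a preferred term is a simple lookup table, sketched here in Python. The sketch assumes American English was chosen for the preferred terms; the table and the function name are illustrative only and do not reflect how any particular thesaurus software stores non-preferred terms.

```python
# Assumption: American English spellings are the preferred terms.
NON_PREFERRED_TO_PREFERRED = {
    "aluminium": "aluminum",
    "fibre optics": "fiber optics",
    "call centres": "call centers",
}


def resolve(term: str) -> str:
    """Map a variant spelling to its preferred term; other terms pass through."""
    # Lowercase and collapse whitespace so lookups are forgiving.
    key = " ".join(term.lower().split())
    return NON_PREFERRED_TO_PREFERRED.get(key, key)
```

Here a search on “Aluminium” resolves to “aluminum”, while a term that is already preferred, such as “fiber optics”, is returned as is (in lowercased form).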

The Little Things (Commas, Hyphens, Apostrophes, and Parentheses)

In general, you should avoid including those little things listed above in your terms, unless they are necessary for correct spelling (which, of course, they sometimes are). They can complicate search, depending on your software system. Also, applying a conscious decision to exclude those things where there is an option helps to make the overall terminology more consistent and predictable.

One or more commas in a term is a red flag that two or more concepts have been combined. (One logical exception would be a proper name with commas, such as the Bureau of Alcohol, Tobacco, Firearms and Explosives.) Unless the term in question is a top term, this could be counter to the ideal approach of one term per concept, and one concept per term.

Parentheses in terms contradict the natural language approach. However, sometimes text in parentheses is valuable for clarifying the meaning of a term. Use a parenthetical qualifier, such as in “Binary systems (chemistry)”, at the end of a term if you must. Do not use it if you do not need to.
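The punctuation guidelines in this section lend themselves to a quick review pass over a term list. The Python sketch below only raises flags for a human editor to consider; it does not change any term. The function name and the flag wording are my own assumptions, and the check for a trailing qualifier is a deliberately simple pattern.

```python
import re
from typing import List

# Matches a term that ends in a single parenthetical qualifier,
# as in "Binary systems (chemistry)".
TRAILING_QUALIFIER = re.compile(r"^[^()]+ \([^()]+\)$")


def punctuation_flags(term: str) -> List[str]:
    """Return review notes for commas, parentheses, hyphens, and apostrophes."""
    flags = []
    if "," in term:
        flags.append("comma: possibly two or more concepts combined")
    if "(" in term or ")" in term:
        if TRAILING_QUALIFIER.match(term):
            flags.append("parenthetical qualifier: confirm it is needed")
        else:
            flags.append("parentheses: not a trailing qualifier")
    if "-" in term or "'" in term or "\u2019" in term:
        flags.append("hyphen or apostrophe: keep only if the spelling requires it")
    return flags
```

A clean term such as “Water” returns an empty list, and “Binary systems (chemistry)” returns a single note asking the editor to confirm that the qualifier is needed.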

I will have some more comments on parenthetical qualifiers next time.

Marjorie M.K. Hlava, President, Access Innovations

This posting is one of a series based on a workshop, “Thesaurus Creation and Management,” that Marjorie Hlava presented in December of 2012.