Access Insights

Maintaining a Thesaurus in an Excel Workbook, Part 2

In Part 1, we looked at maintaining a taxonomy in Excel: a set of preferred terms arranged in a hierarchy. This taxonomy structure is a handy way to organize a group of terms and can be used across an industry for benchmarking or reporting requirements (see “Strategies for Incorporating Data Exchange Standards in E-Business Taxonomies,” which advocates for this in the construction industry, and “The IFRS Taxonomy,” which includes the labels used in the International Financial Reporting Standards). Excel works quite well for creating and maintaining a taxonomy, but how about a thesaurus?

Natural Language Processing Only Goes So Far

May 7th, 2012 | Access Insights, Featured, search, semantic

The siren call of natural language processing has sounded again. In this study, the researchers compared free-text searches with administrative codes to see which would give a better indication of safety based on 20 indicators. The authors rightly suggest that, instead of relying only on the notoriously poor check boxes used with discharge orders, the handwritten notes from the discharge nurse or physician might be much more instructive.

The DSM-5 Draft: Half-Baked Meatloaf?

For a short time in the 1990s, I helped a small company develop proposals for providing mental health services. My desk housed two standard references: the Chicago Manual of Style and the fourth edition of the Diagnostic and Statistical Manual (DSM-IV) of the American Psychiatric Association (APA). The DSM is essentially a classification system, like a taxonomy, providing a structured list of the psychiatric diagnoses used by mental health practitioners.

Access Innovations, Inc. Creates Taxonomy For Iowa Code, Administrative Code, and Acts

April 23rd, 2012 | Access Insights, Featured, Taxonomy

Access Innovations, Inc., a leader in the data management industry, has collaborated with the Iowa Legislative Services Agency to build a customized thesaurus that allows the Iowa Legislature General Assembly to easily access its extensive legal body of existing and proposed laws, bills, acts, and regulations by using controlled, vocabulary-driven indexing in addition to published indexing codes.

Maintaining a Thesaurus in an Excel Workbook

April 16th, 2012 | Access Insights, Featured, Taxonomy, Term lists

There’s been some discussion recently in the Taxonomy Community of Practice LinkedIn group about free or low-cost thesaurus management software. I’ve noticed a dearth of postings about using Excel, a very popular tool, particularly if you already have a Microsoft Office license. Experts disparage Excel as a thesaurus tool, but it can provide a way to start your thesaurus development. And if you are mindful of organizing your Excel worksheet so that its data can be imported later into a dedicated tool, you can achieve some important objectives. Excel is indeed the most popular thesaurus management tool (see the “Taxonomy & metadata strategies for effective content management” workshop slides, in which taxonomy expert Joe Busch reiterates this).

Leveraging Your Taxonomy – Part 10 (Taxonomies in SharePoint)

I hope this series on search has been helpful to users and professionals alike. Let’s close with a look at taxonomies in SharePoint by viewing the data flow in another way. We have incoming information that we are going to put into a repository. We need to add metadata to that repository, and we want to add taxonomy terms. The taxonomy terms all need to be controlled or suggested, so there is a back end to do that. Once we have the data in that repository, it could be exported to a SQL or other relational database, or to a transactional system for e-commerce. It might be put into a repository so that the full displays can be done. It might be loaded into a search system, and you also might have a presentation layer for display.

Leveraging Your Taxonomy – Part 9

April 2nd, 2012 | Access Insights, Featured, search, Taxonomy

As we continue the series on search, we are close to wrapping up with a more in-depth look behind the scenes of database management systems. We want to connect the database management system to the thesaurus tool so that we can validate the terms and make sure that they are in good shape. As people add records to the database, if they have any suggestions or candidate terms, we want to capture those as well. The thesaurus tool will tell you which terms are actually correct, allow you to add, change, and delete terms, and otherwise manage the term base. The indexing tool is then used to suggest indexing terms for records as they are loaded into the database management system. That system can be SharePoint, a content management system such as Documentum or FileNet, or anything else you want to use as a repository to manage your data. All of that is driven by the taxonomy.
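The validation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `validate_terms`, the dictionary layout (lowercased entry term mapped to preferred term), and the sample terms are all hypothetical and do not represent any particular thesaurus tool or repository API.

```python
def validate_terms(proposed_terms, thesaurus):
    """Split proposed indexing terms into accepted and candidate terms.

    `thesaurus` is a hypothetical dict mapping a lowercased entry term
    (preferred term or synonym) to its preferred term.
    """
    accepted, candidates = [], []
    for term in proposed_terms:
        key = term.strip().lower()
        if key in thesaurus:
            # Known term: store the preferred form with the record.
            accepted.append(thesaurus[key])
        else:
            # Unknown term: hold it as a candidate for editorial review.
            candidates.append(term.strip())
    return accepted, candidates


# Hypothetical term base: one synonym ("taxonomy") points to a preferred term.
thesaurus = {
    "taxonomies": "Taxonomies",
    "taxonomy": "Taxonomies",
    "indexing": "Indexing",
}

accepted, candidates = validate_terms(
    ["Taxonomy", " indexing ", "folksonomies"], thesaurus
)
```

Here the first two terms are accepted in their preferred form, and the third is set aside as a candidate rather than being written to the record unreviewed.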

Leveraging Your Taxonomy – Part 8

March 26th, 2012 | Access Insights, Featured, search

As we continue the series on search and how it works, we are looking more closely at file indexes, specifically complex inverted file indexes. Stemming is de-pluralization or the removal of endings such as the gerund ‘-ing’. It is often loosely called lemmatization, although strictly speaking lemmatization maps a word to its dictionary form rather than just trimming the ending. Truncation, both left and right, is a popular part of search. Right-hand truncation, chopping a word off at its end, is pretty easy. Take the word ‘organization’, which can be spelled with either an ‘s’ or a ‘z’, depending on where you are from: with a wild card, the ‘-ation’ can be chopped off pretty easily. Left-hand truncation is hard because, to match on the ending when the beginning of the word is unknown, I have to build an entire additional index, starting with o, or, org, and so on, so that I can go through all of those to see where the full extension is. When people do left-hand truncation, it is a lot more expensive. It is a much bigger, additional index.
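To make the cost difference concrete, here is a minimal Python sketch. It assumes one common way of supporting left-hand truncation, a second index of reversed terms; the post does not say which method any given search engine uses. The term list and function names are hypothetical.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return every term in a sorted list that starts with `prefix`."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
    return matches


# Hypothetical vocabulary for illustration only.
terms = ["organisation", "organization", "organize",
         "nation", "station", "thesaurus"]

# Right-hand truncation needs only the ordinary sorted index.
forward_index = sorted(terms)

# Left-hand truncation needs an additional index of reversed terms.
reversed_index = sorted(term[::-1] for term in terms)


def right_truncation(stem):
    """Find terms beginning with `stem` (e.g. organi*)."""
    return prefix_matches(forward_index, stem)


def left_truncation(ending):
    """Find terms ending with `ending` (e.g. *ation)."""
    hits = prefix_matches(reversed_index, ending[::-1])
    return sorted(hit[::-1] for hit in hits)
```

With this sketch, `right_truncation("organi")` reuses the index that already exists, while `left_truncation("ation")` works only because a second, equally large index was built and has to be maintained alongside the first.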

Leveraging Your Taxonomy – Part 7

March 19th, 2012 | Access Insights, Featured, search, Taxonomy

This is the next piece in our series of blog posts on search and how it works. Next, let’s look at an inverted file index. Let’s pretend that this is the outline of the presentation: I have Define Key Terminology, Thesaurus Tools, Functions, Features, Class, Construction of the Thesaurus, etc., in the figure below. You can see that the word “Thesaurus” is used three times here. I have a number of other words that you might focus on to see where they are. If I am going to take these and make them into an inverted file, the simple inverted file index is just going to take them and make them into an alphabetic list. So it will sort the high ASCII characters first (the special characters and the numbers), and then it will sort the rest of them alphabetically.
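A simple inverted file index of this kind can be sketched in Python as follows. The outline entries are a hypothetical reading of the headings named above (the original figure is not reproduced here), and `build_inverted_index` is an illustrative name, not part of any search product.

```python
from collections import defaultdict


def build_inverted_index(entries):
    """Map each lowercased word to the sorted entry numbers containing it."""
    index = defaultdict(set)
    for entry_no, text in enumerate(entries, start=1):
        for word in text.lower().split():
            index[word.strip('.,;:"')].add(entry_no)
    # Plain sorted() orders keys by character code, which yields the
    # alphabetic list for lowercase words.
    return {word: sorted(nums) for word, nums in sorted(index.items())}


# Hypothetical outline entries, numbered from 1.
outline = [
    "Define Key Terminology",
    "Thesaurus Tools",
    "Functions, Features",
    "Construction of the Thesaurus",
]

index = build_inverted_index(outline)

# Character-code ordering puts special characters and digits ahead of letters.
mixed_order = sorted(["thesaurus", "2012", "#tag"])
```

Looking up `index["thesaurus"]` returns the entry numbers where the word occurs, which is the whole point of inverting the file: the list is organized by word rather than by document.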