A Newbie’s Guide to DHUG Meetings

The biggest week of the year at Access Innovations is almost upon us. Every year, we present the Data Harmony Users Group (DHUG) meeting, where our esteemed clients come from all over the nation to meet and learn from the people who built the software and use it on a daily basis. Right about now, the office starts to buzz. A lot of people are coming to Albuquerque for this, and everyone here is excited to swap ideas with them, because they’ve come up with some interesting uses of our software that have made us better in the process.

Now, I haven’t been in the taxonomy game as long as most of the people here, so DHUG meetings, like their attendees, are brand new to me. I don’t know exactly what to expect, but there are some things that I’m definitely looking forward to seeing. The workshops that we’re conducting for the attendees will certainly be interesting and informative for a newbie like me, but what I’m anticipating most are our guest speakers. These are people with different perspectives who are removed from the office echo chamber, which helps breathe fresh life into taxonomies.

This year, we have great guests who are gracious enough with their time to discuss their experiences with Data Harmony software and how they use it within their organizations. Its applications are broad, and each case study is unique, so what sorts of things am I going to hear about?

Kicking off these case studies are Sharon Garewal and Ron Snyder from JSTOR, one of the largest and most respected shared digital libraries in the world. We’ve done a lot of work with them and, this year, they’re launching the JSTOR Sustainability Collection and discussing it at DHUG 2015.

This interdisciplinary collection is composed of journals, reports, and working papers from the realms of academic publishing, scholarly societies, industry groups, research institutes, and universities. Together, they look at how the environment, human activities, and industry can be made sustainable in the long term. This has become an increasingly important issue, and the speakers will discuss how the JSTOR Thesaurus, which was built using Data Harmony, makes crossing through the many fields of study a fairly straightforward affair.

One of the really interesting things about taxonomies and modern data analytics is how indexing can be leveraged to see information that would have taken a mountain of time and effort to figure out before. That’s precisely what Helen Atkins of the Public Library of Science (PLOS) will discuss with attendees in her talk, “The Fate Predictor Project.”

PLOS ONE, their international, open-access, online journal, has recently semantically enriched its content. Using the metadata that was extracted, they were able to see statistics about acceptance and rejection of papers. Using this data, along with data about country of origin, author, number of authors, etc., they are able to predict with accuracy whether a currently submitted paper will get accepted or rejected. That doesn’t take away the need for peer review, but knowing what kinds of things are often flagged for rejection could save the PLOS editors huge amounts of time.

This is just one example of how sophisticated data usage can open eyes to otherwise unseen patterns. Marketing companies use it to see buyer patterns, leading to all those advertisements directed to individuals. This is how the Internet of Things will work, so that your refrigerator knows what resides inside and for how long, and can recommend recipes, keep your shopping list, and tell you when your milk has gone sour. Maybe its biggest current application is in security, where it’s being used in myriad overt and covert ways. This is right in line with the kind of semantic enrichment that Access Innovations does.

The talk that I’m most interested in will come from Kevin Ford of MarkLogic. His presentation, “Implementation of Taxonomy Triples from Data Harmony Exports,” will explore how companies can convey more accurate information, make data-driven decisions, and reduce risk by taking content from documents and data and combining them with RDF triples into a single architecture. By enabling search across different kinds of information from many sources, this kind of architecture can help users glean greater insights, and will help customer bases quickly and accurately mine knowledge from the data.

Ontologies are taking an increasingly prominent place in the world of semantics, and many believe that their use will take a big step toward genuine artificial intelligence. How far off that might be is certainly up in the air, but it’s presentations like this one that will start to reveal how it might work, if not when it might work.

These aren’t the only presentations at DHUG 2015. There will be more case studies from our users, as well as panels by the highly knowledgeable staff from Access Innovations. Those, in conjunction with meeting new people over great food and conversation, are going to make February 16-20 a pretty great week.

Daryl Loomis
Access Innovations
