Nobody is going to deny that publishing is, and always has been, a sometimes messy process, but sophisticated uses of metadata and taxonomies can help clean it up. It fascinates me how intimately metadata can work in every step of the process to make it easier on everybody, from the author writing the piece to the institution that publishes it, all the way to its marketing and use.

Let’s start at the beginning, with the writer. Presumably, the writer is an expert in the field, or at least working toward becoming one, but that absolutely doesn’t make them an expert in searching for the information they need. That’s what has always made library science so valuable, and while it’s still extremely valuable (don’t want to offend my librarian friends out there), the rise of enriched metadata means that the content writers need for their research can be laid out clearly and concisely in front of them. This allows them to work in a noise-free environment and produce their best possible work.

So they’ve done all that and it’s time to submit the work to publishers. As we’ve seen, this can be an ordeal, but semantically enriched content, once again, can ease the process for both the author and the publication. Once the submission is tagged with relevant thesaurus terms, it can be analyzed to identify its subject, then more easily sorted and sent to properly qualified experts in the field for peer review. This might seem like a small part of the process, but any amount of time saved is a big benefit to the author, who is often under the crushing weight of tenure deadlines.
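To make that tagging-and-routing step a little more concrete, here is a minimal sketch of how it might look in Python. Everything in it is an assumption for illustration: the tiny thesaurus, the reviewer pool, and the function names `tag_submission` and `route_to_reviewers` are all hypothetical, and simple phrase matching stands in for whatever analysis a real indexing system performs.

```python
import re

# Hypothetical mini-thesaurus: preferred term -> variants that map to it.
THESAURUS = {
    "machine learning": ["machine learning", "neural network", "deep learning"],
    "oncology": ["oncology", "tumor", "cancer"],
    "genomics": ["genomics", "genome", "dna sequencing"],
}

# Hypothetical reviewer pool, each with declared areas of expertise.
REVIEWERS = {
    "Reviewer A": {"oncology", "genomics"},
    "Reviewer B": {"machine learning"},
    "Reviewer C": {"machine learning", "oncology"},
}


def tag_submission(text):
    """Return the preferred terms whose variants appear in the text."""
    lowered = text.lower()
    tags = set()
    for preferred, variants in THESAURUS.items():
        for variant in variants:
            # Word-boundary match so a variant is not found inside a longer word.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                tags.add(preferred)
                break
    return tags


def route_to_reviewers(tags):
    """Rank reviewers by how many of the submission's tags they cover."""
    scored = [(len(tags & expertise), name) for name, expertise in REVIEWERS.items()]
    # Best overlap first, ties broken by name; drop reviewers with no overlap.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]


abstract = "We apply a neural network to classify tumor images."
tags = tag_submission(abstract)
reviewers = route_to_reviewers(tags)
print(sorted(tags))  # ['machine learning', 'oncology']
print(reviewers)     # ['Reviewer C', 'Reviewer A', 'Reviewer B']
```

The point of the sketch is that the abstract never uses the words "machine learning" or "oncology," yet the thesaurus maps its wording to those preferred terms, and the reviewer whose expertise covers both lands at the top of the list.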

Once the author’s submission is out the door and in the hands of peer reviewers, it goes through its revision process, sent back and forth to get everything squared away. This, of course, can take a long time, but once the work is ready for publication, metadata begins to take on its most important role. Those same (or similar) subject terms that helped direct the submission into peer review now help make certain that it is directed to the most relevant possible journal, ensuring that the right people can easily find it.

This is the point at which, with the right tools and the right people in place, the metadata can really shine, because there’s so much that can be done with it. Once an article is published, either in an open access journal like PLOS ONE or a more traditional subscription journal, its metadata can be used for an increasing number of purposes, anything from simple organization to highly advanced linked data.

Whatever that data is used for, the most important thing is that the content can be found. Everything else is useless if the content sits in the ether, hidden so nobody can read it. And as is likely fairly clear by now, the metadata is absolutely crucial at this end stage, where other researchers need to locate the content to conduct their own work. Just as the original authors needed clear, concise search results when their process started, these new researchers will have a much harder time finding the necessary content if their results are muddled with irrelevant hits and noise, let alone if a relevant result gets missed completely. This can prevent authors’ work from reaching the people who require it and keep it from furthering work in the field.
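A toy comparison shows both failure modes, the noisy hit and the missed result. This is only a sketch under stated assumptions: the three article records, their subject terms, and the functions `keyword_search` and `subject_search` are hypothetical, and real search systems are considerably more sophisticated than a substring match.

```python
# Hypothetical records: free text plus subject terms assigned at indexing time.
ARTICLES = [
    {
        "title": "Tumor growth in mice",
        "text": "We measure tumor growth rates.",
        "subjects": {"oncology"},
    },
    {
        "title": "Cancer screening outcomes",
        "text": "Cancer screening reduced mortality.",
        "subjects": {"oncology"},
    },
    {
        "title": "The Cancer constellation",
        "text": "Cancer is a faint constellation of the zodiac.",
        "subjects": {"astronomy"},
    },
]


def keyword_search(query):
    """Plain text matching: return titles whose text contains the query."""
    q = query.lower()
    return [a["title"] for a in ARTICLES if q in a["text"].lower()]


def subject_search(term):
    """Metadata matching: return titles tagged with the given subject term."""
    return [a["title"] for a in ARTICLES if term in a["subjects"]]


keyword_hits = keyword_search("cancer")
subject_hits = subject_search("oncology")
print(keyword_hits)  # ['Cancer screening outcomes', 'The Cancer constellation']
print(subject_hits)  # ['Tumor growth in mice', 'Cancer screening outcomes']
```

The keyword search pulls in the astronomy article as noise and misses the tumor study entirely, while the subject search returns exactly the two oncology papers.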

That’s counterproductive to research, obviously, but it’s also totally unnecessary. It shouldn’t take much to get people to see how this kind of metadata enrichment can make authors’ and publishers’ lives easier. It’s relatively new and there are a lot of buzzy words attached to it, but that doesn’t change the value of the core concept.

The good news is that semantically enriched metadata is starting to show up all over the place. Software like Data Harmony from Access Innovations automates much of this tagging to help academic journals and institutions facilitate research. The pile of metadata is already gigantic, so it’s vital that the new content that journals are constantly publishing gets analyzed and tagged swiftly and accurately.

To me, the furthering of research is the most important thing, but there is another step in the process, that of marketing and sales. It’s the same principle as with everything else here: you can’t buy what you can’t find. The place with the clearest inroads to the content the consumer is looking for will be the one that wins. But the truth is that the sooner people adopt the ideas behind semantically enriched metadata, the sooner we all win.

Daryl Loomis

Access Innovations