Enabling subject browsing and improving search through taxonomies and semantic tagging of articles are two of the most popular reasons publishers invest in semantic technology, but the potential benefits are not limited to these two applications. In a panel discussion at the SIIA Information Industry Summit in January, executives from some of the most prominent companies in the information industry shared their key reasons for investing in semantic technology:

  • Brad Allen, Elsevier: “Enable more machine/machine apps – let the system do the heavy lifting.”
  • Kevin Jiang, Thomson Reuters: “Enhance discovery in more and faster ways than search alone.”
  • Francois Ragnet, Xerox: “Bridge the gap between technical jargon and consumer language.”
  • Adriaan Bouten, McGraw-Hill: “Provide a richer product by connecting the dots for the user.”

Over the last few years, interest in semantic technologies has grown among publishers of all types. In the STM area, Elsevier took a leadership position in this effort with the release of a number of APIs that leverage the Scopus taxonomy and other metadata to enable links and interactivity with external content. 

In consumer media, the NY Times continues to expand its open platform solutions, offering tagged content to developers of new applications. The key drivers for these investments, as well as for other semantic initiatives within the publishing world, are to: 1) stimulate awareness and new uses for content assets, leveraging the “long tail” in proactive ways rather than simply relying on web search and traditional channels of distribution; and 2) create unique value within digital collections to offset the reduction in overall revenue as consumption of print declines.

Semantic enrichment within publisher sites can improve the user experience in many ways. A thorough exploration of these features, in the context of a scientific research paper, was documented by David Shotton and his team from the University of Oxford in a 2009 article in the journal PLoS Computational Biology. Semantic features included markup of textual terms referring to people, places, taxonomy terms, organisms, etc.; live DOI links to relevant third-party information resources; and interactive tables and figures. In popular and consumer media, links from personal and company names to Wikipedia and DBpedia and mashups with Google Maps are becoming common features of many sites.

Access Innovations has developed web services and APIs to embed links within the text of articles, giving publishers a number of ways to enrich the user experience and drive traffic to other high-value content. In addition to linking taxonomy references within the text, links can be based on authority files of place names, people, organisms, and chemical formulas.  Links can also go both ways, enabling the linking of documents across different content types. Thus, a user who visits a site to read an article can also be made aware of closely related events, products, and topics of interest. The result is a richer site experience, a more engaged user, and additional revenue opportunities for the publisher.
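
To make the idea of authority-file linking concrete, here is a minimal sketch in Python. It is not the Access Innovations or Data Harmony API; the function name embed_links, the AUTHORITY dictionary, and the example.org URLs are all hypothetical, chosen only for illustration. It assumes the authority file has already been reduced to a simple mapping from term to target URL, and it wraps each whole-word match in an HTML anchor.

```python
import html
import re

# Hypothetical authority file: term -> target URL (illustrative values only).
AUTHORITY = {
    "Escherichia coli": "https://example.org/organisms/escherichia-coli",
    "Oxford": "https://example.org/places/oxford",
    "semantic enrichment": "https://example.org/topics/semantic-enrichment",
}


def embed_links(text, authority):
    """Return HTML in which each authority-file term found in text is linked."""
    if not authority:
        return html.escape(text)

    # Try longer terms first so multi-word entries win over shorter overlaps.
    terms = sorted(authority, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    # Case-insensitive lookup from matched text back to its URL.
    lookup = {term.lower(): url for term, url in authority.items()}

    parts = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the unlinked text between matches.
        parts.append(html.escape(text[pos:match.start()]))
        url = lookup[match.group(0).lower()]
        parts.append(
            '<a href="{}">{}</a>'.format(
                html.escape(url, quote=True), html.escape(match.group(0))
            )
        )
        pos = match.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(embed_links("Researchers in Oxford studied Escherichia coli.", AUTHORITY))
```

A production service would likely also handle term variants, disambiguation, and existing markup in the article; this sketch only shows the basic lookup-and-wrap step.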

Bert Carelli
Vice President, Business Development
Access Innovations/Data Harmony