- The Demise of the “Physical Book” has been exaggerated
- The Information Industry has no single central meeting
- Taxonomies are used in websites but not in search
- Analytics are booming
- Changing information landscape
- Google Scholar
- Semantics, Linked Data, and the cloud
- Boomers are not retiring
The weather in Albuquerque is cold and overcast with a chance of snow for Christmas day: great weather to reflect on the trends we saw in 2015 and what’s ahead for 2016. This review focuses narrowly on the things that affect Access Innovations’ view of the world and where they fit within the information industry we enjoy.
- The Demise of the “Physical Book” has been exaggerated: print is on the rise
At Techonomy 2010, Nicholas Negroponte famously announced that “The physical book is dead.” In fact, five years later, e-book sales are declining, while print books have seen an incredible double-digit increase this year. Some surprise sleepers leading the sales include adult coloring books (I just got a couple myself!).
- The Information Industry has no single central meeting or gathering point
Associations are an important part of our professional engagement, and they are often learned publishers as well, which makes them our customers. Professional associations are under many different kinds of pressure. Attendance is down at library meetings but booming in technical fields. Formerly crucial conferences, such as International Online (dead) and SLA (withering), no longer gather all members of the industry in a single spot.
As this market has fragmented, we find ourselves attending more, not fewer, meetings, an alphabet soup of smaller gatherings: STM, ALPSP, PSP, SSP, CES, CESSE, NFAIS, SIIA, etc. InfoToday meetings are split into a confusing array of verticals, such that you don’t even know who else is there. Other conferences, like DataVersity, Predictive Analytics, and SemTechBiz, seem to change their names yearly; it’s hard to tell whether it is the same group of attendees.
- Taxonomies are used in websites…but not in search
Metadata needs to be included in the fundamental design for search to work really well.
Search has been king…but is back to being a pauper. Precision and recall gave way to “relevance” (defined as “my guess that this result is what you want returned!” or, more cynically, “This is what we think is best for you”). Those who work on the search kernel fundamentally cannot believe that words should have controls added to disambiguate them. Search should be smart enough to handle word differences, but without taxonomies and other vocabulary control options it is not.
The Dialog search system was optimized for metadata from the ground up, but implementations of, for example, Lucene usually add a taxonomy after the fact, when the implementation is already done. The taxonomy, the inclusion of which was likely insisted upon by the librarian or webmaster, is just an annoyance to be dealt with.
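To make the precision and recall argument concrete, here is a minimal Python sketch that compares plain keyword matching with matching through a small controlled vocabulary. Everything in it is invented for illustration: the four documents, the synonym table, the relevance judgments, and the function names are assumptions, not a description of any real search engine.

```python
# Hypothetical mini-collection; all data here is invented for illustration.
DOCS = {
    1: "heart attack risk factors in adults",
    2: "myocardial infarction outcomes after surgery",
    3: "cardiac arrest response times",
    4: "attack surface of web applications",
}

# Hypothetical controlled vocabulary: variant phrase -> preferred term.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
}

# Documents a human indexer judged relevant to the query "heart attack".
RELEVANT = {1, 2}


def keyword_search(query):
    """Return ids of documents containing any query word (no vocabulary control)."""
    words = query.lower().split()
    return {doc_id for doc_id, text in DOCS.items()
            if any(word in text.split() for word in words)}


def controlled_search(query):
    """Map the query to its preferred term, then match any variant of that term."""
    preferred = SYNONYMS.get(query.lower())
    if preferred is None:
        return set()
    return {doc_id for doc_id, text in DOCS.items()
            if any(variant in text and term == preferred
                   for variant, term in SYNONYMS.items())}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(precision_recall(keyword_search("heart attack"), RELEVANT))
print(precision_recall(controlled_search("heart attack"), RELEVANT))
```

In this toy collection, keyword matching returns documents 1 and 4 (precision 0.5, recall 0.5), because “attack” also matches the security article and the synonym “myocardial infarction” is missed. The controlled version returns documents 1 and 2 (precision 1.0, recall 1.0).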
Web interfaces are doing an increasingly good job of mitigating the need for training to effectively use their underlying systems. Often, parametric or fielded search options are included in the facets right in the web interface. Most of the well-fielded search implementations are on SQL, Oracle, or Endeca platforms. Search is still broken; it still needs metadata.
Search is still lost in general. It is incredibly difficult to get those who build the search kernel to believe that vocabulary control is needed and in fact produces measurably better search results; it goes against the core training they get in computer science school. So the work-around ends up being to embrace the taxonomy on the web site.
A few scholarly publishing and research sites actually use taxonomy terms in the search itself (for both type-ahead and the first inverted file search) on the technical side. Those sites are stickier: clients love them, as they deliver precise results without having to go to Google Scholar to find the papers they want. Of course, Google Scholar itself does not have much of a taxonomy: just 260 terms to cover the world of content on the site, and it is amazingly well hidden from view.
A growing number of scholarly and research web sites are leveraging taxonomies to excellent effect. Some of our clients’ sites include:
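As a rough illustration of taxonomy-driven type-ahead, the Python sketch below suggests preferred terms as the user types, matching on either the preferred label or a synonym. The taxonomy fragment and the function names are hypothetical, chosen only for this example.

```python
# Hypothetical taxonomy fragment: preferred term -> non-preferred synonyms.
TAXONOMY = {
    "Machine learning": ["Statistical learning"],
    "Magnetic resonance imaging": ["MRI", "Magnetic resonance tomography"],
    "Mass spectrometry": ["Mass spec"],
}


def build_lookup(taxonomy):
    """Index every label (preferred or synonym) in lowercase, pointing at its preferred term."""
    lookup = {}
    for preferred, synonyms in taxonomy.items():
        for label in [preferred, *synonyms]:
            lookup[label.lower()] = preferred
    return lookup


def type_ahead(prefix, lookup, limit=5):
    """Suggest preferred terms whose label or synonym starts with the typed prefix."""
    prefix = prefix.lower().strip()
    if not prefix:
        return []
    matches = {preferred for label, preferred in lookup.items()
               if label.startswith(prefix)}
    return sorted(matches)[:limit]


LOOKUP = build_lookup(TAXONOMY)
print(type_ahead("ma", LOOKUP))
print(type_ahead("mri", LOOKUP))
```

Typing “mri” suggests “Magnetic resonance imaging”, so the query that reaches the inverted file uses the same term the indexers applied to the content.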
- Analytics are booming
Predictive text and search analytics are rapidly moving fields. Analytics help corral the onslaught of information (“big data” in some cases) and translate the results into visualizations that are easy to understand, combining trends and diverse data sets into meaningful, actionable data. End users are using linking and dashboard tools to uncover new trends, and content providers use them for market research and management.
Included in the area of analytics is data mining, which brings large bodies of information into focus but also raises the often conflicting issues of privacy, security, and freedom of expression; for example, balancing the detection of signs of terrorism with common sense. Everything is taken to its logical extreme; moderation and goodwill are lost.
- Changing information landscape
Libraries are disappearing, information storehouses and archives are appearing, corporate information is still hard to find, and knowledge walks out the door when employees leave. In the meantime, we can look up anything during a dinner conversation (just ask Siri, Cortana, Alexa, or Google!). The need for printed references is declining, as Siri (and her companions) can find things on the web and point you to valid links.
From 1970 to 1995, we attended many meetings that focused on ways to find the “Elusive End User”. Now everyone is a consumer of digital information, and the demand for information is increasing rapidly. Barriers to information access come and go, which creates demand.
For example, access to medical services and first-hand medical information was restricted by health care providers, so the Internet has stepped in to meet that need; some of that information is of high quality, but much is misleading. In October of this year, the U.S. adopted ICD-10 for coding diseases on medical insurance claim forms, years after the rest of the world. The CMS (Centers for Medicare and Medicaid Services) has also mandated that more information be made available to patients by providing relevant content at the time of service. The flip side is heavy fines and slow cash flow for providers who do not code accurately using the new system.
- Google Scholar comes of age
At first, Google Scholar terrified scholarly publishers. Now, however, they strive to ensure that their articles are all indexed by the Scholar crawlers for maximum exposure. While Impact Factors for authors and publishers are still important, the trend seems to be to strive to get your publications among the top results in Google. The new game is to use Google as a springboard to more content: surface the data in Google, and lead the user to a deep dive on the publisher’s site.
- Semantics, Linked Data, and the Cloud
“In the beginning was the word…” Without the words there is no way to express a thought. Words express meaning. Words are the semantics. In Pygmalion, Professor Higgins notes that “the moment an Englishman opens his mouth another Englishman despises him” for his manner of expression and his accent. On top of that, we now have incredible limits on expression applied by political correctness and the thought police. So individual expression becomes guarded and coded, and moves underground.
Taxonomies make semantic inferences much more reliable. Disambiguation of terms and gathering of synonyms have to be done within context; once the context is established, inferences can be reliably drawn. The sentence “George lives in London” makes no sense without knowing which London and which George are meant.
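The “which London?” problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate list, the context terms attached to each candidate, and the simple overlap scoring are all hypothetical, standing in for what a real taxonomy or gazetteer would supply.

```python
import re

# Hypothetical candidate entities for the ambiguous name "London",
# each with context terms a taxonomy or gazetteer might associate with it.
CANDIDATES = {
    "London": [
        {"id": "London, England, UK",
         "context": {"england", "thames", "uk", "westminster"}},
        {"id": "London, Ontario, Canada",
         "context": {"ontario", "canada", "western"}},
        {"id": "London, Kentucky, USA",
         "context": {"kentucky", "usa", "laurel"}},
    ],
}


def disambiguate(name, surrounding_text):
    """Pick the candidate whose context terms overlap most with the text.

    Returns None when no candidate has any supporting context.
    """
    words = set(re.findall(r"[a-z]+", surrounding_text.lower()))
    best, best_score = None, 0
    for candidate in CANDIDATES.get(name, []):
        score = len(candidate["context"] & words)
        if score > best_score:
            best, best_score = candidate["id"], score
    return best


print(disambiguate("London", "George lives in London"))
print(disambiguate(
    "London",
    "George lives in London, near the University of Western Ontario in Canada",
))
```

The bare sentence yields `None` because nothing in it establishes context; once the surrounding text mentions Ontario and Canada, the Canadian city is selected.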
Linked Data (and Linked Open Data) are becoming much more prevalent because of the context and underlying data structures and definitions they offer. Open vocabularies and datasets, and the links between them (which clarify the terms used), are increasingly available.
All of this interlinking of data is enabled by the ability to link things via the universally available Internet. Whether in a closed corporate or government system, the open web, or some combination, entire systems are being moved to the “Cloud”: that place where any information object can be reached with a URL (and perhaps appropriate access and permissions). Over the last year we have seen a massive migration from installed systems to Cloud access, more than double what we saw in 2014. I believe this trend will accelerate.
- Boomers are not retiring
It isn’t just that they are getting older and might not have enough money; they really don’t want to give up and transition out of the workforce. They like what they are doing, feel like they have a positive contribution to make, and would like to continue to do so.
All of the rules for ADA compliance that governments and organizations have been implementing over the last few years have enabled people with age-related disabilities to continue to work:
In the meantime, some 92 million “Millennials” are entering the workforce with different mindsets, work approaches, and information-gathering methods.
While the 77 million in the Baby Boomer class maintain some of the old approaches to information, the new, born-digital set has different expectations…and both have the same informational needs:
The workplace is changing. Seniority is being replaced by ability. Increasingly, capability determines advancement instead of age. While politicians campaign on income equality, the workplace is adjusting to a whole new playbook based on what people of any age contribute to the products and services of the organization. We are moving to a sharing economy, immediate information access, and constant social interactions. This means paywalls are avoided, ads are tolerated, and privacy is not a concern, but identity theft is. In these times of global strife, continuing economic uncertainty, and technological change, the workforce may well become more like the Millennials: fast-adapting, flexible, and innovative.
2016 is a whole new world! We look forward to it.
Marjorie Hlava, President
Jay Ven Eman, CEO
Access Innovations, Inc.