Semantic Technology Conference Celebrates 10 Years

July 22, 2014  
Posted in News, semantic, Technology

DATAVERSITY™ Education, LLC, and SemanticWeb.com have released the agenda and opened registration for the 10th annual Semantic Technology & Business Conference (SemTechBiz). This interesting news came from Semantic Web in their article, “Semantic Technology and Business Conference Announces Agenda and Opens Registration.”

SemTechBiz brings together today’s industry thought leaders and practitioners to explore the challenges and opportunities jointly impacting both corporate business leaders and technologists. This year’s conference will take place in San Jose, California, August 19-21, 2014, at the San Jose Convention Center. For the first time, the conference will be co-located with the NoSQL Now! Conference. To view the three-day agenda or learn more about the conference speakers and registration options, visit www.SemTechBiz.com.

Melody K. Smith

Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.

Searching and Finding

July 18, 2014  
Posted in News, search, semantic

Do the current methods of searching the Internet actually have an expiration date? The scientists at the Defense Advanced Research Projects Agency (DARPA) believe that just may be the case. Their goal is to revolutionize the discovery, organization, and presentation of results. Network World brought this news to us in their article, “DARPA seeks the Holy Grail of search engines.”

DARPA’s Memex program is designed to develop software that will enable domain-specific indexing of public web content and domain-specific search capabilities. The technologies developed in the program will also provide the mechanisms for content discovery, information extraction, information retrieval, user collaboration, and other areas needed to address distributed aggregation, analysis, and presentation of web content.

Semantic technology continues to grow and expand its uses. Search is just one of those. Access Innovations, developer of the M.A.I. machine-assisted indexing system, specializes in complex coding, tagging, and indexing.

Melody K. Smith

Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.

The Semantics of Search

July 15, 2014  
Posted in News, search, semantic

Semantic search continues to develop and become even more integral to information management, but how does it affect your efforts to identify and employ keywords? This is assuming that keywords even matter anymore. How does it help make your content findable? This interesting topic was found on Brandpoint in their article, “Are Keywords Necessary in a Semantic Search World?”

Semantic search returns much more specific answers to search queries, rather than the ranked list of approximations of what Google guessed the user meant. In the pre-Hummingbird days, content was created around keywords, which were then strategically placed in the hopes of attracting links back to the website, increasing its rank. The more links you had, the more confidence search engines placed in the trustworthiness of the website, and this was then reflected in the search engine results pages for the website, producing more visitors and ultimately more revenue.

Even with semantic technology powering search, information management for any type of business is critical for fast, easy, and comprehensive findability. One key way to ensure this is through a solid taxonomy, based on standards, built by someone with years of experience in the field.

Melody K. Smith

Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.

The Value of Your Data

July 10, 2014  
Posted in Autoindexing, metadata, News, semantic

Data is everywhere, and the healthcare industry is fast becoming its largest producer. Combine that with the many other health information technology initiatives under way – electronic health records, the ICD-10 coding classification transition, Affordable Health Care, Meaningful Use – and healthcare organizations need whatever help they can find. EHR Intelligence brought this interesting information to us in their article, “Can the semantic web revolutionize healthcare analytics?”

It appears that analytics could be of great use here, but with everything else going on, the analytics landscape looks hopelessly complicated and intimidatingly expensive. It is important to understand the value of your data and how analytics can provide strategic direction in these times of change.

By eliminating the complexity of a solutions-based approach, customers have greater findability and better results. It is so very important to choose a product that makes your content findable – easily and thoroughly. Access Innovations is one of a very small number of companies able to help its clients generate ANSI/ISO/W3C-compliant taxonomies and associated rule bases for machine-assisted indexing.

Melody K. Smith

Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.

Counting Calories Is So Yesterday

July 9, 2014  
Posted in News, semantic

Semantic technology has entered the kitchen, or at least, the kitchen arena. Apps, software and services that provide us with in-depth information about the food we eat have never been more popular. Consumers want to know more about their food than ever before. Enter software developers who are working fast and furiously to meet that demand. This interesting information came from Factor-Tech in their article, “Thirst for data: Nutrition apps and software to be revolutionised by semantic technology.”

Klappo is changing the status quo by collecting in-depth nutritional data from a vast range of sources, including recipes, restaurant menus, and packaging labels. The results are far more extensive than you would get from a typical food label. Using semantic technology, the system reads the instructions to create an accurate picture of what the cooking process is and what the resulting nutritional information will be. This is a far cry from counting calories from ingredients.

Melody K. Smith

Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.

The New Search

July 8, 2014  
Posted in News, search, semantic

Digital marketing is changing in every industry, but none more than the insurance industry. There are tactics more appropriate and effective than the search engine optimization and keyword linking of days gone by. Google has made that very clear. Using proper keywords and acquiring backlinks is still a relevant strategy for ranking your insurance agency or carrier website in search engines, but the volume of keywords and accumulation of backlinks must look natural.

This change is being driven by semantic search. This kind of search can make associations between things in ways that come closer to how we humans make such connections. Search results that provide resources more relevant to our actual needs are therefore of higher value than link-driven results. This interesting information came from Small Business Trends in their article, “How Semantic Search is Changing Insurance Industry Digital Marketing.”

Melody K. Smith

Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.

The Science of Search

July 3, 2014  
Posted in News, search, semantic

Mostly known as a method for technical writers, structured authoring is fast becoming the preferred choice for enterprise-level content production. This interesting information came from ClickZ in their article, “Structured Authoring: The New Normal in Content Production.”

The science of information retrieval and processing is ever-changing. The evolution is evidenced by last fall’s Hummingbird update and the changes at Schema. Adjusting to semantic search in this evolutionary process is key if marketers want to keep up. Consider it much like the impact the printing press had when technology hit the print world.

To be most effective, modern content needs to be machine-readable for semantic technologies. It needs to be responsive to users at ever-more-granular levels and easily deployed on mobile and stationary devices alike.

Melody K. Smith

Sponsored by Data Harmony, a unit of Access Innovations, the world leader in indexing and making content findable.

The Time of Data

June 27, 2014  
Posted in metadata, News, semantic

Informatica has announced its Intelligent Data Platform that is designed to provide “the right data at the right time.” Sci-Tech Today brought us this information in their article, “Informatica Intros Its Planned Intelligent Data Platform.”

There are three components to Informatica’s Intelligent Data Platform. A data intelligence layer delivers self-service data for businesses, collecting metadata, semantic data and usage information. A second component, the infrastructure, offers clean and connected data. And a data engine aggregates and manages data.

Not yet a completed product, the platform is being developed as “a combination of existing Informatica platform capabilities and new product initiatives, some of which are in early beta testing.” Apparently some platform capabilities will be available as packaged offerings and reference architecture by the end of this year.

Melody K. Smith

Sponsored by Access Innovations, the world leader in taxonomies, metadata, and semantic enrichment to make your content findable.

Rule Base Solutions

People often ask us how much time it will take to manage a rule base with Data Harmony software. We reply with specific customer experience numbers and tell them that it takes a few hours per month of editorial time to maintain both the thesaurus and the rule base. One customer of ours, the American Institute of Physics, found that maintaining their thesaurus and rule base takes less than 15 hours per month for a throughput of 2,000 articles per week. Another customer, The Weather Channel, manages breaking news all day long with four hours per month of maintenance. It takes the editorial team just a few hours per month to keep up with the changing trends and events within their field and transfer those into the organizational knowledge base represented by the M.A.I.™ rule base. This is a small investment that provides the organization with the highest level of accuracy in coding (usually well over 90% hits without human intervention). It also supports analysis of the trends in the business, the creation of author profiles, semantic fingerprints of the entire organizational holdings, and extraction of real meaning for all the data. Other customers, such as IEEE and the US GAO, find the accuracy of their Data Harmony software implementations so high that they now only sample the data periodically to glean new terms and trends. They do not see the need to review every single item.

The real question, though, should be a matter of control. If a rule-based solution maintained by the editorial staff is the approach taken, then full control remains with the editorial department. If a programmatic learning system – the seductive call of the purely automatic system – is the choice, then oversight either remains with the vendor or moves to the IT (information technology) department. The lower accuracy of the indexing returns (usually in the 60% range) means much more time spent by the editorial department on the production of the taxonomy-tagged items. The time that would have been spent improving the knowledge base is instead spent processing records in production, due to lower accuracy levels.

Here’s an example: let’s assume 1000 articles per month. Using 90% accuracy versus 60% accuracy, how much extra production time is involved? Let’s also suppose, for easy calculations, that there are 10 terms per article. If our rule base indexing is 90% accurate, then only one term per article will need to be reviewed, researched, and replaced or discarded. If alternative indexing methods produce 60% accuracy, then there are four terms per record to research, replace, or discard. The time to research a term and decide on its disposition is conservatively two minutes. So two minutes per term at one term per article, across 1000 articles, is 2000 minutes, or just 33.3 hours per month. But if four terms (60% accuracy) need reviewing, then 133.3 editorial hours per month are needed – obviously, four times the effort. Moreover, the rule base improves over time with this small editorial input, so the maintenance time continues to decrease.
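The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `review_hours` and its parameters are made up for this example and are not part of Data Harmony, and the figures are the assumptions stated in the paragraph above (1000 articles per month, 10 terms per article, two minutes per term).

```python
def review_hours(articles_per_month, terms_per_article, accuracy_pct,
                 minutes_per_term=2):
    """Estimate monthly editorial hours spent fixing inaccurate index terms.

    Hypothetical helper for illustration only; accuracy is given as a
    whole-number percentage to keep the arithmetic exact.
    """
    # Terms per article that an editor must research, replace, or discard.
    wrong_terms_per_article = terms_per_article * (100 - accuracy_pct) / 100
    total_minutes = articles_per_month * wrong_terms_per_article * minutes_per_term
    return total_minutes / 60


rule_based = review_hours(1000, 10, 90)   # one term per article to review
statistical = review_hours(1000, 10, 60)  # four terms per article to review

print(f"90% accuracy: {rule_based:.1f} hours per month")
print(f"60% accuracy: {statistical:.1f} hours per month")
print(f"Ratio: {statistical / rule_based:.1f}x")
```

Changing the article volume or the minutes-per-term estimate scales both figures equally, so the fourfold difference depends only on the two accuracy levels.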

A statistical approach can appear to be a gift on a silver platter, but beware – such an approach means more time spent on production, less on building a knowledge base, lower accuracy, higher throughput costs, and no chance to learn about the data through semantic fingerprinting. To make matters even more frustrating, you have little control of the system. It has to be improved and worked on by the vendor or the IT department. New terms require a full revamping of the system each time, resulting in costly delays, rather than the real-time, instant updates that a system based on Java object-oriented programming allows. As a result, the taxonomy is not responsive to the organization’s data.

It is tempting to think that the classification of content can be done without the use of a vetted taxonomy properly applied or that the taxonomy only provides a convenient file folder naming convention. Unfortunately, the cost is high to make that choice. The accuracy is lower, the throughput is slower, and the clerical aspect of the indexing process is increased when you use a statistical system. In addition, control is no longer with the editorial department, but shifted to IT and the vendor. The power dynamic of the choice is clear: IT versus editorial. Who do you want to be in control of your indexing?

Marjorie M.K. Hlava
President, Access Innovations

Data Harmony Version 3.9 Includes MAI Batch GUI – A New Interface For M.A.I.™ (Machine Aided Indexer) and MAIstro™ Modules

June 16, 2014  
Posted in Access Insights, Featured, metadata, semantic

Access Innovations, Inc. has announced the inclusion of the MAI Batch Graphical User Interface (GUI) as part of the recent Data Harmony Version 3.9 software update release. MAI Batch GUI is a new interface for running a full directory of files through the M.A.I. Concept Extractor. This tool enables processing of large amounts of text through the Data Harmony M.A.I. Concept Extractor with a single command. Usually used in working with legacy or archival files, it allows complete semantic enrichment of entire back files in a short time. Once run, the taxonomy terms from a thesaurus or taxonomy become part of the record itself.

“For Data Harmony Version 3.9, we decided to add the interface to the MAIstro and M.A.I. modules to allow use directly from the desktop, giving more power to the user,” remarked Marjorie M. K. Hlava, President of Access Innovations, Inc. “It’s a fast, easy way to perform machine-aided indexing on batches of documents, without any need for command-line instructions.”

“M.A.I.’s batch-indexing capability has been in place for years via command line interface,” noted Bob Kasenchak, Production Manager at Access Innovations. “This new GUI makes it really easy to use. Customers only need to open ‘MAI Batch app’ in their Data Harmony Administrative Module, choose the files or directories to process, and submit the job.”

The purpose of MAI Batch is to provide immediate processing of data files on demand. MAI Batch can be deployed to achieve rapid subject indexing of legacy text collections.

MAI Batch GUI offers semantic enrichment by extracting concepts from input text in most file formats, including the following:

  • Adobe PDFs
  • MS Word DOC files
  • HTM/HTML pages
  • RTF documents
  • XML files

For XML files, the ‘XML Tags’ option permits users to define specific XML elements for MAI Batch GUI to analyze during batch processing. This option opens the door for indexing source documents that are tagged according to different XML schemas. XML Tags also permits the exclusion during indexing of sections in the document structure, as designated by the user.
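The general idea of limiting indexing to chosen XML elements, while skipping excluded sections, can be sketched with Python's standard library. This is not how MAI Batch GUI is implemented; the function `extract_text`, the tag names, and the sample document are all hypothetical and only illustrate the concept.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_string, include_tags, exclude_tags=()):
    """Collect text from included elements, skipping excluded subtrees.

    Hypothetical illustration of tag-based selection, not a Data Harmony API.
    """
    root = ET.fromstring(xml_string)
    chunks = []

    def walk(element, inside_included):
        # An excluded element and everything beneath it is ignored.
        if element.tag in exclude_tags:
            return
        inside = inside_included or element.tag in include_tags
        if inside and element.text and element.text.strip():
            chunks.append(element.text.strip())
        for child in element:
            walk(child, inside)
            # Text following a child belongs to the parent element.
            if inside and child.tail and child.tail.strip():
                chunks.append(child.tail.strip())

    walk(root, False)
    return " ".join(chunks)


sample = (
    "<article><title>Solar flares</title>"
    "<body>Flares disrupt <i>radio</i> signals."
    "<references>Smith 2010</references></body>"
    "<meta>internal</meta></article>"
)

print(extract_text(sample, {"title", "body"}, {"references"}))
```

Because the included and excluded tag sets are passed in as parameters, the same function can handle documents tagged under different schemas.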

The interface’s Input and Output panes present a practical view of the batch during processing, enabling a degree of interactivity – M.A.I. is a very accessible automatic indexing system. It’s a ‘machine-aided’ software approach, even when applied to batches of documents. IT support is helpful but not required to process and maintain the Data Harmony Suite of products.

When the documents already contain indexing terms, MAI Batch GUI will derive accuracy statistics for inclusion in the output, logging the statistics of indexing accuracy for the batch. M.A.I. calculates the indexing accuracy of its suggested terms from Concept Extractor compared to the previously-applied subject terms. This powerful method for enhancing the accuracy of subject indexing is based on reports generated by the M.A.I. Statistics Collector, giving a taxonomy administrator all the data needed to continually improve the results based on the system recommendations, selections, and additions.
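One simple way to compare suggested terms with previously applied terms is sketched below. The source does not describe how the M.A.I. Statistics Collector actually computes its figures, so the hits, misses, and noise breakdown, the function name `indexing_stats`, and the sample terms are all assumptions made for illustration.

```python
def indexing_stats(suggested, applied):
    """Compare suggested index terms with previously applied terms.

    Hypothetical sketch: terms are matched case-insensitively after
    trimming whitespace. Not the Data Harmony calculation.
    """
    suggested_set = {term.strip().lower() for term in suggested}
    applied_set = {term.strip().lower() for term in applied}

    hits = suggested_set & applied_set     # suggested and already applied
    misses = applied_set - suggested_set   # applied but not suggested
    noise = suggested_set - applied_set    # suggested but not applied

    precision = len(hits) / len(suggested_set) if suggested_set else 0.0
    recall = len(hits) / len(applied_set) if applied_set else 0.0
    return {
        "hits": len(hits),
        "misses": len(misses),
        "noise": len(noise),
        "precision": precision,
        "recall": recall,
    }


stats = indexing_stats(
    suggested=["Plasma physics", "Optics", "Lasers"],
    applied=["optics", "lasers", "Acoustics"],
)
print(stats)
```

Summing the hit, miss, and noise counts over every document in a batch would give batch-level figures of the kind a taxonomy administrator could use to decide which rules to refine.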

About Access Innovations, Inc. – www.accessinn.com, www.dataharmony.com, www.taxodiary.com

Founded in 1978, Access Innovations has leveraged semantic enrichment of text for internet technology applications, master data management, database creation, thesaurus/taxonomy creation, and semantic integration. Access Innovations’ Data Harmony software includes machine aided indexing, thesaurus management, an XML Intranet System (XIS), and metadata extraction for content creation developed to meet production environment needs.  Data Harmony is used by publishers, governments, and corporate clients throughout the world.
