Diffbot has announced the Global Index, a new structured database that can automatically sort the web into human-like categories of knowledge. Marketing Land brought this news to our attention in its article, “Data Extractor Diffbot Wants To Turn The Web Into The Semantic Web.”

The Global Index is currently in closed beta but will move to general availability after the next round of funding. Diffbot specializes in automatically extracting unstructured content from web pages, categorizing it using artificial intelligence, computer vision, and natural language processing, and then storing it by data type in a structured database.

Aspiring toward the semantic web, this new tool creates semantic web content – information that is characterized by its meaning. The Global Index is comparable to Google’s Knowledge Graph, which also organizes information on the web into usable, related knowledge. The difference is availability. The Knowledge Graph is based on Wikipedia, several other sources, and human effort, and it is available only through Google’s search engine, while the Global Index will shortly be open to the public.

Melody K. Smith

Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.