Knowledge-based Artificial Intelligence. That’s the direction taken by startup Cognonto, co-founded by Michael Bergman, a man whose history in the AI, Machine Learning, Semantic technologies, Internet search and data arenas goes back a long way. That includes his additional duties as CEO of Structured Dynamics, birthplace of UMBEL (Upper-level Mapping and Binding Exchange Layer), a knowledge graph and vocabulary for interoperating Web-accessible information, which had its latest update in May.
As for the new Cognonto venture, whose initial fruits are the Cognonto Platform and the KBpedia knowledge structure, Bergman says it has been in gestation for about eight years.
“The ‘aha’ moment came when we realized how many of the large-scale QA systems were basing their knowledge structure around Wikipedia,” Bergman says. “We realized this was a huge storehouse of very useful information, but one that everyone reinvented every time they brought in their own system,” from Siri to Viv to IBM Watson and the Google Knowledge Graph.
What could better support Artificial Intelligence apps, specifically, was systematizing the organization of critical knowledge bases – the very often leveraged Wikipedia, Wikidata, GeoNames, OpenCyc, DBpedia and UMBEL – into a single structure. Instead of taking a bespoke approach, Bergman envisioned creating a resource that could be shared by multiple companies.
“We had done a lot of Wikipedia mapping work and Semantic Web work for a number of years and this gave us a real focus,” he says. “If we were purposeful around this, we could create a resource to make AI apps a lot faster and more efficient.”
Bergman and his team began more purposefully seeking customers for what would emerge as the Cognonto venture a few years back, focusing on companies that needed knowledge-based applications and were more closely aligned with its R&D interests. “I think our customers were evolving as we were,” says Bergman. Those customers were looking to solve particular knowledge problems with a Semantic approach, but they needed knowledge bases underneath all that to be productive.
“We were rapidly moving to more knowledge-oriented apps to do things like entity recognition and tagging and categorization” – things that, at the time, the company viewed as more Semantic technology-based. “But really the key thread in all this is that they also were knowledge-based,” he says. There was a growing realization, he notes, that it is the knowledge component, matched with Semantic technologies, that makes knowledge-based AI possible.
New Mindset for a New Solution
KBpedia is the knowledge structure component that combines the six major knowledge bases mentioned above – with their hundreds of thousands of concepts and 20 million entities – plus mappings to another 20 knowledge bases. Governing this all is a knowledge graph, or schema, that Cognonto calls the KBpedia Knowledge Ontology (KKO).
Bergman says there was no adequate off-the-shelf way to bring together the various knowledge bases the company needed. That led him to the work of Charles Sanders Peirce, a 19th century American mathematician, philosopher, and polymath, and one of the first individuals to formulate the logic of signs. Bergman was impressed by Peirce’s approach to organizing upper-level ontologies, and it is in large part the genesis of KKO.
“There is a firm philosophical logical grounding to the system,” he says, which helps in dealing with some hard-to-crack modeling issues. There are many different approaches to establishing a logical basis for ontologies, but Peirce basically says to base everything around threes, explains Bergman: the object itself; what a particular agent perceives about the object; and the way that agent needs to try to communicate what that is. “Without that triad it’s hard to ever get at differences of interpretation, context or meaning,” he says, whether that be between something like events and activities or individuals and classes.
Once you adopt that mindset, a lot of things that seemingly were irreconcilable differences begin to fall away, and the categorization of information becomes really very easy and smooth, he says. “You realize that things people often times argue about often is a matter of a difference of perspective,” Bergman notes. With this approach, it becomes easy to bring other data sources onboard the structure, as well, including a company’s own data sources.
“That triadic mindset provides a ready means for capturing the differences everyone has about words’ meaning and how you understand things” – perspective and context – he says. “By explicitly incorporating that from the get-go you can explicitly incorporate differences of interpretation.” Not a whole lot of upper structure is needed to capture that mindset, with the highest levels of KKO only having 144 concepts all tying into real-world things, including ideas. Whether products, people, animals, or landscapes at the upper structure, “each one is organized according to a natural classification scheme we call a typology,” he says.
While he acknowledges that this may all sound esoteric, it is very practically “a powerful, effective way for us to deal with formerly intractable modeling issues.”
The Cognonto Platform is a generic system for managing and accessing the knowledge structure, be it KBpedia or some other Semantic technology platform. “The platform can talk to and manage any knowledge graph,” he says. Critically, it also includes a suite of building and testing infrastructure scripts. All the knowledge base sources Cognonto relies on constantly change, and there needs to be a way to keep the overall system current. “This way we have the ability to automatically build and test for consistency and logic and coherence so we can constantly keep the system updated,” he says.
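To make the idea of automated build-time testing concrete, here is a minimal sketch of two checks one might run over a merged concept hierarchy. It is not Cognonto's build infrastructure, whose scripts the article does not describe. The toy data and the function names `find_undefined` and `find_cycle` are hypothetical, and the two checks shown (dangling references and subclass cycles) are only assumed examples of what "consistency and coherence" testing could include.

```python
from collections import defaultdict

# Hypothetical toy data: (child, parent) subclass links merged from several sources.
SUBCLASS_LINKS = [
    ("Dog", "Mammal"),
    ("Mammal", "Animal"),
    ("Animal", "LivingThing"),
    ("Sparrow", "Bird"),  # "Bird" is deliberately missing from DEFINED_CONCEPTS
]
DEFINED_CONCEPTS = {"Dog", "Mammal", "Animal", "LivingThing", "Sparrow"}


def find_undefined(links, defined):
    """Return concepts that appear in a link but were never defined."""
    referenced = {name for link in links for name in link}
    return sorted(referenced - defined)


def find_cycle(links):
    """Return one subclass cycle as a list of concepts, or None if there is none."""
    parents = defaultdict(list)
    for child, parent in links:
        parents[child].append(parent)

    state = {}  # concept -> "visiting" (on the current path) or "done"

    def visit(node, path):
        state[node] = "visiting"
        path.append(node)
        for parent in parents.get(node, []):
            if state.get(parent) == "visiting":
                # We walked back onto the current path: report the loop.
                return path[path.index(parent):] + [parent]
            if parent not in state:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        state[node] = "done"
        return None

    for concept in list(parents):
        if concept not in state:
            cycle = visit(concept, [])
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print("Undefined concepts:", find_undefined(SUBCLASS_LINKS, DEFINED_CONCEPTS))
    print("Cycle:", find_cycle(SUBCLASS_LINKS))
```

Checks of this kind could be rerun each time an upstream source changes, so that a broken link or a circular definition is caught before the updated structure is put to use.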
In most cases, these capabilities serve as a starting foundation rather than an off-the-shelf answer for integrating enterprise domain information and perspectives to serve individual needs. Everything Cognonto has undertaken has required such tailoring, so it offers its system as a dedicated SaaS deployment rather than as a standard SaaS offering. It delivers the Cognonto platform along with KBpedia for mapping assistance, then gives customers the keys to utilize and manage the system as their needs dictate. “It’s a dedicated system because it’s unique to their circumstance,” he says.
Getting to Work with Knowledge-Based Artificial Intelligence
The company just published new use cases on applying the technology to enterprise system needs. They include integrating enterprise and domain data with KBpedia to tag entities as comprehensively and accurately as possible for specific enterprise needs; using the Cognonto Mapper to link external data and schema so they are more relevant to specific domain needs or problem areas; and building ‘word embedding’ models with word2vec. In the last case, the rich structure in KBpedia is used to create training corpora for word2vec rapidly and cheaply, on the fly, in order to cluster or classify documents by topic, or to characterize them by sentiment or for recommendations.
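As a rough illustration of the word2vec use case, the sketch below slices a toy concept tree at a chosen domain and turns the descriptions of everything under it into tokenized sentences. The data, the names `CONCEPT_TREE`, `DESCRIPTIONS`, `concepts_under` and `build_corpus`, and the tokenizer are all hypothetical; the article does not describe how Cognonto actually assembles its corpora. The output is a list of token lists, which is the input shape that word2vec implementations such as gensim's `Word2Vec` accept.

```python
import re

# Hypothetical toy slice of a knowledge structure: concept -> direct sub-concepts.
CONCEPT_TREE = {
    "Animal": ["Mammal", "Bird"],
    "Mammal": ["Dog"],
    "Bird": [],
    "Dog": [],
    "Vehicle": ["Car"],
    "Car": [],
}

# Hypothetical short descriptions attached to each concept.
DESCRIPTIONS = {
    "Animal": "An animal is a living organism that feeds on organic matter.",
    "Mammal": "A mammal is a warm-blooded animal that nurses its young.",
    "Bird": "A bird is a feathered animal that lays eggs.",
    "Dog": "A dog is a domesticated mammal.",
    "Vehicle": "A vehicle is a machine that transports people or cargo.",
    "Car": "A car is a road vehicle with four wheels.",
}


def concepts_under(root, tree):
    """Return the root concept and all of its descendants."""
    found, stack = [], [root]
    while stack:
        concept = stack.pop()
        if concept in found:
            continue
        found.append(concept)
        stack.extend(tree.get(concept, []))
    return found


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_corpus(root, tree, descriptions):
    """Build tokenized sentences for every described concept in the chosen domain."""
    return [
        tokenize(descriptions[concept])
        for concept in concepts_under(root, tree)
        if concept in descriptions
    ]


if __name__ == "__main__":
    for sentence in build_corpus("Animal", CONCEPT_TREE, DESCRIPTIONS):
        print(sentence)
```

Changing the root from "Animal" to "Vehicle" yields a different domain-specific corpus from the same structure, which is the sense in which such corpora can be produced on the fly.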
Because the system is richly structured and logically and coherently organized, enterprises have a rich pool of features (i.e., variables) upon which to do Machine Learning. “One of the key advantages of the system is in supervised learning,” he says. There’s generally a lot of expense and time spent in labeling training sets, “but because of everything being well-structured we can just issue a couple of queries and actually get completely labeled positive and negative training sets almost instantaneously, and do similar things to create gold standards,” Bergman explains. The company also discovered that it can use the same relevant-domain slicing approach for unsupervised and deep learning, thanks to the rich and clearly defined input structures.
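The point about labeled training sets can be sketched in a few lines. Assuming entities are typed against a concept hierarchy, a single query over that hierarchy splits them into positive and negative examples for a target concept. Everything here is a hypothetical illustration: the toy data, the names `PARENT`, `ENTITY_TYPES`, `is_a` and `labeled_sets`, and the rule that every entity outside the target counts as a negative.

```python
# Hypothetical toy typology: concept -> parent concept.
PARENT = {
    "Dog": "Mammal",
    "Mammal": "Animal",
    "Bird": "Animal",
    "Car": "Vehicle",
}

# Hypothetical entities, each assigned to one concept.
ENTITY_TYPES = {
    "Lassie": "Dog",
    "Tweety": "Bird",
    "Herbie": "Car",
    "KITT": "Car",
}


def is_a(concept, target, parent):
    """Return True if concept equals target or sits anywhere beneath it."""
    seen = set()
    while concept is not None and concept not in seen:
        if concept == target:
            return True
        seen.add(concept)  # guards against accidental loops in the data
        concept = parent.get(concept)
    return False


def labeled_sets(target, entity_types, parent):
    """Split entities into (positives, negatives) for the target concept."""
    positives, negatives = [], []
    for entity, concept in entity_types.items():
        bucket = positives if is_a(concept, target, parent) else negatives
        bucket.append(entity)
    return sorted(positives), sorted(negatives)


if __name__ == "__main__":
    positives, negatives = labeled_sets("Animal", ENTITY_TYPES, PARENT)
    print("Positive examples:", positives)
    print("Negative examples:", negatives)
```

No hand labeling is involved: the labels fall out of the structure, and choosing a narrower target such as "Mammal" produces a different pair of sets from the same data.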
While Bergman’s business history has mostly revolved around operating open source shops, that’s not the case here, aside from the upper ontology. “All the goodies underneath the hood at this point aren’t open-sourced,” he says. “It’s a little against our nature but it reflects what we see as the value this offers in so many cost and time advantages.”