Semantic technology is expanding well beyond its origins as a more advanced search engine. Besides giving scientists a more capable search tool, it is now being used to improve artificial intelligence and machine learning.
Semantic technology uses a variety of tools and methods designed to add “meaning” to a computer’s understanding of data.
When asked a question, rather than simply searching for keywords, semantic technologies will explore a wide variety of resources for topics, concepts, and relationships. In the financial and science industries, companies have begun to semantically “enrich” content, processing complex data from a variety of sources.
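The difference between keyword matching and concept-based matching can be sketched in a few lines of Python. This is only an illustration: the concept map, the documents, and both function names are hypothetical, and real semantic search systems rely on far richer ontologies and language models.

```python
# Hypothetical concept map: each concept lists phrases that express it.
CONCEPTS = {
    "myocardial infarction": {"myocardial infarction", "heart attack"},
}

# Hypothetical documents, keyed by id.
DOCS = {
    "d1": "Patient admitted after a heart attack.",
    "d2": "Study of myocardial infarction outcomes.",
    "d3": "Quarterly earnings report.",
}


def keyword_search(query, docs):
    """Return ids of documents that contain the literal query phrase."""
    q = query.lower()
    return sorted(doc_id for doc_id, text in docs.items() if q in text.lower())


def concept_search(query, docs, concepts):
    """Expand the query to every phrase that shares a concept with it,
    then return ids of documents containing any of those phrases."""
    q = query.lower()
    terms = {q}
    for labels in concepts.values():
        if q in labels:
            terms |= labels
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if any(term in text.lower() for term in terms)
    )


print(keyword_search("heart attack", DOCS))
print(concept_search("heart attack", DOCS, CONCEPTS))
```

The keyword search finds only the document that literally says "heart attack", while the concept search also returns the document that uses the clinical term.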
In the world of publishing and media, organizations like the BBC, Springer Nature, and the Financial Times are using semantic technology to make knowledge discovery more efficient.
Other industries, ranging from the energy sector to e-commerce to the U.S. government, are using semantic technology to improve their research.
Semantic technology has evolved significantly in the last few years, and some predict it will soon become commonplace in online research. It is a powerful tool that can recognize themes and concepts automatically, drawing on a large number of sources.
According to Marco Varone, CTO and founder of expert.ai:
“A lot of things are happening in the semantic language understanding space. Many more things have happened in the last three, four years than in the previous 10 to 15. In the last few years, the change has been from experiments in semantics and language, to real projects.”
Semantically Enriched Metadata
CEDAR (the Center for Expanded Data Annotation and Retrieval) has created tools and services that will semantically enrich metadata with ontology terms.
Their software package, called Workbench, helps scientists develop and publish metadata that describes scientific experiments. There has been significant interest in developing metadata standards that scientists can use to annotate their published articles.
With Workbench, scientists can create well-targeted metadata and submit it to public repositories. By using semantically enriched metadata descriptions, which include themes and concepts, scientists can make their published experiments more readily available for other scientists to find.
The process of adding semantic metadata to enrich content is often referred to as “semantic tagging.” Tagging can be embedded into XML files directly, or tags can be held externally inside databases and content management systems. When content is not easily accessible for tagging — for example, when it is made up of images or videos, and not text — tags can be placed inside metadata headers.
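As a minimal sketch of the first two storage options, the Python below embeds a tag directly in an XML document and also holds the same tag in an external lookup. The element names, the `concept` attribute, and the ontology URI are hypothetical and chosen only for illustration; they are not taken from CEDAR or any particular tagging standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical article fragment; element names are illustrative only.
SOURCE = "<article id='a1'><p>Aspirin reduces fever.</p></article>"

# Hypothetical ontology URI standing in for a real concept identifier.
ASPIRIN_URI = "http://example.org/ontology/Aspirin"


def embed_tag(xml_text, term, concept_uri):
    """Wrap the first occurrence of `term` in each <p> with a <tag> element."""
    root = ET.fromstring(xml_text)
    for p in list(root.iter("p")):
        text = p.text or ""
        idx = text.find(term)
        if idx == -1:
            continue
        tag = ET.Element("tag", {"concept": concept_uri})
        tag.text = term
        tag.tail = text[idx + len(term):]  # text that follows the term
        p.text = text[:idx]                # text that precedes the term
        p.insert(0, tag)
    return ET.tostring(root, encoding="unicode")


def external_tags(doc_id, term, concept_uri):
    """Hold the same tag outside the document, keyed by document id."""
    return {doc_id: [{"term": term, "concept": concept_uri}]}


embedded = embed_tag(SOURCE, "Aspirin", ASPIRIN_URI)
external = external_tags("a1", "Aspirin", ASPIRIN_URI)
print(embedded)
print(external)
```

Embedding keeps the tag next to the text it describes, while the external form leaves the source document untouched, which matters when the content cannot be edited.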
The CEDAR Workbench was designed for the biomedical community, but provides a model of metadata enrichment tools for other industries.
The Semantic Web
Many consider 2001 to be the birth year of the semantic web. Use of the semantic web is expected to increase significantly over the next few years, particularly in the science and medical communities. This extension of the World Wide Web translates internet data into machine-readable data. It uses technologies like RDF (Resource Description Framework) and OWL (Web Ontology Language).
Websites can expose their semantics by embedding RDF statements within their webpages. There are a variety of ways to accomplish this:
- RDFa
- RDF/XML
- RDF/JSON
- JSON-LD
- Microdata
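To make one of these options concrete, here is a small sketch of JSON-LD built with the Python standard library. The schema.org vocabulary is real, but the article values and the helper name `to_jsonld_script` are made up for illustration.

```python
import json

# schema.org is a real vocabulary; the values below are invented examples.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "author": {"@type": "Person", "name": "Jane Doe"},
}


def to_jsonld_script(data):
    """Serialize `data` as a JSON-LD <script> element for an HTML page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_jsonld_script(article)
print(snippet)
```

The resulting element can be placed in a page's HTML, where crawlers that understand JSON-LD can read the statements without parsing the visible text.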
RDF, as a data model, does not add meaning to data, but does provide a way to express relationships. For instance, an RDF triple can communicate that Lansing is the capital of Michigan, but to a computer, without context, this has no meaning. By adding meaning and context, a capital is defined as a type of city, a city is part of a country, and a country is defined as a political entity. This provides the computer with an understanding of the context, although it will not understand it the way humans do.
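The idea can be sketched in plain Python, with tuples standing in for RDF triples. This is an illustration only: the predicate names, the `State` class, and the `types_of` helper are assumptions, and a real system would use URIs and an RDF library rather than bare strings.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# Plain strings stand in for the URIs a real RDF graph would use.
triples = {
    ("Lansing", "capitalOf", "Michigan"),
    # Context: what kinds of things Lansing and Michigan are.
    ("Lansing", "type", "Capital"),
    ("Capital", "subClassOf", "City"),
    ("Michigan", "type", "State"),
    ("State", "subClassOf", "PoliticalEntity"),
}


def types_of(entity, facts):
    """Return the entity's declared types plus every superclass reached
    by following subClassOf links."""
    found = set()
    frontier = [o for s, p, o in facts if s == entity and p == "type"]
    while frontier:
        cls = frontier.pop()
        if cls in found:
            continue
        found.add(cls)
        frontier.extend(
            o for s, p, o in facts if s == cls and p == "subClassOf"
        )
    return found


print(types_of("Lansing", triples))
print(types_of("Michigan", triples))
```

The first triple alone only links two names. With the added class statements, a program can work out that Lansing is a city and that Michigan is a political entity, even though neither fact was stated directly.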
OWL is more expressive than RDF, which lays the foundation that OWL builds on. It supports logical inference, loosely imitating human reasoning, to process and integrate data on the web. OWL includes a number of syntaxes and specifications, and is designed to offer a rich and complex understanding of things, collections of things, and the relationships between these things.
There have been predictions of a Web 3.0, which would incorporate semantic technology, but it doesn’t exist yet, and may be some time in coming.
Semantic Technology, NLP, and Artificial Intelligence
Human language is complicated, and to understand it, there must be an understanding of the grammatical rules, as well as meaning and context. A good understanding of human language also includes slang, colloquialisms, and acronyms.
Natural language processing (NLP) algorithms, combined with semantic technology, allow computers to simulate the ability to understand human language. Modern NLP relies heavily on machine learning and supports a computer’s ability to analyze, understand, and potentially use human language to communicate.
Chatbots and virtual assistants, among the most visible forms of artificial intelligence, have started combining NLP with semantic technology.
In 2018, Microsoft acquired Semantic Machines, which combined semantic technology with NLP machine learning algorithms to provide context for conversations with virtual assistants and chatbots.
Since then, Microsoft has applied techniques and methods from Semantic Machines to its virtual assistant, Cortana, and more specifically to Cortana’s Scheduler, which is used to negotiate meeting times.
It allows users to schedule meetings by speaking normally, such as, “Find a time when Kevin and I can meet for coffee next week.” Cortana’s Scheduler searches for attendee availability and communicates back-and-forth using email. When all is organized, it sends out calendar invitations. Cortana’s Scheduler can also be used to reschedule or cancel meetings.
Knowledge Graphs, Relationships, and Semantic Technology
A knowledge graph (also referred to as a semantic network) is a symbolic representation of real-world objects and events (things, concepts, activities) and their relationships. When a knowledge graph is semantically enriched, additional meaning has been associated with items on the graph.
For example, a node labeled “RPA” might have little meaning by itself. To a software developer, however, it might be recognized as “robotic process automation,” describing software that automatically performs certain administrative tasks.
By adding meaning to the node’s name, it can be assigned relationships with other software and automated services.
A knowledge graph will label the RPA node as software. By aligning the RPA node to a software ontology, a computer begins to understand the object in context with other types of nodes that are also inside the knowledge graph.
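A minimal sketch of that alignment step, using only the Python standard library, might look like the following. The `Node` class, the tiny ontology, and the `InvoiceBot` node are all hypothetical; the sketch also assumes the ontology has no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    types: set = field(default_factory=set)    # ontology classes
    edges: list = field(default_factory=list)  # (relation, target label)


# Hypothetical mini "software ontology": class -> parent class.
SOFTWARE_ONTOLOGY = {
    "RoboticProcessAutomation": "AutomationSoftware",
    "AutomationSoftware": "Software",
}


def align(node, ontology_class, ontology):
    """Attach the class and all of its ancestors to the node.
    Assumes the ontology is acyclic."""
    cls = ontology_class
    while cls is not None:
        node.types.add(cls)
        cls = ontology.get(cls)


graph = {
    "RPA": Node("RPA"),
    "InvoiceBot": Node("InvoiceBot"),
}
align(graph["RPA"], "RoboticProcessAutomation", SOFTWARE_ONTOLOGY)
graph["InvoiceBot"].edges.append(("instanceOf", "RPA"))

print(sorted(graph["RPA"].types))
# ['AutomationSoftware', 'RoboticProcessAutomation', 'Software']
```

Once the node carries the `Software` class, a query for all software in the graph will find it, which the bare label "RPA" would not have allowed.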
In 2018, Ontotext developed an expert knowledge graph (commissioned by NuMedii) using concepts from genomics, disease conditions, drug products, scientific literature, etc. The massive integration and semantic interlinking of medical data helped NuMedii discover knowledge hidden away in documents and find new patterns and correlations. They were able to access information that might otherwise have been inaccessible and forgotten.
Semantic Technology Trends and Deep Learning
Perfect Memory, a French software publisher, has combined semantic technology with deep learning to create an efficient platform that makes data instantly available for modification. Perfect Memory’s software automates the metadata work of collecting, interpreting, and transforming any form of digital content. Their process provides quick, intuitive access to extensive amounts of data and content.
The founder and CEO of Perfect Memory, Steny Solitude, stated:
“After 10 years of R&D, Perfect Memory has successfully industrialized the theory of the semantic web, effectively rendering all data and content intelligible, ultimately returning it to the end-user in a format, language, and context designed to work specifically for that organization.”
In 2018, Perfect Memory provided Eurovision Media Services (and others) with a dynamic microservices platform, called DAM-as-a-Brain.
Their platform automatically gathers media from different sources and processes it using several features: speech to text, facial recognition, named entity recognition, etc. The platform helps Eurovision Media Services manage and profit from their media content in smarter ways.
Deep learning and semantic technology will be used to create smarter forms of artificial intelligence with superior recognition capabilities.
What Semantic Technology Trends Are Coming?
Scientific and medical research will continue to lead the way in using semantic technology as a powerful search engine. While there is a great deal of research and publishing in the scientific and medical communities, few of the articles published are read.
Research articles generally go unread until someone actually needs the information, so a more powerful search engine for finding useful data has great potential for saving lives and for avoiding experiments that have already been performed (or for repeating them with variations that weren’t tried the first time).
Additionally, semantic technologies support the continuing evolution of artificial intelligence, especially in combination with deep learning and natural language processing. Semantic technology can supply background knowledge for AI systems, allowing them to provide more targeted responses.
Expect chatbots and virtual assistants to sound more and more human.