Without a doubt, initiatives such as generative AI (GenAI) and cloud migration have garnered the bulk of attention among influencers and data leaders this year, as organizations tried to determine how, and if, they made sense for their business. This trend looks like it will continue in 2024, as nearly all of Gartner’s top strategic predictions revolve around AI and its impact. Further, the analyst firm comments that by 2027, the productivity value of AI will be recognized as a primary economic indicator of national power.
While the generative AI buzz shouldn’t be ignored, it’s critical, as with any data strategy, that data and analytics professionals be crystal clear on their priorities and aligned with the business’s plans and goals.
In the coming year, organizations will evaluate and modernize data management practices to drive greater business outcomes; determine when to implement AI; better understand the role semantic metadata plays in data fabrics; and accelerate the adoption of knowledge graphs, which will be driven by large language models (LLMs) and the convergence of Labeled Property Graphs (LPG) and the Resource Description Framework (RDF).
In 2024, data and knowledge management trends will include:
1. Organizations will (finally) manage the hype around AI.
As the deafening noise around GenAI reaches a crescendo, organizations will be forced to temper the hype and foster a realistic and responsible approach to this disruptive technology. Whether it’s an AI crisis around the shortage of GPUs, the climate effects of training LLMs, or concerns around privacy, ethics, bias, and governance, these challenges will worsen before they get better, leading many to wonder if it’s worth applying GenAI in the first place.
While corporate pressures may prompt organizations to “do something with AI,” being data-driven must come first and remain the top priority. After all, ensuring foundational data is organized, shareable, and interconnected is just as critical as asking whether GenAI models are trusted, reliable, deterministic, explainable, ethical, and free from bias.
Before deploying GenAI solutions to production, organizations must be sure to protect their intellectual property and plan for potential liability issues. While GenAI can replace people in some cases, there is no professional liability insurance for LLMs. This means that business processes that involve GenAI will still require extensive human-in-the-loop involvement, which can offset any efficiency gains.
In 2024, expect to see vendors accelerate enhancements to their product offerings by adding new interfaces focused on meeting the GenAI market trend. However, organizations need to be aware that these may be nothing more than bolted-on Band-Aids. Addressing challenges like data quality and ensuring unified, semantically consistent access to accurate, trustworthy data will require setting a clear data strategy, as well as taking a realistic, business-driven approach. Without this, organizations will continue to pay a “bad data tax,” as AI/ML models struggle to get past a proof of concept and ultimately fail to deliver on the hype.
2. Knowledge graph adoption will accelerate due to LLMs and technology convergence.
A key factor slowing down knowledge graph (KG) adoption is the extensive (and expensive) process of developing the necessary domain models. LLMs can optimize several tasks, ranging from evolving taxonomies and classifying entities to extracting new properties and relationships from unstructured data. Done correctly, LLMs could lower information extraction costs, as the proper tools and methodology can manage the quality of text analysis pipelines and bootstrap or evolve KGs at a fraction of the effort currently required. LLMs will also make it easier to consume KGs by applying natural language querying and summarization.
The Labeled Property Graph and Resource Description Framework models will also help propel knowledge graph adoption, as each is a powerful data model with strong synergies when combined. While RDF and LPG are optimized for different things, data managers and technology vendors are realizing that together they provide a comprehensive and flexible approach to data modeling and integration. The combination of these graph technology stacks will enable enterprises to create better data management practices, where data analytics, reference data and metadata management, and data sharing and reuse are handled in an efficient and future-proof manner. Once an effective graph foundation is built, it can be reused and repurposed across organizations to deliver enterprise-level results, instead of being limited to disconnected KG implementations.
As innovative and emerging technologies such as digital twins, IoT, AI, and ML gain further mind-share, managing data will become even more important. Using LPG and RDF capabilities together, organizations can represent complex data relationships between AI and ML models and track IoT data to support these new use cases. Additionally, with both the scale and diversity of data increasing, this combination will also address the need for better performance.
As a result, expect knowledge graph adoption to continue to grow in 2024 as businesses look to connect, process, analyze, and query the large-volume data sets that are currently in use.
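To make the complementary nature of the two models concrete, here is a minimal Python sketch that holds a small graph in LPG form (nodes and edges that carry their own properties) and flattens it into RDF-style subject/predicate/object triples. It is an illustration only: the `ex:` identifiers, the `Node` and `Edge` classes, and the `lpg_to_triples` helper are hypothetical, and standard RDF reification is just one of several possible ways to carry edge properties into RDF.

```python
from dataclasses import dataclass, field

# RDF view: every fact is a (subject, predicate, object) triple.
Triple = tuple[str, str, str]


@dataclass
class Node:
    """LPG node: an identifier, one label, and key/value properties."""
    id: str
    label: str
    props: dict[str, str] = field(default_factory=dict)


@dataclass
class Edge:
    """LPG edge: a typed, directed link that can carry its own properties."""
    source: str
    type: str
    target: str
    props: dict[str, str] = field(default_factory=dict)


def lpg_to_triples(nodes: list[Node], edges: list[Edge]) -> list[Triple]:
    """Flatten an LPG into triples (one illustrative mapping, not a standard)."""
    triples: list[Triple] = []
    for node in nodes:
        # The node label becomes an rdf:type statement.
        triples.append((node.id, "rdf:type", node.label))
        for key, value in sorted(node.props.items()):
            triples.append((node.id, key, value))
    for i, edge in enumerate(edges):
        # The edge itself is a plain triple.
        triples.append((edge.source, edge.type, edge.target))
        if edge.props:
            # Edge properties have no direct triple form, so describe the
            # statement with a blank node using the RDF reification vocabulary.
            stmt = f"_:edge{i}"
            triples.append((stmt, "rdf:type", "rdf:Statement"))
            triples.append((stmt, "rdf:subject", edge.source))
            triples.append((stmt, "rdf:predicate", edge.type))
            triples.append((stmt, "rdf:object", edge.target))
            for key, value in sorted(edge.props.items()):
                triples.append((stmt, key, value))
    return triples


# Hypothetical example: an IoT sensor feeding an ML model.
nodes = [
    Node("ex:sensor1", "ex:Sensor", {"ex:location": "Plant A"}),
    Node("ex:model1", "ex:MLModel"),
]
edges = [Edge("ex:sensor1", "ex:feeds", "ex:model1", {"ex:since": "2023"})]
triples = lpg_to_triples(nodes, edges)
```

The LPG form keeps the `ex:since` property directly on the edge, which is convenient for traversal and analytics, while the triple form spells every fact out in a uniform shape that is easier to share and integrate across systems.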
3. Data fabric will come of age and will employ semantic metadata.
Good decisions rely on shared data, especially the right data at the right time. The challenge is that the data itself often raises more questions than it answers. This trend will worsen before it improves, as disjointed data ecosystems with disparate tools, platforms, and disconnected data silos become increasingly challenging for enterprises. This is why the concept of a data fabric has emerged as a method for enterprises to better manage and share their data.
A data fabric’s holistic goal is to bring together the data management tools needed across the full data lifecycle: identifying, accessing, cleaning, enriching, transforming, governing, and analyzing data. It is a tall order, and one that will take several years to mature before adoption happens across enterprises.
Current solutions were not fully developed to deliver all the promises of a data fabric. In the coming year, organizations will incorporate knowledge graphs and artificial intelligence for metadata management to improve today’s offerings, and these capabilities will be a key criterion for making them more effective. Semantic metadata will serve as an enabling factor for decentralized data management, following the data mesh paradigm. It will also provide formal context about the meaning of data elements that are governed independently, serving different business functions and embodying different business logic and assumptions. Additionally, these solutions will evolve to incorporate self-learning metadata analytics that identify data utilization patterns in order to optimize and automate access to domain-specific data through data products.
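As a rough illustration of what "formal context about the meaning of data elements" can look like in practice, the Python sketch below annotates fields from independently governed data sets with a shared concept and a unit, then uses those annotations to find equivalent fields and to flag conflicting assumptions. All data set names, column names, `ex:` concepts, and helper functions are hypothetical; a real data fabric would typically keep this metadata in a catalog or knowledge graph rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical semantic metadata: each domain-owned field is annotated with
# the shared business concept it represents and, where relevant, its unit.
FIELD_METADATA = {
    ("sales.orders", "cust_id"): {"concept": "ex:Customer", "unit": None},
    ("support.tickets", "client_ref"): {"concept": "ex:Customer", "unit": None},
    ("sales.orders", "amt"): {"concept": "ex:OrderTotal", "unit": "USD"},
    ("finance.invoices", "total_eur"): {"concept": "ex:OrderTotal", "unit": "EUR"},
}


def fields_by_concept(metadata):
    """Group fully qualified field names by the concept they are mapped to."""
    index = defaultdict(list)
    for (dataset, column), meta in metadata.items():
        index[meta["concept"]].append(f"{dataset}.{column}")
    return {concept: sorted(fields) for concept, fields in index.items()}


def unit_conflicts(metadata):
    """Return concepts whose fields are recorded in more than one unit."""
    units = defaultdict(set)
    for meta in metadata.values():
        if meta["unit"] is not None:
            units[meta["concept"]].add(meta["unit"])
    return {c: sorted(u) for c, u in units.items() if len(u) > 1}


same_meaning = fields_by_concept(FIELD_METADATA)
conflicts = unit_conflicts(FIELD_METADATA)
```

Here the differently named `cust_id` and `client_ref` columns resolve to the same concept, and the two order-total fields are flagged because they embody different currency assumptions, which is the kind of context that lets independently managed domains share data safely.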
Data security, access, governance, and bias issues continue to routinely impact daily business, and with generative AI getting so much attention, organizations will look to leverage a data fabric powered by semantic technologies to lower cost of ownership and operating costs, while improving data sharing and trust.