Analysts such as IDC and Deloitte estimate that up to 80% of the world’s data is unstructured text, which makes extracting valuable insights from it a huge challenge. Worse, customers can’t easily find answers to their product- and service-related questions because those answers are buried in large volumes of support documents. Employees, meanwhile, can spend 20% of their work-related time looking for information stored in various internal systems, and as much as 95% of organizational data is never accessed again three months after creation. Many companies have implemented taxonomies and ontologies in an effort to create some structure that can be processed by machines. Yet the sheer amount of data makes it challenging to create and maintain a complete and effective taxonomy and ontology for all but the simplest use cases.
Recognizing the immense value that is being left on the table, organizations in 2023 will apply practical methods to reduce or avoid the need to create these taxonomies or ontologies to make unstructured data searchable. These teams will shift to machine learning and natural language search tools that don’t rely on heavy data labeling, model training, and complex ontologies to find relevant information across all structured and unstructured sources, removing the overhead associated with those AI projects and accelerating the time to production. In 2023, organizations will need to rethink how they can leverage unstructured data to improve productivity, increase customer satisfaction, and quickly realize return on AI investments. As a result, expect to see the following trends in the coming year:
1. The world will reach the era of “peak data scientist”
The shortfall of data scientists and machine learning engineers (MLEs) has always been a bottleneck for companies trying to realize value from AI. Two things have happened as a result: (1) more people have pursued data science degrees and accreditation, increasing the number of data scientists; and (2) vendors have come up with novel ways to minimize the involvement of data scientists in the AI production rollout.
The convergence of these two waves yields “peak data scientist”: with the advent of foundation models, companies can build their own applications on top of those models rather than each training its own models from scratch. Less bespoke model training requires fewer data scientists and MLEs at the same time that more are graduating. In 2023, expect the market to react accordingly, resulting in an oversaturation of data scientists.
2. The AI industry will offer more tools that can be operated directly by business users
Companies have been hiring more and more data scientists and MLEs, but net AI adoption in production has not increased at the same rate. While a lot of research and trials are being executed, companies are not benefiting from production AI solutions that can be scaled and managed easily as the business climate evolves.
In the coming year, AI will start to become more democratized, so that less technical people can directly use tools that abstract away the machine learning complexity. Knowledge workers and citizen “data scientists” without formal training in advanced statistics or mathematics will extract high-value insights from data using these self-service tools, which allow them to perform advanced analytics and solve specific business problems at the speed of the business.
3. Chatbots will chat less and answer questions more
Humans don’t want to spend more time interacting with machines as if they were talking to people; they really just want their questions answered quickly and efficiently from the start, without lengthy wait times or having to choose from a myriad of options. Although many chatbots accurately execute the specific tasks they were designed to do, they fall far short of end-user expectations because they rarely answer the users’ actual questions.
In 2023, organizations will finally be able to complement chatbots with natural language search capabilities. Because natural language search understands human language and can process unstructured text-based data (documents, etc.), individuals can phrase questions using their own words – as if they were speaking to a person – and receive all the relevant answers back instantly.
4. Line-of-business leaders will take matters into their own hands
Twenty years ago, companies had two choices in the CRM space: they could pay millions for a Siebel Systems CRM or they could pay a fraction of that amount monthly on a per-user basis … which ushered in the cloud era. The same thing is happening now for business users when it comes to AI.
In 2023, if the use case provides exceptional value, business users will decide whether it makes sense to hire expensive and difficult-to-recruit data scientists and MLEs, label thousands of data points, train and re-train models over months, and repeat this process as the underlying data changes. If the value of the AI project does not justify that significant upfront and ongoing cost, the organization will instead find a vendor who can remove all the complexity for business users.
5. Businesses will finally benefit from their unstructured data
Organizations struggle to extract relevant insights when they search for answers in text data, mainly because the search tools they are using are not capable of effectively and efficiently processing unstructured data.
Recognizing the immense value that is being left on the table, organizations in 2023 will apply practical methods to dramatically improve efficiency and unlock the value that has been elusive for so long. Remote and hybrid work have exacerbated the pain of unsatisfying search outcomes because so many employees work from their own locations and access information at different hours, making information sharing within an organization a major challenge. You can’t simply turn to the colleague sitting next to you for answers whenever you need them. In the coming year, expect to see employees turning to natural language search tools to find relevant information across all structured and unstructured sources.