The enterprise lives in a world of data, and the desire – the need – to analyze that data for business success is paramount. As businesses march into 2017, what will be at the top of their Data Analytics and Business Intelligence agenda so that they can realize their goals? How does that reflect upon their work in this area in the last year?
We explored these issues with Ajay Anand, VP of Products at Kyvos Insights and co-founder of Datameer; Ihab Ilyas, co-founder of Tamr and Professor of Computer Science at the University of Waterloo; Thomas C. Redman, the Data Doc and President of Data Quality Solutions; Nova Spivack, CEO and co-founder of enterprise intelligence company Bottlenose; Michael Stonebraker, Ph.D., co-founder and CTO of Tamr, and recipient of the 2014 A.M. Turing Award; and Anand Venugopal, Head of Product, StreamAnalytix, Impetus Technologies.
A Look Back
Some interviewees chose to discuss issues and events of the recent past that may influence how businesses’ Data Analytics efforts move forward in 2017.
A number of trends that have been developing for decades converged further in 2016, says Spivack: the growth, availability, and affordability of massively parallel computing and massive storage, along with a corresponding massive growth of data. His company has been working to help turn that data into useful, actionable insights. The pile of data particularly includes new kinds of data – like video, tweets, and IoT data – and 80% of it is unstructured.
Spivack discussed efforts such as heavy equipment maker Caterpillar’s deal with Uptake in 2015 to use that startup’s platform, which helps industries leverage the data they collect from sensors and gauges on their assets, to analyze and predict when customers’ machines need repairs and replacement parts. Such cutting-edge Data Analytics projects will increasingly become the norm, he notes.
Interesting insights are hidden in unstructured data: “Within a text field, for example, there may be 100 different, important things to know,” Spivack says. Of course, it’s hard to talk about unstructured data in all its permutations without talking about Apache Hadoop, which Ajay Anand points out reached the 10-year mark in 2016. The industry continues to see a maturing of its ecosystem and its readiness for enterprise deployment, he says. “We saw more focus on making the data in the data lake available and accessible to business users, not just Data Scientists,” he says.
Redman notes that companies have also started to focus more on the ties that bind Data Quality to Data Analytics, pointing to the growing interest in the concept of the Data Provocateur (see 2017 Trends in Data Strategy for more information). “I don’t know how a lot of executives have been running their companies with the data they have,” he says. Data Quality isn’t an end in and of itself, but a prerequisite for good Analytics. A front-and-center cause of the 2008 financial crisis, he reminds us, was bad mortgage-related data that affected analysis of loan risks.
“The point [of Data Quality] is to make an organization work better, to make it so [the business] has data it can trust going forward,” Redman says.
Data Analytics Trends in 2017
The expectation going into 2017, as Anand sees it, is for the industry to continue working to ensure that business users will not have to go through a steep learning curve to access Big Data for their Analytics requirements. Kyvos, for instance, offers technology that lets users keep their existing desktop BI tools, he says, and get the high-performance, interactive access that they are used to. (See our recent article about Kyvos’ scalable, Self-Service Analytics technology.)
Solutions that address the missing link – that is, closing the last mile with the non-technical business user – will be the light at the end of the tunnel for businesses like the 50% of companies that Gartner reports still are struggling to get value from Hadoop, Anand notes.
At the low end, easy-to-use Analytics Toolkits will come into their own, says Stonebraker. But at the high end, “where performance and scalability matters, [analytics] will continue to be ‘rocket science’ for another few years,” says Stonebraker, whose company is focusing on helping companies with spend analytics for strategic sourcing. Rocket science needs rocket Data Scientists, but don’t expect the talent shortage that has bedeviled businesses to be resolved next year.
“There will continue to be a shortage of qualified Data Scientists,” Stonebraker says. “I don’t expect the market to be in equilibrium until 2019 at the earliest.”
On the plus side, Stonebraker predicts, “the first billion dollar ROI on a Data Integration project will be realized.” That ROI may be the result of integrated data showing its value in the Big Data/Predictive Analytics realm, helping businesses make better strategic calls about things like what products to launch. Ilyas, Stonebraker’s fellow co-founder, adds to the Data Integration discussion that “companies will have initiatives on how to securely share data sets across silos. This is a gap in the market and several companies will emerge to tackle enterprise data brokerage and sharing.”
It’s also his take that “Data Analytics will go vertical (financial, medical, etc), and companies that build vertical solutions will dominate the market.” General-purpose Data Analytics companies, he thinks, will start disappearing, while “vertical Data Analytics startups will develop their own full-stack solutions to data collection, preparation, and analytics.”
Venugopal at Impetus, which is focused on creating new ways of analyzing data for businesses, believes that next year – and in 2018, for that matter – Streaming Analytics will become a default enterprise capability, enabling the real-time enterprise. Instead of analyzing data in batch mode once or twice a day, enterprises will do it on the order of seconds to gain real-time insights and take opportunistic actions.
“Overall, enterprises leveraging the power of real-time streaming analytics will become more sensitive, agile and gain a better understanding of their customers’ needs and habits to provide an overall better experience,” he says.
Consider it the next big step to help companies gain a competitive advantage from their data, Venugopal explains. In terms of the technology stack to achieve this, adoption of open source streaming engines, such as Spark Streaming and Flink, will accelerate, in tight integration with the enterprise Hadoop Data Lake. “That will increase the demand for tools and easier approaches to leverage open source in the enterprise,” he says.
In addition, last year brought an increased emphasis on dealing with security issues as more groups within an organization want access to the business’s data for analysis, Anand says. Overall, he says, the emphasis on security and governance will continue as enterprises move their critical business analytics operations to Big Data infrastructures. “We are on the cusp of really delivering on the promise of democratizing Big Data,” he says.
Perhaps another piece of that promise will be fulfilled as the trend toward Cloud-based deployments grows. “We expect to see adoption of Cloud infrastructures for Big Data accelerate over the next year,” Anand says. “We see a mix of deployments on AWS, Microsoft Azure, as well as Google Cloud.” Some of Kyvos’ customers view Hadoop and Spark as a way to provide a service layer on these cloud infrastructures that is not vendor specific, he notes, which serves as insurance against vendor lock-in.