Big Data has become one of the most transformative IT developments of the decade with unprecedented amounts of granular data being generated and collected for analysis across a wide range of sources. Virtually every business recognizes the intrinsic value and insights Big Data represents when it comes to fine-tuning products and operations and understanding target audiences to maximize revenue, profits and cost-savings.
The problem is, this abundance of data can also be a double-edged sword. Despite its potential—and the abundance of tools, platforms and solutions that promise to make it useful—the ability to find and act on truly valuable insights still eludes many companies. Even with massive computing power and virtually infinite cloud storage capacity, most businesses still find it slow and difficult to uncover legitimate insights they can act on. In fact, according to Forrester, nearly 60 percent of technology and business decision-makers indicated that it takes months to years for technology management to fulfill complex new business intelligence (BI) requests.
Why? In my view, it’s because the tools and products that have emerged are based on “small data” architecture—solutions we’ve attempted to morph into filling the demands of Big Data. They’re antiquated, siloed and poorly designed for integrating and managing large amounts of data because of their application-centric approach. That means data is collected, housed and managed in separate applications, often in a proprietary format, rendering it inaccessible for analysis through other applications without sinking a tremendous amount of resources, time, effort and money into making these separate silos work together.
The root of the problem is that, in most IT organizations, the integration and data management functions have historically operated in entirely separate silos, despite their intimate dependency. While integration moves and transforms data between disparate systems, data management performs the extract, transform and load (ETL) functions for the purposes of cleansing, consolidating, governing and harmonizing the data. Considering that integration and data management are so tightly interwoven and must work with the same varied data sources and APIs, it's remarkable that they have been treated as two separate sides of the house for so long.
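To make that interdependence concrete, here is a minimal ETL sketch in Python. The source systems, field names and in-memory SQLite store are hypothetical stand-ins, not any particular product's API; the point is simply that the sources integration connects are the same ones data management must cleanse and harmonize into a shared schema.

```python
# Minimal ETL sketch: the same hypothetical source systems that integration
# moves data between are the ones data management must cleanse and harmonize.

import sqlite3

def extract(crm_rows, erp_rows):
    """Pull raw records from two hypothetical source systems."""
    return [("crm", r) for r in crm_rows] + [("erp", r) for r in erp_rows]

def transform(tagged_rows):
    """Cleanse and harmonize records into one shared schema."""
    harmonized = []
    for source, row in tagged_rows:
        harmonized.append({
            "source": source,
            "customer": row.get("customer_name", row.get("account", "")).strip().title(),
            "amount": float(row.get("amount", 0) or 0),
        })
    return harmonized

def load(conn, records):
    """Persist the harmonized records to a central store."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (source TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (:source, :customer, :amount)", records)
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    crm = [{"customer_name": "  acme corp ", "amount": "1200"}]
    erp = [{"account": "Globex", "amount": 850}]
    load(conn, transform(extract(crm, erp)))
    print(conn.execute("SELECT * FROM sales").fetchall())
```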
In order to realize the full potential of the Big Data promise and succeed in a data-driven future, it's critical that integration and data management be brought together in a data-centric (rather than app-centric) approach. It calls for a new model—Data Platform as a Service or dPaaS—that leverages the power of the cloud to fully unify the integration and data management functions into a single, cohesive system. With dPaaS, data from every application is stored in a single repository. Integration operates as a fully managed service, with self-service data management tools that provide access to the raw data in the repository within governance guidelines. This unique marriage gives companies the power to arrive at insights faster and more accurately than ever before, by more easily leveraging the collective data from, within and across a wide range of applications. At the same time, businesses retain complete flexibility and freedom to use the data as they see fit to solve current business problems and explore new opportunities.
Why is the integrated dPaaS approach so critical to Big Data success? There are four key reasons.
- It eliminates redundancies and errors. The number of applications and data sources has surged dramatically, and as more data is generated across more touchpoints, the risk of duplication, redundancy and error grows with it. This not only creates more “dirty” data, but also makes accurate analysis very difficult, if not impossible. Rather than spending time analyzing results, extremely talented (and highly paid) data scientists must first take on janitorial duty, cleaning the data to eliminate duplicate, incomplete or erroneous records. This is certainly not the best use of their time or skill. A dPaaS solution ensures that as new records are created, they are matched against and merged with existing ones in the central repository, and that each field holds the correct and appropriate data (a simplified sketch of this match-and-merge step follows this list). The result is a single, rich, complete picture for each record, rather than multiple, incomplete records for the same information. Unlike conventional integration, which just moves data from one system to another and leaves it in its then-current state with no regard for the future, incorporating data management means the data is captured, cleansed and persisted (along with its metadata) to a central repository, extractable in a schema that suits the next use case.
- It accelerates time to insights. Not only does the unified approach save time in data cleaning, it also enables real-time streaming data processing in conjunction with storage for longitudinal analysis. Centrally stored data that’s accessible via a wide range of apps provides a level of data agility and verified quality that enables split-second decisions to be made confidently to capitalize on opportunities and defuse market threats. With conventional app-based integration, the process is time-consuming, convoluted and onerous to the point that, by the time the answers are found, the opportunity or threat may very likely have already passed. Unification eliminates the struggle, making streaming data processing and analysis a reality.
- It enables the discovery of interesting cross-system connections and correlations. Siloed systems limit analysis to the data contained within those systems, but there could be much more to discover by leveraging the data across them. Let’s say a manufacturing company uses a CRM tool to capture leads, a financial system to manage customer accounts, an ERP system to orchestrate the supply chain, product planning and plant floor operations, plus an inventory system to manage product SKUs. When new leads convert to customers, a complex integration protocol “connects” these systems, adding a new record to the financial software, launching the production plan in the ERP and matching the right product SKUs from the inventory management software. This classic integration approach focuses purely on moving and syncing data across different systems and applications. But, what if the sales team wants to measure lead conversion over the last month? Perhaps the CEO needs to report to the board about which products drove the most or least revenue. Or the marketing team needs to know which SKUs perform better in certain geographic territories to tailor their efforts. All valuable insights, right? But, under the conventional integration scheme, getting to these answers would require a whole new set of tools to extract the data from the various systems, then cleanse, harmonize and, finally, run the analysis. These conventional tools overlap considerably in functionality (a waste of IT investment), and they require two separate IT teams. With the integrated dPaaS approach, the data is all there, already cleansed, and can be compared, contrasted, correlated and analyzed in virtually any configuration your business needs (a simple query sketch follows this list).
- It future-proofs your Big Data efforts. Implementing a unified structure, with a vital repository of clean, quality data that can be modeled, analyzed and visualized in virtually any scenario, makes it faster and easier to add any number of new data sources and applications to the mix. That means that as your business needs change, you can quickly evolve your technology stack without requiring a monumental and expensive effort to integrate disparate data structures and systems. This gives businesses a more flexible, robust platform, ready to take on the next generation of data collection and analytics tasks that haven’t even been conceived of yet.
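To illustrate the match-and-merge behavior described in the first point above, here is a simplified sketch. The matching key (a normalized email address) and the merge rule (keep existing values, fill gaps from the new record) are assumptions made for illustration, not a description of any particular dPaaS product.

```python
# Simplified match-and-merge sketch for the "eliminates redundancies and errors"
# point. The key (normalized email) and merge rule are illustrative assumptions.

def normalize_key(record):
    """Hypothetical match key: lower-cased, trimmed email address."""
    return record.get("email", "").strip().lower()

def merge(existing, incoming):
    """Field-level merge: keep existing values, fill gaps from the new record."""
    merged = dict(existing)
    for field, value in incoming.items():
        if value not in (None, "") and not merged.get(field):
            merged[field] = value
    return merged

def upsert(repository, record):
    """Match a new record against the central repository and merge, not duplicate."""
    key = normalize_key(record)
    repository[key] = merge(repository.get(key, {}), record)
    return repository[key]

repo = {}
upsert(repo, {"email": "Jane@Example.com", "name": "Jane Doe", "phone": ""})
upsert(repo, {"email": "jane@example.com ", "phone": "555-0100"})
print(repo["jane@example.com"])  # one complete record instead of two partial ones
```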
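And to illustrate the third point, here is a sketch of how those cross-system questions reduce to simple queries once the CRM, financial and inventory data already live, cleansed, in one repository. The table and column names are hypothetical.

```python
# Sketch of the cross-system questions above, assuming the data from formerly
# siloed systems sits cleansed in one central store. Schema is hypothetical.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads  (id INTEGER, created TEXT, converted INTEGER, territory TEXT);
    CREATE TABLE orders (lead_id INTEGER, sku TEXT, revenue REAL);

    INSERT INTO leads  VALUES (1, '2016-05-02', 1, 'West'), (2, '2016-05-10', 0, 'East');
    INSERT INTO orders VALUES (1, 'SKU-100', 4200.0);
""")

# Lead conversion for the month: one query, no new extraction tooling.
conversion = conn.execute(
    "SELECT AVG(converted) FROM leads WHERE created >= '2016-05-01'"
).fetchone()[0]

# Revenue by product SKU and territory, joined across what used to be silos.
by_sku = conn.execute("""
    SELECT o.sku, l.territory, SUM(o.revenue)
    FROM orders o JOIN leads l ON l.id = o.lead_id
    GROUP BY o.sku, l.territory
""").fetchall()

print(conversion, by_sku)
```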
The siloed, app-centric model clearly can no longer handle the variety, velocity and volume of data that companies now have at their disposal. This gap has created redundancies, delays and potential errors that hinder data analysis, and the problem is magnified exponentially as new data sources and applications are added to the mix. Instead of delivering useful insights, conventional integration and data management have become so complex and overwhelming that making swift, accurate decisions is impossible, despite substantial investment in tools and technology that promised to help.
To realize the full potential of the Big Data promise, we must combine integration and data management into a unified solution, giving businesses the ability to not only solve the current problems with data analysis, but also tackle future problems and opportunities as they evolve. To achieve that goal, dPaaS can deliver the instant insights that enable real-time decisions and the agility to capitalize on the next wave of Big Data opportunity.