The Ongoing Debate Between Centralized and Distributed Data

Click to learn more about author Kevin W. McCarthy.

Though Bon Jovi is no longer the band they were when I was a teenager, they definitely hit the nail on the head with the lyric, “the more things change, the more they stay the same” from their 2010 Greatest Hits album. This rings true for how many organizations approach the choice between storing data in a centralized location and distributing it across multiple repositories throughout the organization. For years, the market has fluctuated between favoring larger, centralized storage and preferring smaller, distributed repositories. The reasons are justified in both directions, as the pendulum swings from IT control to business control (and back again).

Back in the day, this tension played out in the debate between mainframes and personal computers. After that, we saw the tides shift between Data Warehousing and Data Marts. Today, the same debate plays out between a large central store, like a Data Lake, and distributed data systems that can differ by department or business unit. In each of these cases, we saw the move from a monolithic, central repository to more customizable, dispersed sources. In this ongoing debate, IT tends to favor the control of a more centralized approach, while business users prefer the modular method because it gives them greater ability to access and manipulate the data in their own ways.

This tug-of-war between IT and business users over control of the data is circular and endless. Consolidating data can make it too constrained and impede business processes, while distributing data can leave it too fragmented and lead to lost data and disorganization. Whenever control shifts too far in either direction, it hits a tipping point. That’s why we continue to see the same pattern repeating itself time and again. And yet, it is a natural debate, as each approach has advantages and drawbacks.

However, today’s debate presents new challenges. Organizations are dealing with huge volumes of data that are only getting bigger with the Internet of Things (IoT). The temptation is to throw all this information into a Data Lake. Rather than providing value, these Data Lakes run the risk of becoming a dumping ground where the most crucial pieces of information can easily get lost.

The complexity of data is also changing. We used to deal mostly with structured data, and while that is still the primary data we use, we are storing more unstructured data in the hopes of tapping into that resource for more detailed analysis and insights.

Finally, we have more people who want to access data than ever before. The business has always been interested in what data can provide, but the speed at which it needs those insights is now near real-time. To accomplish many of the digital objectives organizations have today, they need to be able to access and analyze data at lightning speed.

With those new challenges, IT and business users have to strike the right balance between governance and control on one side and accessibility and customization on the other. It comes down to corporate control versus user control. A centralized approach has its merits: it lends itself to efficiency and allows for greater visibility and control over the quality of the data. It often will not meet business needs, however, in today’s fast-paced environment, which is why business users tend to want more access. In a recent survey we conducted at Experian, 96 percent of business stakeholders said they wanted greater access to data. They want the ability to view and manipulate data in more creative ways that suit their specific needs.

Regardless of which makes more sense for your business—a consolidated approach, or smaller data repositories dispersed throughout the organization—the approach is really just about the technologies and platforms used to store and process the data. The most important aspect to keep in mind is that you have accessible, high-quality information that is ready to suit the needs of the business stakeholders who will use it to make key decisions and identify opportunities. The best approach, or combination of styles, for you is whatever will allow your organization to work nimbly and efficiently, while still maintaining control to ensure all data used to power decisions meets certain standards and Data Quality thresholds. Figure out what that looks like for you, and as Bon Jovi sang in “Livin’ on a Prayer”: “woah, [you’re] halfway there.”
