Data Governance: The Oft-Overlooked Pillar of Strong Data Quality

By Daniel Zagales

What are the most common causes of Data Quality issues? The conventional answer includes problems like inaccurate data, duplicate data, or data with missing values. These are the issues that organizations tend to focus on when they want to improve the quality of their data so that they can leverage it more effectively, and they are indeed common sources of Data Quality problems. However, they’re really only surface-level issues. If you truly want to improve the quality of your data, you need to dig beneath the surface to get at the root cause: poor Data Governance. Until you do so, you’ll only be treating the symptoms of Data Quality challenges rather than curing the underlying problem.

Here’s why Data Governance should be at the center of every organization’s Data Quality strategy, along with tips on how to establish an effective Data Governance operation.

Ungoverned Data Leads to Low-Quality Data

The typical organization today recognizes the value of Data Quality because it knows that if you put garbage into your data analytics tools and processes, you’ll get garbage back out, to borrow an overused metaphor. That’s why many businesses have solutions in place to validate the data that they manage for issues like missing or inaccurate information.
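
As a rough illustration of the kind of validation described above, the following Python sketch flags missing values and duplicate records in a small dataset. The field names (`customer_id`, `email`), the `REQUIRED_FIELDS` tuple, and the `find_quality_issues` helper are hypothetical, chosen only for this example; a real validation tool would apply whatever rules the organization defines.

```python
from collections import Counter

# Hypothetical required fields, assumed for illustration only.
REQUIRED_FIELDS = ("customer_id", "email")


def find_quality_issues(records):
    """Report missing required values and duplicated customer IDs."""
    missing = []
    for index, record in enumerate(records):
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            # Treat absent, None, and blank values as missing.
            if value is None or str(value).strip() == "":
                missing.append((index, field))

    # Count how often each non-missing ID appears to detect duplicates.
    id_counts = Counter(
        record.get("customer_id")
        for record in records
        if record.get("customer_id") is not None
    )
    duplicates = sorted(key for key, count in id_counts.items() if count > 1)
    return {"missing": missing, "duplicates": duplicates}


records = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": ""},
    {"customer_id": 1, "email": "a@example.com"},
]
issues = find_quality_issues(records)
```

Checks like these can only run against datasets that someone has pointed them at, which is the limitation the rest of this article addresses.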

The problem with this approach is that it only works for the data that the organization knows about. It doesn’t address what’s known as ungoverned data: data that isn’t managed in conformance with the organization’s data standards.

Ungoverned data typically originates because someone in the organization, such as a data analyst, produces data without establishing strong governance policies to control how the data will be shared, secured, and maintained over time. Data producers may also lack an understanding of the underlying data they are working with and the business rules for aggregating or disseminating various KPIs related to the data.

The decision to share data without first sorting out these issues is not usually the result of malfeasance. On the contrary, ungoverned data often emerges because someone in the organization creates data that other people need, and in the rush to let them start using it, sharing begins before anyone puts a plan in place for governing the data over the long term.

It doesn’t help, either, that it’s common for businesses to have IT solutions in place that make it easy to share data, but not to govern it. The typical IT organization implements software that can store and distribute information across an organization, but the IT department has little or no knowledge of how different business units will use the data. As a result, it struggles to determine how data needs to be governed or monitored. It’s only through collaboration between stakeholders from across the organization that effective Data Governance policies can be established and implemented.

Too often, that collaboration doesn’t happen before people begin sharing data, and ungoverned data is the result. From there, you get low-quality data. If you don’t establish clear policies about how data needs to be governed and managed, you’re likely to end up with information that is inaccurate, unreliable, and difficult to analyze efficiently because no one knows which policies to follow to avoid introducing quality problems into the data.

Making Ungoverned Data Governable

The solution to ungoverned data – and, by extension, to the Data Quality issues it breeds – is simple enough: Businesses need to establish solutions that make their ungoverned data governable.

Doing so requires a combination of people, processes, and technologies, including:

  • Audit logs: Establishing a robust audit log for all of the data that your organization produces ensures that you have visibility into who is using the data and whether they modify, extend, or delete any of it. In turn, you can determine whether any data access events violate your organization’s Data Governance standards.
  • Data catalog: Data catalogs record which data your organization owns, as well as who is responsible for managing each data asset, what the asset’s purpose is, and its data lineage – that is, how the data has evolved over time. With a data catalog, you can quickly determine how data is supposed to be used and identify deviations from established policies.
  • Data reclamation processes: In addition to tracking data through audit logs and catalogs, businesses should establish processes for taking ownership of ungoverned data. Doing so involves answering questions like who will be responsible for managing the data, where it will be stored and which metrics the data team will track to ensure that the data meets whichever quality standards the organization must adhere to.
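
To make the relationship between these three pieces more concrete, here is a minimal Python sketch, not a reference to any particular governance product. It assumes a catalog and an audit log held as simple in-memory lists; the class names (`CatalogEntry`, `AuditEvent`), the helper functions, and the asset names are all hypothetical. The idea is that any asset showing up in the audit log without a catalog entry is a candidate for reclamation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CatalogEntry:
    """One governed data asset: what it is, who owns it, where it came from."""
    asset: str
    owner: str
    purpose: str
    lineage: list = field(default_factory=list)


@dataclass
class AuditEvent:
    """One recorded access to a data asset."""
    asset: str
    user: str
    action: str  # e.g. "read", "modify", "extend", "delete"
    timestamp: datetime


def find_ungoverned_assets(catalog, audit_log):
    """Return assets that appear in the audit log but have no catalog entry."""
    governed = {entry.asset for entry in catalog}
    accessed = {event.asset for event in audit_log}
    return sorted(accessed - governed)


def reclaim(catalog, asset, owner, purpose):
    """Bring an ungoverned asset under governance by assigning an owner."""
    entry = CatalogEntry(asset=asset, owner=owner, purpose=purpose)
    catalog.append(entry)
    return entry


# Hypothetical example data.
now = datetime.now(timezone.utc)
catalog = [
    CatalogEntry("sales.orders", "data-platform", "Order reporting", ["crm.export"]),
]
audit_log = [
    AuditEvent("sales.orders", "alice", "read", now),
    AuditEvent("scratch.kpi_rollup", "bob", "modify", now),
]

ungoverned = find_ungoverned_assets(catalog, audit_log)
reclaim(catalog, "scratch.kpi_rollup", owner="analytics", purpose="Weekly KPI rollup")
remaining = find_ungoverned_assets(catalog, audit_log)
```

In this sketch, `ungoverned` initially contains the one asset that was shared without a catalog entry, and `remaining` is empty once that asset has an owner and a stated purpose. A real reclamation process would also settle storage location and quality metrics, as described above.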

When you have each of these solutions in place for all of the data that your organization owns, you can effectively govern all of it – including any data that initially sidesteps governance processes, but which you can reclaim using the strategy described above.

In turn, you’ll be in a much stronger position to ensure the quality of all of your data – not to mention its security and stewardship. You will be able to worry much less about data that ends up containing inaccurate, duplicate, or missing information because you’ll be able to manage it centrally.

Conclusion: Data Quality Starts with Data Governance

The foundation for Data Quality is Data Governance. Until you ensure that all of your data is governed according to the standards that your organization sets, you’ll be fighting an uphill battle against Data Quality challenges because you’ll be remediating issues like inaccurate or missing information without addressing their root cause.

So, don’t settle for Data Quality solutions alone. Strive for a strategy that allows you to master Data Governance, which in turn places you in the strongest possible position to improve the quality of your data.