Kevin Shannon, Global Head of Enterprise Data Governance at Dun & Bradstreet (D&B), shared his views on how to respond to current and future developments in Data Governance at the DATAVERSITY® Enterprise Data Governance Online event, during his presentation titled Trends in Data Governance.
The Interconnection of Value & Reality
In the next few years, Shannon sees a shift from Data Governance providing control to it creating added value for the business. At D&B, “Before we do anything as an Enterprise Data Governance (EDG) organization, we make sure that we fully understand the business value of what we’re proposing.” Shannon shared a slide illustrating how Data Governance can connect value and reality.
Principles such as ‘transparency and accountability’ or ‘non-bureaucratic governance processes’ are applied in areas like Data Stewardship or data vendor management to achieve a competitive advantage and enable innovation.
“This is all critical in the context of making a Data Governance program successful,” Shannon said. He places this process in the context of trends because many Data Governance organizations still focus on policies and compliance with those policies but are not yet engaged closely enough with the business to provide measurable business value. He suggests working with the business to understand why it operates the way it does and to ensure that people can understand, agree with, and comply with any policies created. “That is what will achieve measurable business value in the end,” he said.
Providing Value: Be the Glue
The role of Data Governance is to sit in the middle of business, legal, and technology and to understand their differing perspectives, Shannon noted. Data Governance needs to ensure that the tech department is capable of carrying out policies around data. “We’re that glue that brings it together.”
Trust and Confidence in the Data: A Competitive Advantage
Data Governance can play a significant part in the RFI and RFP process by understanding what customers are asking and providing information about governance processes already in place so customers feel comfortable.
Shannon suggests becoming engaged with customers early in the process:
“We guide folks through these waters in terms of regulations and such to make sure that we come out on the right side of things.”
Bring Data Governance to the front office so it can be seen as sharing goals with the business, he said. “I can tell you straight up that customers are delighted to hear from the enterprise Data Governance team.” In several cases, Shannon’s team has been asked to advise customers about their own GDPR preparations, which shows the value that an enterprise Data Governance team can bring to customer relationships.
Regulations apply to all data – data that sits in legacy systems such as mainframes as well as data in neo-legacy systems such as SQL Server, Oracle, and other traditional relational databases with rows and columns. The regulations don’t distinguish between structured, semi-structured, or unstructured data, and the expectation is that it will be protected, even if it resides in a Word document. Because Shannon has positioned his team as a trusted resource, customers are confident that data from Dun & Bradstreet has been obtained lawfully and that all critical processes are in place to properly safeguard it.
How to Scale: Trends Through the Next 5+ Years
More and more data is being made available every day, Shannon said, but how do companies go about governing this rapidly growing asset in the future? Shannon cited two Gartner predictions: Through 2022, only 20 percent of organizations investing in information governance will succeed in scaling governance for digital business, and by 2022, over half of data and analytics services are going to be performed by machines instead of human beings.
“I don’t see Data Governance as being any different than the way of the world,” Shannon said, and while there are many everyday pieces that require human intervention, the goal is to automate as much as possible, even in the governance space.
In terms of specific Data Governance trends:
- The discovery engine space has matured considerably over the last four or five years, allowing better use of semi-structured and unstructured data, he said.
- Data retention is a key focus and he said that implementing a data retention program should be a high priority, because without it, “I can pretty much guarantee that every day you’re just adding more and more work to the pile.”
- Robotic Process Automation (RPA), artificial intelligence (AI), and machine learning can help the EDG team understand what they have, especially in data lakes. RPA can help Data Governance identify patterns, and quickly monitor and understand changes in key areas.
- Metadata is “hugely important” in the scaling process, and Shannon frames Enterprise Data Governance accountability in the context of the RACI responsibility assignment matrix (Responsible, Accountable, Consulted, and Informed). “More and more, people are beginning to understand that metadata is at least equally important as the data itself.”
- Graph technology is a great way to capture the relationship between logical and physical, he said, and it can be very difficult to manage that type of information in relational databases in the context of scalability.
Shannon calls the ability to mint asset identifiers “a beautiful thing.” He sees this emerging in places where EDG is minting the identifier for every data asset and says that there is much more to come. He compared it to a grocery item with a UPC code. For each unit of data there is a piece of metadata associated with it that is its unique identifier. It can be a URL where everything about that asset is stored and available – how it can be used, what it means, what formats it will be in, etc. “This guarantees a single version of the truth in terms of the asset’s meaning and all the metadata surrounding it,” Shannon said.
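As a rough illustration of the UPC-style idea described above, the following Python sketch mints a unique URL-shaped identifier for each data asset and stores the asset's metadata behind it. Everything named here is an assumption made for the example: the `BASE_URL`, the `AssetRecord` fields, and the `AssetRegistry` class are hypothetical and are not taken from Shannon's presentation or from any D&B system.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical base URL; the presentation does not name a real resolver.
BASE_URL = "https://governance.example.com/assets/"


@dataclass
class AssetRecord:
    """Metadata kept behind an asset identifier (illustrative fields only)."""
    name: str
    meaning: str
    formats: List[str]
    permitted_uses: List[str]


class AssetRegistry:
    """In-memory stand-in for the place where minted identifiers resolve."""

    def __init__(self) -> None:
        self._records: Dict[str, AssetRecord] = {}

    def mint(self, record: AssetRecord) -> str:
        # A random UUID keeps identifiers unique without central coordination.
        asset_id = BASE_URL + str(uuid.uuid4())
        self._records[asset_id] = record
        return asset_id

    def resolve(self, asset_id: str) -> AssetRecord:
        # One identifier always resolves to one record.
        return self._records[asset_id]


registry = AssetRegistry()
customer_id = registry.mint(
    AssetRecord(
        name="customer_master",
        meaning="One row per legal business entity",
        formats=["parquet", "csv"],
        permitted_uses=["analytics"],
    )
)
```

Because every lookup goes through `resolve`, consumers of the asset all read the same definition, which is the "single version of the truth" property the identifier is meant to provide.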
The use of Data Governance-as-a-Service (DGaaS) is much further out, toward the five-year mark. Similar to the current use of business rule engines – where rules can be configured purely through metadata – a semantically driven Centralized Governance Rule Engine could be used to consult a Data Governance service for information about a data asset. That service would respond with information about the asset, its uses, where it resides, and other pertinent information.
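One way to picture such a service is as a lookup that a rule engine consults before allowing a use of the data. The Python sketch below is only an assumption about how that interaction might look: `GovernanceService`, `AssetInfo`, `is_use_permitted`, and the deny-by-default behavior are hypothetical names and choices, not a description of an existing product or of anything shown in the talk.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class AssetInfo:
    """What the hypothetical service returns about an asset."""
    location: str
    permitted_uses: FrozenSet[str]


class GovernanceService:
    """In-memory stand-in for a remote Data Governance service."""

    def __init__(self, catalog: Dict[str, AssetInfo]) -> None:
        self._catalog = dict(catalog)

    def describe(self, asset_id: str) -> Optional[AssetInfo]:
        # Returns None when the asset is unknown to the service.
        return self._catalog.get(asset_id)


def is_use_permitted(service: GovernanceService, asset_id: str, intended_use: str) -> bool:
    """Rule driven purely by metadata: allow only uses listed for the asset."""
    info = service.describe(asset_id)
    if info is None:
        # Assumption for this sketch: unknown assets are denied by default.
        return False
    return intended_use in info.permitted_uses


service = GovernanceService(
    {
        "asset-001": AssetInfo(
            location="warehouse.sales.customers",
            permitted_uses=frozenset({"analytics", "reporting"}),
        )
    }
)
```

A caller would ask, for example, `is_use_permitted(service, "asset-001", "marketing")` and receive a yes or no answer without needing to know where the rule itself is stored.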
Semantic Query on Physical Assets
Shannon recommends an open source project called DBpedia designed to present context along with data in a semantic format. “It’s beautiful,” he said. Project developers ran the whole of Wikipedia through Natural Language Processing and populated triplestores with the contents. A triplestore is a database for storage and retrieval of “triples” – subject, predicate, object. Users can pose a question using a SQL-like language called SPARQL without needing to know any of the structure underlying that information. Questions such as, “How many grocery stores are there in France?” will bring up whatever relevant information is captured in the Wiki to answer that question, Shannon said.
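To make the triple idea concrete, here is a minimal Python sketch that stores a handful of subject, predicate, object triples in a list and answers the grocery store question by pattern matching. The data and the `ex:` names are invented for illustration and are not real DBpedia identifiers; the SPARQL text is an assumed equivalent query against that same made-up vocabulary and is shown only as a string, not executed.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Invented sample data; "ex:" is a hypothetical namespace, not DBpedia's.
TRIPLES: List[Triple] = [
    ("ex:StoreA", "rdf:type", "ex:GroceryStore"),
    ("ex:StoreA", "ex:locatedIn", "ex:France"),
    ("ex:StoreB", "rdf:type", "ex:GroceryStore"),
    ("ex:StoreB", "ex:locatedIn", "ex:Germany"),
    ("ex:StoreC", "rdf:type", "ex:GroceryStore"),
    ("ex:StoreC", "ex:locatedIn", "ex:France"),
    ("ex:BakeryD", "rdf:type", "ex:Bakery"),
    ("ex:BakeryD", "ex:locatedIn", "ex:France"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples fitting the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def count_grocery_stores_in(triples: List[Triple], country: str) -> int:
    """Count subjects that are grocery stores and are located in `country`."""
    stores = {s for s, _, _ in match(triples, p="rdf:type", o="ex:GroceryStore")}
    located = {s for s, _, _ in match(triples, p="ex:locatedIn", o=country)}
    return len(stores & located)


# Assumed SPARQL equivalent over the same hypothetical vocabulary.
SPARQL_EQUIVALENT = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/>
SELECT (COUNT(DISTINCT ?store) AS ?n) WHERE {
  ?store rdf:type ex:GroceryStore ;
         ex:locatedIn ex:France .
}
"""

print(count_grocery_stores_in(TRIPLES, "ex:France"))  # prints 2
```

The point of the sketch is that the question is expressed as patterns over triples, so the person asking does not need to know table names or join paths, only the vocabulary.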
From a governance standpoint, this is how Shannon sees things unfolding:
“It’s all about this continual learning process, continual growth, being agile, and always focusing on business value for the company.”
Check out Enterprise Data Governance Online at http://datagovernanceonline.com/