From a market perspective, Data Governance has gained visibility partly because of the rise in security breaches, data security issues, and compliance requirements under various industry regulations. To secure and manage data properly, it helps to manage it at a higher level and to know which of your data is sensitive in the first place. The foundation of a solid Data Architecture is essential: without it, Data Governance and compliance are impossible. Executives often start without knowing what processes are at work managing their data, then realize as they dig in that they have taken a backward approach.
“You’re not going to find the silver bullet. You’ve got to roll up your sleeves and do the work,” said Ron Huizenga, Senior Product Manager for IDERA’s ER/Studio suite. Huizenga and his colleague Joy Ruff, Product Marketing Manager for ER/Studio, recently spoke with DATAVERSITY® about current challenges in Data Governance and Data Architecture. Huizenga said, “You have to get back to basics. You’re not going to be able to just buy an off-the-shelf solution and your governance problem is solved.” Both Ruff and Huizenga believe that the key to success lies in understanding your data and how it’s used in your organization.
To govern data well, said Ruff, start down at the database level, “so that you know what your data is, how it’s related to other data, and how it’s flowing through your business.” Because “if you don’t know how your data is moving through your business, how are you going to govern it?”
Compliance, for example, requires knowing who is accessing sensitive data; without first identifying which data is sensitive, it’s impossible to know whether it has been accessed inappropriately. If you know the relationships between data and how it flows through the system, as well as its lineage from the data source to the application, “then you can know where it’s being touched all along the way,” she said.
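As a rough illustration of that point (this is not IDERA’s tooling; the systems, tables, and columns below are invented), even a small catalog that tags sensitive columns and records lineage edges can answer the question of whether sensitive data, or any downstream copy of it, has been touched:

```python
# A minimal sketch: tag sensitive fields, record lineage edges, and check
# access events against every location the sensitive data can reach.
# All names are hypothetical, for illustration only.

SENSITIVE_FIELDS = {
    ("crm", "customers", "ssn"),
    ("crm", "customers", "email"),
}

# Lineage edges at (system, table, column) granularity: source -> destinations.
LINEAGE = {
    ("crm", "customers", "email"): [("warehouse", "dim_customer", "email")],
}

def sensitive_targets():
    """Every location sensitive data can reach, following lineage edges."""
    seen = set(SENSITIVE_FIELDS)
    frontier = list(SENSITIVE_FIELDS)
    while frontier:
        node = frontier.pop()
        for dest in LINEAGE.get(node, []):
            if dest not in seen:
                seen.add(dest)
                frontier.append(dest)
    return seen

def touched_sensitive_data(event):
    """True if an access event hit sensitive data anywhere along its lineage."""
    return (event["system"], event["table"], event["column"]) in sensitive_targets()

print(touched_sensitive_data(
    {"system": "warehouse", "table": "dim_customer", "column": "email"}))
# -> True: the warehouse copy inherits sensitivity from its CRM source.
```

Without the lineage edges, the warehouse copy would pass unnoticed, which is exactly the gap Ruff describes.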
Constraints Drive Change
The European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) are the beginning of a growing number of regulations that will increase the importance of data lineage, Ruff commented. “When a customer asks to be removed from your system, you need to figure out how to do that,” down to the database level, and possibly even in data backups.
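A minimal sketch, assuming a hand-documented map of which tables reference a customer (the tables and columns below are invented), shows why an erasure request has to reach the database level:

```python
# Given a customer ID, enumerate every table holding that customer's data
# and produce the deletes an erasure request would require. Illustrative only.

CUSTOMER_REFERENCES = {
    "customers": "id",
    "orders": "customer_id",
    "support_tickets": "customer_id",
    "marketing_consents": "customer_id",
}

def erasure_plan(customer_id):
    """Yield parameterized DELETE statements for an erasure request.

    Real erasure also has to reach replicas, caches, and backups, which is
    why lineage down to the database level matters.
    """
    for table, column in CUSTOMER_REFERENCES.items():
        yield f"DELETE FROM {table} WHERE {column} = %s", (customer_id,)

for statement, params in erasure_plan(42):
    print(statement, params)
```

The hard part is not the deletes themselves but knowing that the reference map is complete, and that knowledge comes from the architecture, not the regulation.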
Despite the difficulty of compliance, the regulatory process is ultimately forcing companies to get control of their data, which is something they need in order to create a successful Data Governance program.
Huizenga said that organizations are “ridiculously ineffective” at Data Governance at this point in time. Gartner recently predicted that by the end of 2019, only 50 percent of CDOs will be successful. “And it’s not because of the people in those roles; it’s because of all the constraints that are placed upon them and the complexity of the problem that they’re trying to solve.”
Documentation and Modeling: Equal Partners
Solid documentation that arises from an intimate understanding of business processes and systems—beyond just databases and NoSQL data stores—is an important piece. Without data and process modeling, it’s impossible to understand what Huizenga calls ‘swivel-chair integration’:
“A person takes information from Point A, and puts it in Point B. Whether it’s keying from one system to another or manipulating something in a spreadsheet and uploading it to a system, there’s no way automated tools are going to catch all of that.”
Having tools and technology facilitates the process of understanding the data, where it’s stored, how it’s organized, what the processes are, and how it’s all tied together, “but it’s not the ‘easy’ button that does everything for you.”
Some companies have been trying to rely on metadata repositories alone, but the real key, he said, is in modeling. “A picture’s worth a thousand words, right?” Having the metadata and being able to do analytics and queries is helpful, but without pictures that explain how all the elements are related, and understanding the data lineage and life cycle, “You don’t have a chance.”
Keeping higher-level business goals in mind is essential, but implementation should be focused on the fundamentals. “Metadata is a big piece of that too. A lot of the metadata is focused up at that higher level. Are your metadata management tools really getting down to the lower level?”
Data and process modeling in particular are more important now than they have ever been, he said, but modeling should be coupled with reverse engineering capabilities and all the tools and processes needed for proper governance. “Just because you don’t know that data you harvested includes customer personal data, doesn’t mean you’re not still accountable for it, so you better know what’s in there.”
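As a toy version of that discovery step (the patterns and schema below are invented, and real tools sample actual values rather than trusting column names), a reverse-engineering pass might flag columns that look like personal data:

```python
import re

# Flag columns whose names suggest personal data. Name-based scanning is a
# crude first pass: note that full_name slips through the patterns below,
# which is why value sampling and human review still matter.

PII_NAME_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"ssn|social", r"email", r"phone", r"birth|dob", r"address")
]

def flag_personal_data(schema):
    """Return (table, column) pairs whose names suggest personal data."""
    return [
        (table, column)
        for table, columns in schema.items()
        for column in columns
        if any(p.search(column) for p in PII_NAME_PATTERNS)
    ]

harvested = {
    "web_events": ["event_id", "user_email", "ip_address"],
    "signups": ["id", "full_name", "date_of_birth"],
}
print(flag_personal_data(harvested))
# -> [('web_events', 'user_email'), ('web_events', 'ip_address'),
#     ('signups', 'date_of_birth')]
```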
Challenges Demand Flexibility
The volume of data and the number of different data sources that organizations have to deal with are still an issue for many companies. The recent pattern of increased mergers and acquisitions adds to the challenge, because “it just layers systems upon systems,” but the solution to reducing the complexity and redundancy is the same: an understanding of the data from the ground up, built on a solid architecture, according to Huizenga.
Ruff added that there’s a fine line between having structure and having flexibility. “This is not a ‘one-and-done’ process where you just create the architecture and walk away. It needs to evolve.” A flexible architecture will be able to respond to mergers, acquisitions, growth, technology changes, and cutbacks, she said. The ability to respond to changes and to new information coming in has to be designed into the processes themselves. “Even though business processes and data models can start at a pretty high level, they have to go deeper—and then go a little deeper.”
“We Don’t Need No Data Models”
Huizenga referenced the “extreme mindset,” once popular at some companies, which held that data models were obsolete. Now some of those same companies are realizing that they do need data models.
“Too many organizations say, ‘OK, we’ll reverse engineer and document after the fact with a data modeling tool.’ That’s not going to cut it,” he said.
Instead, the approach should be to get back to where the discipline of modeling is part of the design process and push forward from there: “Otherwise you’ll never tackle the problem once and for all.” He suggested taking “slices,” starting at the top level and then working down to the implementation models. The goal is to create an enterprise data model, “But it’s not this ivory tower artifact that you’re creating—it’s continually evolving,” so that the model becomes the “Rosetta Stone” for understanding the organization’s data as a whole, in context.
Automated Tools Support Flexibility and Change Management
Ruff said there is an assumption that getting a handle on data requires spending a year building a model from the ground up before doing anything else, and once it’s in place, it’s finished. She returned to the idea of flexibility, suggesting a more iterative and interactive process. “As you go down a certain path, you realize you need to make an adjustment,” so that changes are incorporated or expanded as the business grows and as the next level of detail is required.
Huizenga said that it’s now possible to have built-in change management, which is a feature few vendors other than IDERA offer as part of the modeling process. Data model changes can be tied to tasks and user stories, creating an audit trail. “When the auditor looks at you and says, ‘Why was that change made to that database?’ you can find the requirements that drove it,” and when staff changes occur, institutional knowledge isn’t lost.
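To illustrate the concept (rather than ER/Studio’s actual implementation), a bare-bones version of such an audit trail only needs each model change to carry the task or user story that drove it:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Each model change records the task or user story that required it, so an
# auditor's "why was that change made?" maps back to a requirement.
# A conceptual sketch with invented names, not ER/Studio's change management.

@dataclass
class ModelChange:
    model_object: str   # e.g. "dim_customer.email"
    description: str    # what changed
    task_id: str        # the user story or ticket that drove the change
    author: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

audit_trail = [
    ModelChange("dim_customer.email", "widened to 320 characters",
                "STORY-481", "jsmith"),
]

def why_changed(model_object, trail):
    """Answer the auditor: which requirements drove changes to this object?"""
    return [c.task_id for c in trail if c.model_object == model_object]

print(why_changed("dim_customer.email", audit_trail))  # -> ['STORY-481']
```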
Pre-built design patterns and reusable constructs can also streamline the modeling process and maintain consistency among models, so that each new build ties back to the main framework, Huizenga said. Ruff added that industry models with well-documented design patterns can fit a variety of business situations: instead of building the entire model from the ground up, customers can simply fill in the details and be up and running much faster.
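A minimal sketch of that kind of reuse, with an invented “audit columns” construct standing in for a real design pattern:

```python
# One shared construct applied to every new entity, so each build ties back
# to the same framework. Pattern and entities are invented for illustration.

AUDIT_PATTERN = [
    ("created_at", "TIMESTAMP"),
    ("created_by", "VARCHAR(64)"),
    ("updated_at", "TIMESTAMP"),
    ("updated_by", "VARCHAR(64)"),
]

def build_entity(name, columns):
    """Define an entity; the shared audit pattern is appended automatically."""
    return {"name": name, "columns": columns + AUDIT_PATTERN}

orders = build_entity("orders", [("order_id", "BIGINT"), ("customer_id", "BIGINT")])
invoices = build_entity("invoices", [("invoice_id", "BIGINT"), ("total", "DECIMAL(12,2)")])
print([c for c, _ in orders["columns"]])
# -> ['order_id', 'customer_id', 'created_at', 'created_by',
#     'updated_at', 'updated_by']
```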
Support for DevOps
Ruff sees emerging opportunities for bringing areas together that have historically been separate, as well as support for DevOps, with a coordinated effort for cross-functional teamwork. Cross-functional teams and people “wearing multiple hats” are starting to work better together with data modelers, DBAs, developers, and analysts, “because everybody needs to know how that data flows. And if you’ve got the models to show it, the developers can work with that, the analysts know where to pull the data from, and they know they’re getting the right data.”
Huizenga said the key is to take a practical approach to design and keep evolving it as the team goes forward: “Part of the whole reason that some data modelers or enterprise architects may have found themselves on the outside looking in is that they get too focused on the academics: ‘Is this a perfect academic design?’ We don’t have room for that. We need to enable the business first.”
IDERA: Now and on the Horizon
IDERA has a wide range of products that address various areas in database life cycle management and beyond. “We can help our customers take their data all the way from the model through the database to the applications and into testing very efficiently,” said Ruff. Huizenga added that at the core is ER/Studio, because it facilitates a return to those basics. “Everything is there: we’ve got the integrated data modeling, process modeling, enterprise architecture, and governance.”
Companies using disparate, siloed tools, cobbling together Visio diagrams and doing data models in one place and process models in another, need to find an integrated approach, he said, so that data can be understood in context. Integration between IDERA’s ER/Studio products allows business process models to be imported into the logical and physical modeling tool, and all of those models can be published to the ER/Studio Team Server for proper Data Governance and Metadata Management. “It’s an easier approach because all those things can just talk to each other,” said Ruff.
In the future, Huizenga said, IDERA will be introducing a growing number of new platforms while also focusing on foundational ideas:
“Again, it’s back to basics. Don’t lose focus of what’s important while you’re looking at other things and building out the range of capabilities. We still have to keep looking after all those core competencies.”