Every day, businesses create, collect, compile, store, and share rapidly growing amounts of data. When that data is put to use effectively, sales teams can boost revenue, marketing can improve the customer experience, HR can keep employees happy, and so on. But cyberattacks are on the rise, and evolving data privacy legislation has led to calls for responsible data use. Unfortunately, protecting data is easier said than done. Fortunately, data access governance can help.
As information and security teams work to thwart data breaches and ensure compliance, they can become overly protective of systems and lock out the very people who need access to data. Even a lock-it-all-down approach introduces many layers of complexity, as different technologies require different skill sets. It’s a challenge facing companies of all sizes and across industries, and it may be why one C-level executive at a large financial institution was recently quoted as saying, “data lakes scare me.” The concern is understandable, because protecting data lakes can be very challenging to get right. One small misconfiguration or mistake can lead to a catastrophic data breach, especially when dealing with the vast amounts of structured and unstructured data that data lakes often contain.
Whether they use data lakes, data clouds, data warehouses, or data lakehouses, enterprises need to increase access to data to stay relevant and to drive insights and innovation. They must find a way to trust their people to use data responsibly and without fear. As a result, they are looking to automate data access governance processes to safely, responsibly, and quickly derive value from data analytics.
What is Data Access Governance (DAG)?
Data access governance is an aspect of data governance that deals with who has – or doesn’t have – access to what data within an organization. It involves creating standard, reusable policies that govern access to data, linking those policies to the appropriate identities (usually via roles and groups), and managing and enforcing those policies across systems.
However, manually developing and enforcing policies does not work at scale. With more and more data, applications, and platforms, plus additional constraints and ambiguity from increasing external regulations such as GDPR, CCPA, and HIPAA, governance tasks now take up so much of the workload that they get in the way of business agility. So the mandate for data access governance today is to automate.
Like most components of the modern data stack, modern data security leverages metadata – data about data – to simplify, standardize, and operationalize policy enforcement. By leveraging metadata, organizations no longer have to hard-code rules saying that, for example, George in HR can work with the “SSN” column in the “HR” database but Clarice in Accounting cannot, and so on across every sensitive column in each database. Instead, they can use machine learning (ML) or even simple pattern matching to automatically classify data with attributes or tags, then create a policy that says people with certain HR roles can work with data classified as employee PII.
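To make the tag-based idea concrete, here is a minimal sketch of the "simple pattern matching" route, not any vendor's actual classifier. The regular expressions, the `employee_pii` tag name, the role names, and the helper functions are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical value patterns used to recognize sensitive data.
SSN_VALUE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
EMAIL_VALUE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_column(name, samples):
    """Return the set of tags for a column, based on its name and sample values."""
    tags = set()
    looks_like_ssn = bool(samples) and all(SSN_VALUE.match(s) for s in samples)
    looks_like_email = bool(samples) and all(EMAIL_VALUE.match(s) for s in samples)
    if "ssn" in name.lower() or looks_like_ssn or looks_like_email:
        tags.add("employee_pii")
    return tags


# One policy keyed on the tag, not on a specific person or column.
# Role names are assumptions made for this sketch.
POLICY = {"employee_pii": {"hr_analyst", "hr_manager"}}


def can_read(user_roles, column_tags):
    """Allow access only if the user holds an approved role for every tag.

    Untagged columns are readable; tags with no policy entry are denied.
    """
    return all(user_roles & POLICY.get(tag, set()) for tag in column_tags)


if __name__ == "__main__":
    tags = classify_column("SSN", ["123-45-6789", "987-65-4321"])
    print(can_read({"hr_analyst"}, tags))   # George in HR
    print(can_read({"accountant"}, tags))   # Clarice in Accounting
```

The point of the design is that adding a new sensitive column requires no new rule: once the classifier tags it, the existing policy already covers it.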
Dynamic Enforcement: The Key to DAG
The modern data management landscape is no longer a monolithic environment. Where the business once might have standardized on an Oracle database and hired talent accordingly, today there’s a wide variety of purpose-built data platforms that serve a variety of needs in the enterprise. This means it is no longer efficient, or sometimes even possible, to spend time provisioning each system with all the necessary data access control policies.
Modern data-driven organizations are adopting an approach known as ABAC (Attribute-Based Access Control) to manage flexible data policies using the metadata approach described earlier. ABAC removes the need to create similar policies over and over again across all data platforms and eliminates the heavy lifting that is needed behind the scenes to actually enforce data policies consistently across the enterprise.
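As a rough illustration of ABAC, the sketch below evaluates a single policy against user attributes and data attributes, and applies it unchanged to columns registered from two platforms. The `User` and `Column` classes, the catalog entries, and the `warehouse_b` platform name are assumptions made for the example; real products expose their own policy models.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    attributes: dict  # e.g. {"department": "hr"}


@dataclass
class Column:
    platform: str     # where the data lives
    path: str         # fully qualified column name
    tags: frozenset   # data attributes, e.g. {"employee_pii"}


def employee_pii_policy(user, column):
    """Single attribute-based rule: only HR may read employee PII."""
    if "employee_pii" not in column.tags:
        return True
    return user.attributes.get("department") == "hr"


# Hypothetical catalog: columns from different platforms share one tag vocabulary.
CATALOG = [
    Column("snowflake", "HR.EMPLOYEES.SSN", frozenset({"employee_pii"})),
    Column("warehouse_b", "payroll.staff.tax_id", frozenset({"employee_pii"})),
    Column("snowflake", "HR.EMPLOYEES.OFFICE", frozenset()),
]


def readable_columns(user, catalog, policy=employee_pii_policy):
    """Return the column paths this user may read under the given policy."""
    return [c.path for c in catalog if policy(user, c)]


if __name__ == "__main__":
    george = User("george", {"department": "hr"})
    clarice = User("clarice", {"department": "accounting"})
    print(readable_columns(george, CATALOG))
    print(readable_columns(clarice, CATALOG))
```

Note that the policy function never mentions a platform or a column name. That is what lets the same rule apply wherever tagged data is registered.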
As the importance of data security and governance increases, many point solutions are beginning to offer their own native access control capabilities. These can be incredibly powerful, but complexity is the enemy of security. Until there is a universal policy language (which will be incredibly difficult to achieve), organizations that implement data access governance separately in each data platform will need a team of experts for each of those platforms. Universal data access governance solutions simplify the creation and management of data access policies, and can implement those policies as native controls in data platforms like Snowflake Data Cloud, so enforcement happens as close to the data as possible. More importantly, they can act as the umbrella data access governance solution across all those systems, so even non-technical data stakeholders can be part of the process.
Finding a Modern Data Access Governance Solution
When looking to evaluate a modern data access governance solution, it’s helpful to consider three fundamental questions:
- How granular can we get with access control? For example, if you need to put a filter on rows or mask a value in the middle of a big table, but the tooling won’t let you do one of these, move on and look for a modern data access governance solution that supports both.
- How logical and flexible is the policy authoring? Can the policy management deal with dynamic information, such as geography for data sovereignty, or whether individuals have achieved certain credentials or clearance levels? To truly simplify and scale data access controls, the chosen solution needs to be aware of data attributes (e.g., personally identifiable, classified, etc.), user attributes (e.g., region=USA, level=11, department=finance), and query context (e.g., time of day and current location).
- Is the solution consistent across different tools? Can we write one policy and have it work – unchanged – across multiple data platforms? Ideally, can there be a separation of duties, such that the data security and governance teams can write policies without ever even knowing what platforms are available? If so, then the data owners or engineering teams can simply register data to use those policies.
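The first two questions can be sketched in a few lines. The example below is an illustrative assumption rather than a real product's behavior: it filters rows by the user's region (a stand-in for data sovereignty) and masks an SSN value unless the user's clearance level and the time of the query both permit seeing it. The field names, the level threshold of 11, and the business-hours window are invented for the sketch.

```python
def mask_ssn(value):
    """Hide all but the last four characters of an SSN-like string."""
    return "***-**-" + value[-4:]


def apply_policy(rows, user, context):
    """Return the rows a user may see, with sensitive values masked as needed.

    rows:    list of dicts with "region" and "ssn" keys (hypothetical schema)
    user:    user attributes, e.g. {"region": "USA", "level": 11}
    context: query context, e.g. {"hour": 10} (local hour of the request)
    """
    # Assumed rule: clear values need level >= 11 and a query in business hours.
    cleared = user.get("level", 0) >= 11 and 8 <= context["hour"] < 18
    visible = []
    for row in rows:
        if row["region"] != user["region"]:
            continue  # row-level filter: user only sees their own region
        out = dict(row)  # copy so the stored data is never modified
        if not cleared:
            out["ssn"] = mask_ssn(row["ssn"])  # dynamic masking at read time
        visible.append(out)
    return visible


if __name__ == "__main__":
    rows = [
        {"name": "A. Smith", "region": "USA", "ssn": "123-45-6789"},
        {"name": "B. Jones", "region": "EU", "ssn": "987-65-4321"},
    ]
    analyst = {"region": "USA", "level": 5}
    print(apply_policy(rows, analyst, {"hour": 10}))
```

Because the masking happens at read time, the underlying table stays intact and the same rows can serve users with different clearances.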
Modern Data Access Governance Use Cases
To bring all this home, how does data access governance allow organizations to move faster without fear? One data engineer at an international membership organization remarked that because they’re using a universal data access governance solution, their security team has given them permission to work with datasets that were previously locked up because they were classified as “confidential” and “highly confidential.” By dynamically obfuscating or hiding the sensitive information, the rest of the information is available for analysis, and now they have a complete 360-degree view of their entire business for the first time.
At a high level, here are three primary use cases for modern data access governance:
- Analytics on customer PII – Whether in the form of reports, dashboards, or real-time analytics, anyone storing and using personally identifiable customer information should look at automated data access governance. PII doesn’t have to be obviously sensitive information like social security numbers, as it can range from email addresses to phone numbers to birth dates. All of this information is worth protecting. It’s also valuable and necessary as organizations move toward data-driven decision-making to improve not only their bottom line but also the customer’s experience. Dynamic controls let the right stakeholders see the information they need and keep it out of the wrong hands.
- Digital transformation and cloud migration – As enterprises move their legacy infrastructure to the cloud, they must also consider how to deal with the data access, governance, and control policies they built up over years or even decades. Since one of the main reasons for moving to the cloud is agility, they’ll need better, faster ways to update and manage policies. In some cases they will have no choice but to rethink the process, such as when moving from an on-prem “cube” database to a modern data cloud. But there’s always the temptation to “lift and shift” policies built for on-prem Hadoop as 1:1 permissions in the cloud. This should be avoided: organizations will get much better clarity and performance by taking the time to define policies (who sees PII, when, and why) rather than carrying over named object-level permissions (restrictions on a specific database column). Upgrading to dynamic policy management solves this problem.
- Security and Compliance Analytics – Here’s the real value-add to stakeholders outside the data team. Because modern data access governance solutions monitor who is accessing what data, when, and how, they fulfill use cases ranging from real-time risk assessment to compliance and audit investigation. Compliance, risk, and audit teams can get the work done faster and easier when all the data and tools they need are easily accessible to authorized users.
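For the security and compliance use case, the sketch below shows the kind of questions an access log can answer once every request is recorded with who, what, when, and the decision. The log schema, the sample entries, and the business-hours window are all made up for illustration and do not come from a real system.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access log: one entry per request, with the enforcement decision.
ACCESS_LOG = [
    {"user": "george", "column": "HR.EMPLOYEES.SSN", "tags": {"employee_pii"},
     "at": datetime(2024, 1, 8, 10, 15), "allowed": True},
    {"user": "george", "column": "HR.EMPLOYEES.SSN", "tags": {"employee_pii"},
     "at": datetime(2024, 1, 8, 23, 40), "allowed": True},
    {"user": "clarice", "column": "HR.EMPLOYEES.SSN", "tags": {"employee_pii"},
     "at": datetime(2024, 1, 9, 9, 5), "allowed": False},
    {"user": "clarice", "column": "HR.EMPLOYEES.OFFICE", "tags": set(),
     "at": datetime(2024, 1, 9, 9, 6), "allowed": True},
]


def pii_reads_by_user(log):
    """Count successful reads of PII-tagged data per user (audit view)."""
    return Counter(
        e["user"] for e in log if e["allowed"] and "employee_pii" in e["tags"]
    )


def denied_attempts(log):
    """List (user, column) pairs for requests the policy rejected."""
    return [(e["user"], e["column"]) for e in log if not e["allowed"]]


def after_hours_pii_reads(log, start_hour=8, end_hour=18):
    """Flag allowed PII reads outside an assumed business-hours window."""
    return [
        e for e in log
        if e["allowed"]
        and "employee_pii" in e["tags"]
        and not (start_hour <= e["at"].hour < end_hour)
    ]


if __name__ == "__main__":
    print(pii_reads_by_user(ACCESS_LOG))
    print(denied_attempts(ACCESS_LOG))
    print(len(after_hours_pii_reads(ACCESS_LOG)))
```

Each function maps to one of the questions above: the per-user count supports audits, the denied list supports investigations, and the after-hours filter is a simple form of risk assessment.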
As businesses continue their march toward data-driven decision-making and self-service analytics, proper data security is the secret accelerant to business agility. The business wants users to be able to use data faster, even as cybercrime rises and compliance requirements increase. A universal approach to data access governance dramatically reduces complexity, so organizations can leverage large amounts of data to accelerate innovation and gain a significant advantage over their competitors faster, and without fear.