Many organizations see hybrid cloud and multi-cloud models as critical to their present business requirements, and to their future ones as well. They are paying more attention to containers, which enable simple, automated portability and scaling and can play an important role in accelerating cloud projects and deliverables; to cloud marketplaces, which can present curated catalogs and e-stores for external clients; and to the part AWS and Azure play in cloud use cases such as virtual data lakes, analytics, SaaS integration, cloud data warehouse modernization, and machine learning in the cloud.
But many questions remain. There are concerns about security and cost management when migrating data to the cloud, but opportunities are under discussion too: How might refactoring or re-architecting applications as part of a migration plan affect their infrastructures, and potentially enhance those applications’ value?
With so many organizations considering hybrid cloud deployment models and strategies, including multi-cloud, they are also looking for answers on how best to make the change. This is why data virtualization has become such a prominent topic of discussion.
Goodbye to Complexity
With the prevalence of distributed environments, with data warehouses spread across multiple public clouds, private clouds, and data centers, data integration and management are critical, says Paul Moxon, SVP of Data Architecture and Chief Evangelist at data virtualization provider Denodo. A modern integration platform and data architecture combines the on-premises and cloud data worlds, covering everything from an on-site Teradata warehouse to Snowflake in the cloud.
Structured and unstructured data from enterprise, big data, and cloud sources are unified for batch and real-time operations. The Denodo Platform on AWS and on Azure, for instance, enables users to quickly leverage data virtualization alongside cloud computing capabilities and simplify their migration journey to the cloud.
Moxon said:
“We recognize where people are going, and we help them to find and access data in the easiest way possible. We hide the complexity of IT from the users of data with a data virtualization platform, which provides an abstraction layer for that purpose.”
To users, it doesn’t matter where data comes from or what format it’s in, and the abstraction layer makes it possible to switch out data sources without affecting them. The data stays in place, and the data virtualization layer can provide data to each type of user and application in the format that best suits their needs, which the company says results in lower costs compared to traditional approaches based on data replication.
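To make that concrete, here is a minimal sketch of what source transparency looks like from a consumer’s side. It assumes access over the platform’s standard ODBC interface via Python’s pyodbc; the DSN, credentials, and view name are hypothetical. The client binds only to a virtual view, so the sources behind it can be repointed without the query changing.

```python
# A minimal sketch of querying a virtual view through a data
# virtualization layer over ODBC. The DSN ("denodo_vdp") and view name
# ("customer_360") are hypothetical placeholders.
import pyodbc

# Connect to the virtualization layer, not to any individual source.
conn = pyodbc.connect("DSN=denodo_vdp;UID=analyst;PWD=secret")
cursor = conn.cursor()

# The virtual view might federate an on-site Teradata warehouse and
# Snowflake in the cloud; if an admin repoints it to a new source,
# this query does not change.
cursor.execute("SELECT customer_id, region, lifetime_value FROM customer_360")
for row in cursor.fetchall():
    print(row.customer_id, row.region, row.lifetime_value)

conn.close()
```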
Denodo’s virtualization platform is also designed to reduce the number of data errors that can occur when data is moved around. And by offering a single point of access to any piece of information, it unifies security management.
It’s important to remember that the majority of data users are casual ones as opposed to data scientists, who love to play with raw data in data lakes in their own silos. A lot of work data scientists do is focused on cleaning and preparing data. “They miss the 90 percent of users who want answers, not data,” Moxon said.
What you really want for significant value, he said, are data engineers — generally, the information consumers themselves. “That may be the expert who is a wizard at Excel pivot tables for manipulating data,” Moxon said. “The benefit of a virtualization platform is that these guys don’t have to worry about the cleaning and the parsing of data.”
The platform supports virtual data marts in the virtual data layer through normalized views, which can be accessed through Excel. “That means data users can analyze the data using pre-built templates within Excel, with which they might be comfortable.” Thus, they can focus on analyzing the data to answer their queries rather than spending time accessing it from disparate sources.
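The same workflow can be mirrored outside Excel. The hedged sketch below assumes a hypothetical normalized view in a virtual data mart and pulls it into pandas for a pivot-table-style summary; the connection details and view name are illustrative only.

```python
# An illustration of consuming a normalized view from a virtual data
# mart and pivoting it locally, much like an Excel pivot table. The DSN
# and view name ("orders_by_region") are assumptions.
import pandas as pd
import pyodbc

conn = pyodbc.connect("DSN=denodo_vdp;UID=analyst;PWD=secret")

# The view already presents cleaned, joined data, so no parsing or
# cleaning step is needed on the consumer side.
orders = pd.read_sql(
    "SELECT region, product_line, order_month, revenue FROM orders_by_region",
    conn,
)

# Equivalent of an Excel pivot table: revenue by region and month.
pivot = orders.pivot_table(
    index="region", columns="order_month", values="revenue", aggfunc="sum"
)
print(pivot)
```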
It’s also necessary to be able to run in containers such as Docker. Among other use cases, this simplifies moving container configurations and workloads across clouds and scaling containers up or out as compute capacity needs change; a brief sketch follows the quote below:
“We provide all the benefits of data virtualization including the ability to provide real-time access to integrated data across an organization’s diverse data sources, without replicating any data. The Denodo Platform offers the broadest access to structured and unstructured data residing in enterprise, big data, and cloud sources in both batch and real time, exceeding the performance needs of data-intensive organizations.”
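As a rough illustration of the scale-out case, the sketch below uses the Docker SDK for Python to start several identical containers on a Docker host. The image tag, service port, and replica count are placeholders, not the vendor’s published values.

```python
# A minimal sketch of scaling out containerized instances with the
# Docker SDK for Python. The image name is a placeholder; consult the
# vendor's registry for the real one.
import docker

client = docker.from_env()

IMAGE = "example.registry/denodo-platform:latest"  # hypothetical image tag

# "Scale out": start several identical containers, each mapping an
# assumed service port to a distinct host port for a load balancer.
for i in range(3):
    client.containers.run(
        IMAGE,
        name=f"vdp-node-{i}",
        detach=True,
        ports={"9999/tcp": 19990 + i},  # 9999 is an assumed service port
    )

# Because the configuration travels with the image, the same call works
# unchanged against a Docker host in any cloud.
```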
GDPR Needs Data Virtualization, Too
Data virtualization can be used in a number of scenarios. Adhering to GDPR is one of them, because data virtualization can be leveraged to manage data access from a single point.
“If you are in the U.S., you are not allowed to have a physical copy of EU citizens’ data in the U.S.,” said Moxon. “But if a U.S. corporation has customer data in the EU, it can use data virtualization to access it for analytics or physical reporting purposes.” Data stays where it is, but through data virtualization, performing global analytics on all of it is still possible.
Denodo is also working with a big global pharmaceutical firm that has databases and a data center in Germany; the firm uses the data virtualization platform to give its U.S. operations access to that data. “The virtualization layer is built to virtually combine German data with U.S. data,” he said.
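In code, that pattern might look like the following hedged sketch: a U.S. analytics job queries a virtual view whose underlying tables remain in Germany, requesting only aggregates so that no row-level personal data is copied across the Atlantic. All names here are hypothetical.

```python
# A hedged sketch of the GDPR pattern described above: a U.S. analytics
# job queries a virtual view over data that physically stays in the EU,
# asking only for aggregates. The DSN and view names are hypothetical.
import pyodbc

conn = pyodbc.connect("DSN=denodo_vdp;UID=us_analytics;PWD=secret")
cursor = conn.cursor()

# The virtualization layer can push this aggregation down to the EU
# database, returning only summary figures rather than personal records.
cursor.execute(
    """
    SELECT region, COUNT(*) AS customers, AVG(order_value) AS avg_order
    FROM global_customers  -- virtual view spanning EU and U.S. sources
    GROUP BY region
    """
)
for region, customers, avg_order in cursor.fetchall():
    print(region, customers, avg_order)

conn.close()
```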