Data virtualization can facilitate data integration in an agile, change-driven style. It provides unified access to disparate data sources, giving data consumers an edge in data provisioning. The lack of platform governance, meanwhile, has led to silos across the data platforms that host this data.
Data virtualization is all about introducing a logical layer between data providers and consumers. This logical layer performs on-demand retrieval, transforming and combining data virtually to deliver the required data to consumers. You will often hear the synonyms “Data-as-a-Service” or “Virtual Data Access.”
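To make the idea concrete, here is a minimal sketch of such a logical layer in Python. The sources, the schemas, and the get_customer_view function are hypothetical stand-ins for the adapters a real virtualization product would provide; the point is that retrieval, transformation, and combination all happen at query time, with no physical copy.

```python
# A minimal sketch of a virtualization layer. The source names and
# schemas are illustrative, not taken from any specific product.

from dataclasses import dataclass

# Stand-ins for two disparate physical sources (say, a CRM database
# and a transactions store). In practice these would be live connections.
CRM_SOURCE = {
    "C001": {"name": "Asha Rao", "segment": "retail"},
    "C002": {"name": "Liam Chen", "segment": "wealth"},
}

TXN_SOURCE = [
    {"customer_id": "C001", "amount": 120.50},
    {"customer_id": "C001", "amount": 75.00},
    {"customer_id": "C002", "amount": 980.25},
]

@dataclass
class CustomerView:
    """A virtual, consumer-facing record; nothing is stored physically."""
    customer_id: str
    name: str
    segment: str
    total_spend: float

def get_customer_view(customer_id: str) -> CustomerView:
    """Retrieve, transform, and combine data on demand, at query time."""
    profile = CRM_SOURCE[customer_id]                       # retrieval
    spend = sum(t["amount"] for t in TXN_SOURCE
                if t["customer_id"] == customer_id)         # combination
    return CustomerView(customer_id, profile["name"],
                        profile["segment"], round(spend, 2))  # transformation

print(get_customer_view("C001"))
```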
Given the heterogeneous nature of the landscape, it can be quite challenging for a bank to embrace customer personalization across its channels and collate behavioral insights. But the game changes when an architect goes into a data glossary, finds the business term and the required coverage of data, requests the data, and publishes an API to expose it for consumption; a sketch of such an endpoint follows below. Though the data may live across tens of systems, the consumer never feels the burden of doing this the traditional waterfall way, by planning and moving data through pipelines. And, remember, 80% of consumption requests are often reads rather than updates.
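As an illustration of that last step, the sketch below exposes a virtual view through a read-only HTTP endpoint using only Python's standard library. The route, the fetch_view stub, and the response shape are assumptions made for this example, not any particular bank's or vendor's API.

```python
# A hedged sketch, not a product API: a read-only endpoint serving a
# virtual customer view. fetch_view() stands in for the logical layer.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def fetch_view(customer_id: str) -> dict:
    # Stand-in for an on-demand query against the virtualization layer;
    # no data is copied or staged ahead of the request.
    return {"customer_id": customer_id, "segment": "retail",
            "total_spend": 195.50}

class ViewHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expected path: /customers/<id>  (illustrative routing only)
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "customers":
            body = json.dumps(fetch_view(parts[1])).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ViewHandler).serve_forever()
```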
In the traditional data integration approach, the solution physically copies data from heterogeneous sources into a common integrated reference store, such as a warehouse designed specifically for that ingestion. More recently, the data lake has taken over this role. In contrast, a virtualization solution doesn't move data between sources, yet it can still integrate that data for consumption. Data virtualization thus creates virtual access to data.
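The difference fits in a few lines. In this illustrative contrast, with made-up source data, the ETL path materializes a joined copy up front, while the virtual path resolves the same query lazily and holds no second copy.

```python
# Illustrative contrast with made-up data: ETL materializes a physical
# copy ahead of time; virtualization answers the same query on demand.

SOURCE_A = {"C001": "retail"}     # e.g., a CRM system
SOURCE_B = {"C001": 195.50}       # e.g., a transactions store

# ETL style: copy and join everything into a warehouse table up front.
warehouse = {cid: {"segment": seg, "spend": SOURCE_B.get(cid, 0.0)}
             for cid, seg in SOURCE_A.items()}           # physical copy

# Virtual style: nothing is copied; the join happens at query time.
def virtual_lookup(cid: str) -> dict:
    return {"segment": SOURCE_A[cid], "spend": SOURCE_B[cid]}

assert warehouse["C001"] == virtual_lookup("C001")
# Same answer either way, but the virtual path keeps no second copy.
```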
In my next piece, I will cover the variants of virtualization: data storage virtualization, virtualization for integration, data access virtualization, and copy data virtualization.