Today’s data pipelines use transformations to convert raw data into meaningful insights. Yet ensuring the accuracy and reliability of these transformations is no small feat: choosing tools and methods to test the variety of data and transformations can be daunting. Transformations generally involve changing raw data that has been cleansed and validated for use by […]
Putting Data Mapping to Work
In my previous blog post, I defined data mapping and its importance. Here, I explore how it works, the most popular techniques, and the common challenges teams must overcome to ensure the integrity and accuracy of the mapped data. Data mapping establishes relationships and connections between data elements so we can […]
Starburst Introduces Python DataFrame Support
According to a new press release, Starburst Data has expanded its data analytics platform by introducing Python DataFrame support through the launch of PyStarburst and integrating the Ibis library. This strategic move, accomplished in collaboration with Voltron Data, empowers developers and data engineers to efficiently manage complex data transformation tasks and build data applications seamlessly […]
The Impact of Data Growth on Enterprises
By Ibrahim Surani. Data has changed the world we live in. Today, enterprises utilize data to introduce new business models, enhance customer experience, generate new revenue streams, create personalized services, and become more agile. This reliance is only increasing with time as the world experiences unprecedented data growth. This data […]
Data Chef ETL Battles: WebLog Data for Clickstream Analysis
By Maarit Widmann, Anna Martin, and Rosaria Silipo. Do you remember the Iron Chef battles? It was a televised series of cook-offs in which famous chefs rolled up their sleeves to compete in making the perfect dish. Based on a […]
Decoded Data Lineage Helps Tackle Bad Data Quality
What are your outcome expectations of data lineage? No one’s just doing it for fun, after all. Generally speaking, data lineage is a major asset for regulatory reporting and governance, trust in decision-making, and on-premises-to-cloud migrations. Data lineage tools track business data flow from originating source through all the steps in its lifecycle to destination. […]
The Three Pillars of Agile Data Mastering
By Mark Marinelli. We’ve explored the benefits of an agile data mastering approach in a previous post, but let’s do a quick recap: Many businesses that collect a large amount of data have an accumulating data mastering issue that leaves their data largely untouchable and riddled with inaccuracies. The problem […]
Data Strategy Not Working? Grab a Cloud or Two or Three
By Ingo Fuchs. Cloud has become a critical element of today’s IT strategies. You may not want to, but you have to consider how to best take advantage of cloud. Is your organization focusing on an IT Agenda? Is it focusing on a Transformation Agenda? If you answered yes to […]
Predictions for Big Data Analytics in 2019
By James Kobielus. Big Data Analytics has been one of the dominant tech trends of this decade, and it’s also been one of the most dynamic and innovative segments of the IT market. Today’s Big Data Analytics market is quite different from the industry of even a few years ago, and […]
Data Virtualization Defined: How it Helps Organizations Succeed
Data Virtualization (DV) is unlike traditional Data Integration, where change must be made on multiple layers; Data Virtualization makes change easy for the business, as new requirements and sources can be integrated and changed rapidly. The Data Management Association International (DAMA) Data Management Body of Knowledge (DMBOK), second edition, describes Data Virtualization as: “Data Virtualization […]