According to a new press release, Starburst Data has added Python DataFrame support to its data analytics platform by launching PyStarburst and integrating the Ibis library. The work, done in collaboration with Voltron Data, lets developers and data engineers run complex data transformation tasks and build data applications within the Starburst Galaxy ecosystem.
With PyStarburst, data professionals no longer need to offload their data to other frameworks such as PySpark or Snowpark to handle intricate data transformation workloads. Instead, they can use the Massively Parallel Processing (MPP) engine within Starburst for both analytical and transformation tasks, which reduces operational costs and simplifies the data stack. PyStarburst offers a syntax similar to PySpark and Snowpark for creating and executing production-grade ETL pipelines and data transformations, with the stated aim of improving productivity and performance.
The Ibis integration extends Starburst further by providing a consistent Python API that can execute queries across various engines, including DuckDB, pandas, PostgreSQL, and Starburst Galaxy. Code developed on a laptop can therefore move to production in Starburst Galaxy without being rewritten. The move underscores Starburst’s commitment to openness and interoperability: users can write portable Python code for high-performance data lake analytics, work with data from over 50 different sources, and build analytic expressions across multiple data sources with reusable scripts.
Read more at PR Newswire.