Snowflake Adds Python Support With Winter Release


In a nod to the growing importance of data science and AI development on its platform, Snowflake today announced that its upcoming winter release will support running code written in Python, which is the world’s most popular language and also the number one language for developing machine learning models.

Python support is in private preview and is being added to Snowpark, Snowflake’s framework for automating compute workflows for data analytics, data science, and data engineering use cases. Snowflake launched Snowpark a year ago with support for Java and Scala, giving users a Spark-like ability to build workflows with DataFrames. Now it is adding support for Python DataFrames due to high demand.

“We heard it loud and clear,” said Torsten Grabs, director of product management for Snowflake, of the call for Python. “Python is the language of choice for many data scientists and many data engineers.”

Whether it’s PyTorch or scikit-learn, most popular Python machine learning frameworks will now be supported on Snowflake, the $118 billion cloud data warehousing company that is giving AWS, Azure, and Google Cloud a run for their data warehousing money.

“What’s exciting is that it’s basically bringing the entire Python ecosystem to Snowflake, all the libraries and all the packages that the Python community has built,” Grabs said. “We welcome the entire Python community to the Snowflake data platform.”

Python’s popularity has grown steadily for years, and it recently knocked C out of the number one spot in the TIOBE index. While the data science community certainly accounts for much of Python’s popularity, its use is also increasing among data engineers. That’s fine with Snowflake, which recorded $592 million in revenue in fiscal 2021 and went headless earlier this year.

“Data scientists [and] advanced analytics are key audiences for us,” says Grabs. “But we’re also seeing Python become more and more popular with data engineers. It is also very powerful for creating scripts for data pipelines, for example.”

Users can interact with Snowpark through a number of IDEs and notebooks. For Python, this includes Visual Studio Code and PyCharm, in addition to Jupyter notebooks. For Java and Scala, Snowflake supports the IntelliJ and Eclipse development environments, Grabs explains.

Snowflake’s Python environment comes through Anaconda, which manages open source tool packages often used in data science and analytics environments. Snowflake relies on Anaconda’s package manager, called Conda, to help keep Python environments up to date and well behaved from a dependency perspective, Grabs says.

“Some of the parts that were really important to us were making sure we provide a well-managed environment where you avoid some of the issues that make Python difficult to use,” he says. “That’s why we’ve partnered with Anaconda to make it easier to manage packages and dependencies.”

Snowpark supports Python 3.8, with support for other language versions expected over time. The company is adopting a DataFrame API for Python, similar to how Spark works. Developers can define a Python DataFrame, point that DataFrame at a table in the Snowflake warehouse, and get the results back.
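To make the DataFrame pattern concrete, here is a minimal sketch in plain Python. It does not use Snowpark’s actual API: the `LazyFrame` class, the `calls` table, and its columns are hypothetical, and an in-memory SQLite database stands in for the warehouse. It only illustrates the general idea of a DataFrame that points at a table, records operations, and sends a query to the database when results are requested.

```python
import sqlite3
from typing import List, Tuple


class LazyFrame:
    """Toy DataFrame (hypothetical, not Snowpark): records operations
    and only queries the database when collect() is called."""

    def __init__(self, conn: sqlite3.Connection, table: str,
                 columns: Tuple[str, ...] = ("*",),
                 predicates: Tuple[str, ...] = ()):
        self.conn = conn
        self.table = table
        self.columns = columns
        self.predicates = predicates

    def select(self, *cols: str) -> "LazyFrame":
        # Return a new frame; nothing is executed yet.
        return LazyFrame(self.conn, self.table, cols, self.predicates)

    def filter(self, predicate: str) -> "LazyFrame":
        # Predicates are trusted strings here; a real API would use
        # column expressions rather than raw SQL text.
        return LazyFrame(self.conn, self.table, self.columns,
                         self.predicates + (predicate,))

    def to_sql(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql

    def collect(self) -> List[tuple]:
        # The work happens inside the database, not in the client.
        return self.conn.execute(self.to_sql()).fetchall()


# SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_id INTEGER, agent TEXT, duration_sec INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?)",
                 [(1, "a", 120), (2, "b", 450), (3, "a", 600)])

frame = (LazyFrame(conn, "calls")
         .filter("duration_sec > 300")
         .select("call_id", "agent"))
rows = frame.collect()
print(frame.to_sql())
print(rows)
```

The point of the design is that the client describes the computation while the data stays in the database, which is the Spark-like behavior the article describes.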

Snowflake also supports the ability to save the results of a machine learning training run as a user-defined function (UDF), which can be stored in the Snowflake warehouse and called via SQL. It’s an integral part of Snowflake’s plan to help its customers with analytics as well as machine learning use cases.
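The idea of wrapping a trained model in a function that SQL can call can be sketched as follows. This is an illustration under stated assumptions, not Snowflake’s UDF mechanism: SQLite’s `create_function` stands in for UDF registration, the “model” is a simple least-squares line fitted in plain Python, and the `inputs` table and `predict` function name are hypothetical.

```python
import sqlite3


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# "Training": fit the model on a tiny made-up data set.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])


def predict(x):
    # The trained parameters are captured by this function.
    return slope * x + intercept


# Register the function with the database so SQL can call it.
# SQLite is only a stand-in for the warehouse here.
conn = sqlite3.connect(":memory:")
conn.create_function("predict", 1, predict)

conn.execute("CREATE TABLE inputs (x REAL)")
conn.executemany("INSERT INTO inputs VALUES (?)", [(5.0,), (10.0,)])

predictions = conn.execute(
    "SELECT x, predict(x) FROM inputs ORDER BY x"
).fetchall()
print(predictions)
```

Once the function is registered, anyone who can write SQL can score rows with the model without touching the Python code that trained it.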

“It’s all running on the same compute infrastructure, so we’re not adding a separate product just for Python,” says Grabs. “We are integrating Python into the existing runtime environment and compute infrastructure, so that the scalability and performance benefits apply to your Python workload as much as they would to a workload based on SQL or Java. And that then gives you the flexibility to mix, match, and compose across those language boundaries, depending on the user’s preferences.”

On a data cloud like Snowflake’s, the lines between what is a data analytics workload and what is a data science workload sort of disappear.

“The boundaries between these silos that we had in the past, let’s say between the data science profession, the data engineering profession, and then the analytics profession, we see these silos becoming less and less relevant over time,” says Grabs. “So we expect these boundaries to disappear. And there are huge benefits to that as well. With the data cloud, you want to have access to all kinds of data and not limit access to a particular silo… that the data is relevant for different departments, different functions.”

Snowflake is holding its “Snowday” event today

The winter release of Snowpark brings other goodies to Snowflake customers around the world, including a new logging framework, support for unstructured file processing, and support for stored procedures. These capabilities are initially available for Scala and Java, with Python support coming soon.

Support for stored procedures will give customers the ability to run control flow or driver logic on Snowflake compute rather than on a separate virtual machine, Grabs says, while the new logging feature will give customers the ability to log from their custom code.

Support for unstructured files will open the door to new types of ML analysis and use cases in Snowpark, such as the ability to process audio files of call center interactions, Grabs says. “There is a lot of potential to leverage data science and machine learning, but there are also significant workloads that run on structured and semi-structured [data], so it’s not limited to just unstructured data,” he says.

Snowflake executives Benoit Dageville, co-founder and president of product, and Christian Kleinerman, senior vice president of product, will discuss these new features at the company’s virtual Snowday event today. You can register for the event on the company website.
