PyData NYC 2022

Distributed Python with Ray: Hands on with the Ray 2.0 APIs for scaling Python Workloads
11-11, 13:30–15:00 (America/New_York), Central Park East (6th floor)

This introductory, hands-on tutorial takes a guided coding tour through the core features of Ray 2.0, which provides powerful yet easy-to-use design patterns for implementing distributed systems in Python. It includes a brief talk that gives an overview of the concepts, explains why to use Ray for distributing Python and machine learning workloads, and briefly discusses Ray AIR.


This tutorial is an introduction to Ray (https://www.ray.io/), a system for scaling your Python and machine learning applications from a laptop to a cluster. We'll start with a hands-on exploration of the Ray 2.0 Core API and its basic patterns for distributed workloads, and then move on to a quick introduction to Ray AIR. Topics covered:

  • Remote Python functions as tasks
  • Remote objects as futures
  • Remote Python classes as stateful actors
  • Quick introduction to Ray AIR

Key takeaways:

  • Understand what Ray 2.0 is and why to use it
  • Learn the basic Ray Core Python APIs
  • Use Ray APIs to convert Python functions and classes into distributed stateless tasks and stateful actors
  • Use the Ray Dashboard for inspection
  • Learn about Ray AIR for building end-to-end ML applications

To follow this tutorial in class, follow the instructions on how to set up your laptop:
https://github.com/dmatrix/ray-core-tutorial


Prior Knowledge Expected

Previous knowledge expected

Richard Liaw is an engineering manager at Anyscale, where he leads a team in building open source libraries on top of Ray. He is on leave from the PhD program at UC Berkeley, where he worked at the RISELab advised by Ion Stoica, Joseph Gonzalez, and Ken Goldberg. In his time in the PhD program, he was part of the Ray team, building scalable ML libraries on top of Ray.

Jules S. Damji is a lead developer advocate at Anyscale and an MLflow contributor. He is a hands-on developer with over 20 years of experience and has worked at leading companies such as Sun Microsystems, Netscape, @Home, Opsware/Loudcloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems. He holds a BSc and MSc in computer science (from Oregon State University and Cal State, Chico, respectively), and an MA in political advocacy and communication (from Johns Hopkins University).