PyData NYC 2022

Troubleshooting your Data Workflows with Noteable + Dagster: A live debugging of failed jobs.
11-11, 13:30–15:00 (America/New_York), Music Box (5th floor)

Data engineers lose a lot of time troubleshooting long-running pipelines and know only too well the frustration of minor errors consuming hours of work. In this practical tutorial we will demonstrate an approach that shortens testing cycles and reduces the number of reruns required, which boosts practitioner productivity and reduces frustration on the team.

A productionalized notebook integrated with an orchestration platform balances reproducibility, flexibility, and clarity of intent in a form that others can quickly read and use. This tutorial is aimed at data scientists and data engineers. The setup makes it easy to take notebooks from exploration to production, and easier still to debug them and maintain quality over time. This tutorial will show how you can achieve:

  • Time savings when initiating jobs: Users can seamlessly turn an exploratory workflow created in a Noteable notebook into a productionalized, scheduled workflow in Dagster.
  • Time and cost savings when debugging failed runs: Users can dive straight into a live running notebook at the point of failure, with all of the in-memory state preserved. This saves users time and saves companies compute costs, because debugging does not require re-executing the earlier steps of the workflow.
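
The second point, debugging from the point of failure without re-running earlier steps, can be illustrated with a minimal plain-Python sketch. This is not the Noteable or Dagster API: the names `run_steps`, `StepFailed`, `load`, and `total` are hypothetical, and a shared dictionary stands in for the notebook's in-memory state.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical step type: a name plus a function that mutates shared state.
Step = Tuple[str, Callable[[Dict[str, Any]], None]]


class StepFailed(Exception):
    """Carries the failing step's index and the state built up before it."""

    def __init__(self, index: int, name: str, state: Dict[str, Any]):
        super().__init__(f"step {name!r} failed")
        self.index = index
        self.name = name
        self.state = state


def run_steps(steps: List[Step], state: Dict[str, Any], start: int = 0) -> Dict[str, Any]:
    """Run steps in order from `start`, mutating the shared state dict."""
    for index in range(start, len(steps)):
        name, func = steps[index]
        try:
            func(state)
        except Exception as exc:
            # Keep the state instead of discarding it with the failed run.
            raise StepFailed(index, name, state) from exc
    return state


calls = {"load": 0}  # counts how often the (expensive) load step runs


def load(state: Dict[str, Any]) -> None:
    calls["load"] += 1
    state["rows"] = [1, 2, "3"]  # one bad value slips in


def total(state: Dict[str, Any]) -> None:
    state["total"] = sum(state["rows"])  # raises TypeError on "3"


steps = [("load", load), ("total", total)]
try:
    result = run_steps(steps, {})
except StepFailed as failure:
    # Inspect and repair the preserved state, then resume at the failed step.
    failure.state["rows"] = [int(r) for r in failure.state["rows"]]
    result = run_steps(steps, failure.state, start=failure.index)
```

After the repair, only `total` is executed again; `load` ran once, which is the source of the compute savings described above. The sketch assumes a failed step leaves the state it inherited usable, which a real workflow would need to verify.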

Prior Knowledge Expected

Previous knowledge expected

Pierre Brunelle is the CEO and Co-Founder of Noteable, a collaborative data notebook that enables data-driven teams to use and visualize data together. Prior to Noteable, Pierre led Amazon’s notebook initiatives, both for internal use and for SageMaker. He also worked on many open source initiatives, including a standard for Data Quality work and an open source collaboration between Amazon and UC Berkeley to advance AI and machine learning. Pierre helped launch the first Amazon online car leasing store in Europe. At Amazon, Pierre also launched a Price Elasticity Service and pushed investments in Probabilistic Programming Frameworks. He represented Amazon on many occasions, whether teaching machine learning or at conferences such as NeurIPS. Pierre also writes about Time in Organization Studies. He holds an MS in Building Engineering from ESTP Paris and an MRes in Decision Sciences and Risk Management from Arts et Métiers ParisTech.

Jamie is a software engineer working on Dagster. She has also built data analysis tools (using Dagster!) for a robotics startup and developed software to train mission planners for the Mars Curiosity rover.