PyData NYC 2022

How we upstreamed our internal goals to JupyterLab 4
11-10, 14:15–15:00 (America/New_York), Central Park West (6th floor)

Two Sigma's financial scientists spend many of their most productive hours testing their hypotheses about the market in JupyterLab. As in any research environment, responsiveness and easy collaboration are key to keeping our users focused on their hypotheses, not on the tools they use.

In this talk I will discuss Two Sigma's partnership with QuantStack's team to deliver direct-to-upstream contributions that improve JupyterLab's performance and further its Real-Time Collaboration (RTC) initiative, all coming soon in JupyterLab 4.

Beyond just a teaser of new features to get excited for, this talk is a testimonial of a fruitful and successful relationship between two private entities, Two Sigma and QuantStack, which not only furthers each partner's own internal goals but also improves the quality of life of all Jupyter users.


Key takeaways

  • Get excited for JupyterLab 4, which will bring large performance improvements throughout the whole application.

  • JupyterLab 4 will pave the way for a Real-Time Collaboration mode in JupyterLab, similar to Google Docs. Come join the effort!

  • For small and medium-sized organizations, an external partnership with a team like QuantStack can help multiply their impact in the open-source community.

  • Making sure your organization runs on the latest version of Jupyter is key to shipping your contributions directly to your users.

  • JupyterLab's extensibility is a great way to deliver value to an organization's internal users. But it can also undermine your organization's ability to stay current with open-source.

Call to action

Two Sigma, QuantStack, and many other members of the Jupyter community are actively contributing to Jupyter's general performance and making Real-Time Collaboration a reality. Use the lessons you learn in this talk to get your organization involved, or get involved yourself!

Audience

This talk is aimed at (1) Jupyter users looking to hear about new and exciting features and (2) users of Jupyter at private entities who are interested in ways to get their organization involved in open-source development. You do not need any specific knowledge of how Jupyter works to get value from this talk. If you're interested in contributing to Jupyter, the talk will also give you an idea of the features that are being actively developed by the project's contributors and hopefully get you excited about contributing!

Outline

The talk will be structured as follows:

  • (4 minutes total) Introduction to:
    • (1 minute) The speaker
    • (2 minutes) Two Sigma
    • (1 minute) QuantStack

  • (8 minutes total) Contributions to performance
    • (2 minutes) The problems Two Sigma ran into with performance
    • (2 minutes) How we measure performance: JupyterLab's new UI performance benchmarking tool
    • (4 minutes) Improvements we've made, coming in JupyterLab 4:
      • Using "virtual rendering" to avoid rendering the whole notebook
      • Improving performance by migrating to CodeMirror 6
      • Optimizations to the Jupyter server protocol

  • (8 minutes total) Contributions to real-time collaboration
    • (2 minutes) What is Real-Time Collaboration?
    • (4 minutes) This is a large project. What have we contributed?
      • Collaboration UI elements
      • Data replication with Yjs (a framework for distributed types)
      • Remote user management
      • Undo/redo support
    • (2 minutes) What major features are left?
      • Other UI elements
      • Authentication

  • (9 minutes total) Getting the most out of your organization's open-source contributions
    • (3 minutes) How Two Sigma and QuantStack work together
    • (3 minutes) How and why Two Sigma keeps internal users up to date on Jupyter versions
    • (3 minutes) Beware of extensions: they can help you deliver internal value, but at the cost of agility

  • (2 minutes) Key takeaways, closing notes

  • (9 minutes) Q&A


Prior Knowledge Expected

No previous knowledge expected

I am an engineering manager at Two Sigma, where my team is in charge of maintaining the base tools of the PyData stack internally. Together, we build an ecosystem for our internal researchers and contribute back to open-source. If you're excited about the PyData stack, my team is hiring! Shoot me an email at [email protected] if you're interested.

I was born and raised in Monterrey, México, and moved to the US in 2012 to start college. In my free time I love to ride my bicycle, read fiction, and volunteer with organizations that work with the Hispanic community of New York.