PyData NYC 2022

pandas at a Crossroads, the Past, Present, and Future
11-10, 11:00–11:45 (America/New_York), Central Park West (6th floor)

The pandas library for data manipulation and analysis is the most widely used open source data science library in the world. While we don’t have exact numbers, we expect it is used by tens of millions of data scientists every day, in every Fortune 1000 company and in the data science curriculum of every university.


pandas appears to be a healthy, well-maintained library that will continue to serve the data science community for many years to come. However, the strategy for pandas is at a crossroads, and the decisions we make about its future will have influence far beyond Two Sigma.

We’ll learn about the history of pandas, from its inception as a data analysis library to its explosive growth as an open source project. We’ll discuss how the growth of big data and big compute challenges pandas’ capabilities and threatens its continued dominance in the market. Finally, we’ll talk about the crossroads: which paths we are considering at Two Sigma, and how others can get involved.


Prior Knowledge Expected

No previous knowledge expected

As a former quant, Jeff Reback has extensive experience building financial trading systems, using Python, and working with very large data. He has been a core committer to the pandas project since 2011 and has managed the project since 2013. Jeff is a Managing Director at Two Sigma, overseeing the research environment. Jeff holds a B.S. in Computer Science from the Massachusetts Institute of Technology.