Moussa Taifi
Moussa Taifi is currently a Senior Data Science Platform Engineer II at Xandr-Microsoft.
He holds a PhD in Computer and Information Science from Temple University. He is a machine learning and big data systems engineer, focused on data science productivity, reliability, performance and cost. He is interested in designing and implementing large-scale AI products, through data collection, analysis and warehousing.

Sessions
Machine Learning (ML) systems don’t exist until they are deployed. Unfortunately, prediction latency is one of those sharp edges that hurt badly, and it hurts too late in the product cycle. Stop optimizing that offline TensorFlow/Scikit-learn/PyTorch model performance! Focus on ML serving latency first, because that’s what the client sees first! So, what are some common ways to reduce ML latency?
This presentation will introduce the audience to the most useful patterns for deploying low-latency ML serving systems.