You may have heard of data science’s 80/20 rule: 80% of a data scientist’s time is spent dealing with messy data, and only 20% is spent performing analysis. Until recently, however, a critical aspect was overlooked: the operationalization and deployment of data science, and specifically of machine learning pipelines. Whether at a startup or an enterprise, it’s common to hear of ML projects getting stuck in the proof-of-concept phase.
This hack session will walk participants through deploying machine learning models locally and on a cloud platform. To do this, we’ll borrow relevant principles from the software engineering and DataOps disciplines. We’ll cover concepts including the need for CI/CD pipelines for ML; retraining; versioning of code, models, and data; containerization; inference APIs; and monitoring. In particular, we’ll focus on doing all of this at scale.
Key Takeaways:
- The need for, and basic principles of, MLOps
- The basic building blocks of production ML pipelines
- How to use cloud platforms to handle scale
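To give a flavor of one building block covered in the session, here is a minimal inference-API sketch using only the Python standard library. The "model" is a stand-in linear scorer with hypothetical coefficients; in a real pipeline you would load a serialized model artifact at startup and run this behind a production server rather than `http.server`.

```python
# Minimal inference API sketch (stdlib only).
# WEIGHTS/BIAS are hypothetical stand-ins for a real trained model,
# which would normally be deserialized from a versioned artifact.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

WEIGHTS = [0.4, -1.2, 3.0]  # hypothetical linear-model coefficients
BIAS = 0.5

def predict(features):
    """Score one feature vector with the stand-in linear model."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Accept JSON like {"features": [1.0, 2.0, 3.0]} at /predict.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Start the inference server (blocks until interrupted)."""
    HTTPServer(("", port), InferenceHandler).serve_forever()

# serve()  # uncomment to run the server locally
```

In the session itself, this kind of endpoint would be containerized and deployed behind a cloud load balancer, with the model artifact versioned separately from the serving code.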