With the advent of Deep Learning and Machine Learning libraries, model prototyping has become very convenient. Consequently, the time taken to build a fairly accurate model has dropped drastically. But the business impact of a model can be observed only after the model is deployed, and thus exposed to real-world data.
Model deployment is a key component of the Machine Learning pipeline. Deploying a model poses several challenges, such as model versioning and containerization. Web frameworks like Flask and Django can be used to wrap the model in a REST API and expose it, but this approach requires developers to write and maintain the code that handles requests to the model and supports other deployment-related features.
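As a rough illustration of that manual approach, a hand-rolled prediction endpoint might look like the sketch below; the model path, input format, and endpoint name are assumptions for illustration, not details from the talk.

```python
# Minimal sketch of wrapping a model in a REST API by hand with Flask.
# The model path and JSON input format are illustrative assumptions.
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request

app = Flask(__name__)
model = tf.keras.models.load_model("my_model")  # hypothetical saved model

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a body like {"instances": [[1.0, 2.0, 3.0, 4.0]]}
    instances = np.array(request.get_json()["instances"])
    predictions = model.predict(instances)
    return jsonify({"predictions": predictions.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Everything beyond this one endpoint (versioning, batching, monitoring, reloading new models) is left for the developer to build and maintain, which is exactly the burden TensorFlow Serving is meant to remove.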
To tackle this problem, TensorFlow introduced TensorFlow Serving, a flexible, high-performance serving system for machine learning models designed for production environments. The goal of this talk is to give a brief introduction to TensorFlow Serving and illustrate its features using an example use case.
Key Takeaways for the Audience
- How to deploy models using TensorFlow Serving (see the export and serving sketch after this list)
- How to interact with the served model using gRPC or REST (see the client sketches below)
- How to deploy different versions of a model using TensorFlow Serving (see the versioning sketch below)
- Advanced configuration in TensorFlow Serving (see the model config sketch below)
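The sketches below are illustrative only; the model name (my_model), the 4-feature input, the paths, and the ports are assumptions, not details from the session. First, deployment: TensorFlow Serving expects a model exported in the SavedModel format under a numbered version directory, which can then be served, for example, with the official Docker image.

```python
# Sketch: export a Keras model as a SavedModel under a numbered version directory,
# the layout TensorFlow Serving expects. Model architecture and paths are assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])

# Version "1" of the model lives under the base path /tmp/serving/my_model.
tf.saved_model.save(model, "/tmp/serving/my_model/1")

# The model can then be served, e.g. with the official Docker image:
#   docker run -p 8500:8500 -p 8501:8501 \
#     -v /tmp/serving/my_model:/models/my_model \
#     -e MODEL_NAME=my_model tensorflow/serving
# Port 8500 exposes the gRPC API and 8501 the REST API.
```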
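Once the server is up, the model can be queried over REST (port 8501) or gRPC (port 8500). Both client sketches below assume the model name, ports, and input shape from the previous sketch; the input tensor name in the gRPC request is also an assumption and depends on the model's serving signature.

```python
# Sketch: REST client for a model served by TensorFlow Serving.
import json
import requests

payload = {"instances": [[1.0, 2.0, 3.0, 4.0]]}
response = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    data=json.dumps(payload),
)
print(response.json())  # e.g. {"predictions": [[...]]}
```

```python
# Sketch: the same prediction over gRPC, using the tensorflow-serving-api package.
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

grpc_request = predict_pb2.PredictRequest()
grpc_request.model_spec.name = "my_model"
grpc_request.model_spec.signature_name = "serving_default"
# "dense_input" is a placeholder; the real tensor name comes from the model's signature.
grpc_request.inputs["dense_input"].CopyFrom(
    tf.make_tensor_proto([[1.0, 2.0, 3.0, 4.0]], dtype=tf.float32)
)

result = stub.Predict(grpc_request, 10.0)  # 10-second timeout
print(result.outputs)
```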
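Version management follows from the directory layout: TensorFlow Serving watches the model base path and, by default, loads the highest-numbered version it finds and unloads the previous one. A sketch, continuing the assumptions above:

```python
# Sketch: exporting a second version next to the first one.
# With the default version policy, TensorFlow Serving picks up version 2
# automatically and stops serving version 1.
import tensorflow as tf

new_model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])

tf.saved_model.save(new_model, "/tmp/serving/my_model/2")
```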
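Finally, advanced configuration: a model config file lets a single server host several models and control which versions of each stay loaded. The sketch below writes such a file from Python; the model names, paths, and version numbers are assumptions.

```python
# Sketch: a model_config_list file for serving multiple models and pinning versions.
MODEL_CONFIG = """
model_config_list {
  config {
    name: "my_model"
    base_path: "/models/my_model"
    model_platform: "tensorflow"
    model_version_policy { specific { versions: 1 versions: 2 } }
  }
  config {
    name: "another_model"
    base_path: "/models/another_model"
    model_platform: "tensorflow"
  }
}
"""

with open("/tmp/serving/models.config", "w") as f:
    f.write(MODEL_CONFIG)

# The server is then pointed at the file, e.g.:
#   tensorflow_model_server --model_config_file=/models/models.config
```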
Check out the video below to learn more about the session.