Working with increasingly complex datasets is becoming part of everyday life for a Machine Learning Engineer. Since the first YouTube-8M Kaggle competition, video classification has been gaining momentum. Learning a robust representation of a video is not an easy task, however, because a video is a complex mix of sequential data (like time series) and images (RGB tensors).
I will cover both unsupervised and supervised tasks using different types of Deep Learning algorithms (CNN, RNN, …). Finally, I will close with some insights on multi-modal video classification (combining images with textual information such as descriptions and titles).
Key Takeaways:
- Learn the key concepts of the different deep learning architectures (RNN, CNN, …)
- Transfer learning with state-of-the-art pretrained models (InceptionV3, …)
- Video encoding for video similarity and video classification
- Multi-modal classification
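
To make the video encoding takeaway more concrete, here is a minimal sketch of one simple way to turn a video into a single vector and compare two videos. It assumes that each frame has already been converted into a feature vector (for example by a pretrained CNN such as InceptionV3); the tiny 3-dimensional vectors below are made-up toy values, and the helper names `encode_video` and `cosine_similarity` are hypothetical, not taken from any library. Mean pooling is only one possible aggregation; an RNN over the frame features is another.

```python
import math


def encode_video(frame_features):
    """Mean-pool per-frame feature vectors into one fixed-size video vector.

    frame_features: non-empty list of equal-length lists of floats, one per
    frame. In practice these would come from a pretrained CNN (assumption).
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    n_frames = len(frame_features)
    dim = len(frame_features[0])
    return [sum(frame[i] for frame in frame_features) / n_frames
            for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy frame features (made-up values): videos A and B look alike, C does not.
video_a = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
video_b = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.0], [0.7, 0.3, 0.0]]
video_c = [[0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]

enc_a = encode_video(video_a)
enc_b = encode_video(video_b)
enc_c = encode_video(video_c)

sim_ab = cosine_similarity(enc_a, enc_b)
sim_ac = cosine_similarity(enc_a, enc_c)
print(f"similarity(A, B) = {sim_ab:.3f}")
print(f"similarity(A, C) = {sim_ac:.3f}")
```

Because every video is reduced to a vector of the same size regardless of its number of frames, the same encoding can be reused both for similarity search and as the input to a classifier.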