In this talk, I will cover recent advances in attention-based deep learning models and walk through an end-to-end example of converting images in the wild into fine-grained textual descriptions.
Specifically, I will talk about parsing e-commerce product images to describe them accurately as structured text. Extracting structured text from images in this way helps e-commerce search engines index products and retrieve relevant results for a wide range of search queries.