Multimodal GenAI in Action: Bridging Text, Vision, and Beyond
28 Mar 2025, 2:03 PM - 3:03 PM
About the Event
Join this insightful session to explore the exciting world of Multimodal Generative AI and its real-world impact. Discover how these powerful systems combine text, images, audio, and video to deliver richer, more human-like interactions. We’ll break down core architectures, explore alignment techniques, and showcase practical applications. Dive into two innovative systems: LLaVA, a vision-language assistant for visual Q&A, and AI Guide Dog (AIGD), which helps visually impaired users navigate in real time. Whether you're an AI enthusiast or a tech professional, this session will equip you with actionable insights into the future of multimodal AI.
Key Takeaways:
- Understand how multimodal Generative AI integrates text, images, audio, and video for richer interactions.
- Explore the core architectures and techniques that align multiple data modalities effectively.
- Discover real-world applications of multimodal GenAI in healthcare, entertainment, and navigation.
- Gain insights into systems like LLaVA and AI Guide Dog, showcasing practical multimodal AI implementations.
Who is this DataHour for?
- AI enthusiasts curious about how multimodal Generative AI combines text, images, audio, and video
- Tech professionals looking to apply multimodal systems such as LLaVA and AI Guide Dog in practice
About the Speaker
Aishwarya Jadhav is a Machine Learning Engineer at Waymo (Google), specializing in perception systems for autonomous robo-taxis. Previously, she worked on Tesla's Autopilot team, contributing to AI models for Full Self-Driving and the Optimus robot. With expertise in computer vision and large-scale ML, Aishwarya has also worked at Google Ads and Morgan Stanley. She holds a Master’s in Computational Data Science from Carnegie Mellon University, where she led the AI Guide Dog project, developing real-time navigation systems for the visually impaired. You can reach her on LinkedIn.
Become a Speaker
Share your vision, inspire change, and leave a mark on the industry. We're calling for innovators and thought leaders to speak at our event.
- Professional Exposure
- Networking Opportunities
- Thought Leadership
- Knowledge Exchange
- Leading-Edge Insights
- Community Contribution
