Explore the future of AI with Dr. Vikas Agrawal, Senior Principal Data Scientist at Oracle Analytics Cloud. In this Leading with Data session, he shares insights on problem-solving in data science, MLOps, and the impact of generative AI on enterprise solutions. The discussion ranges from practical approaches to common pitfalls in data science projects, and offers essential advice for aspiring data scientists.
In my day-to-day work, I owe a lot to my mentors from various esteemed institutions and companies who instilled in me the philosophy that technology is a means to an end, not the end itself. The key is to spend a significant amount of time understanding the problem – about 90% of the effort goes there. The rest is about finding solutions, which often involves looking at how others have approached similar issues and what the customer ultimately needs. This approach has been fundamental in connecting technology with business impact.
Once we’ve identified a problem worth solving, we first ensure we have the data needed to address it. Then we assess whether the technology exists to solve the problem within a reasonable timeframe. If we see a path, even if it’s a couple of years out, we’ll proceed with a proof of concept (POC). This POC is comprehensive, covering everything from data pipelines to end-to-end functionality, although scalability at this stage is not the primary concern. The goal is to have a clear path to the algorithms, data sources, and nature of the output we’re aiming for.
After a successful POC, we enter the optimization phase, which is where the bulk of the work lies. This involves ensuring the model adapts to different business processes and geographies, and can correct itself when it goes out of distribution. It’s also about ensuring the model can be retrained efficiently and scales appropriately. This phase is critical because it’s where the model transitions from a concept to a practical, deployable solution.
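To make the idea of a model going "out of distribution" concrete, here is a minimal sketch of one common way to flag drift in a single numeric feature: the Population Stability Index (PSI). This is an illustration only, not the method described in the interview. The function names, the equal-width binning, and the rule-of-thumb threshold of 0.25 are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Return the fraction of values falling in each bin defined by interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        # The bin index is the number of interior edges the value exceeds.
        idx = sum(1 for e in edges if v > e)
        counts[idx] += 1
    eps = 1e-6  # assumed floor so that empty bins do not produce log(0)
    return [max(c / len(values), eps) for c in counts]


def psi(reference: Sequence[float], current: Sequence[float], n_bins: int = 10) -> float:
    """Population Stability Index of `current` relative to `reference`."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    # Equal-width interior edges taken from the reference (training-time) sample.
    edges = [lo + width * i for i in range(1, n_bins)]
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical data: a training-time sample and a shifted production sample.
training_sample = [i / 100 for i in range(100)]
production_sample = [x + 0.5 for x in training_sample]

DRIFT_THRESHOLD = 0.25  # assumed rule of thumb; tune per feature and use case
needs_retraining = psi(training_sample, production_sample) > DRIFT_THRESHOLD
```

In practice a check like this would run per feature on a schedule, and crossing the threshold would trigger a review or a retraining job rather than an automatic model swap.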
The most costly mistakes usually revolve around AI hype and miscommunication. It’s crucial to set clear and mutual expectations with the customer. Often, customers have high expectations due to the industry buzz around AI, not realizing that the state of the art may not always provide the correct answers they seek. Another pitfall is defining the problem incorrectly, either by not addressing the customer’s issue directly or by attempting to ‘boil the ocean.’
Generative AI is not yet widely used in most enterprises due to concerns about copyright and IP contamination. However, we do leverage commercially available open-source material. Generative AI has advanced significantly in areas like text summarization, text expansion, and explanation generation. Trustworthiness remains a challenge, and we’re exploring techniques to filter outputs from large language models (LLMs) to ensure they’re reliable for enterprise use.
Generative AI will likely have the most significant impact on workflows involving running text, such as information retrieval and user interfaces. For example, it can dramatically improve enterprise search by retrieving semantically similar pieces of text. It can also revolutionize natural language interfaces for databases, allowing users to ask questions in natural language and have them translated into accurate SQL queries.
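As a rough illustration of retrieving semantically similar text, the sketch below ranks documents by cosine similarity between embedding vectors. The three-dimensional vectors are hand-made stand-ins, not real embeddings. In a real system they would come from a learned embedding model, and the document titles, `INDEX`, and `top_k` are hypothetical names used only for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          index: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents whose vectors are most similar to the query vector."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-made "embeddings": an assumption for illustration only.
INDEX = {
    "expense policy": [0.9, 0.1, 0.0],
    "travel reimbursement": [0.8, 0.3, 0.1],
    "server maintenance": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of a question such as "How do I claim travel costs?"
query = [0.85, 0.2, 0.05]
results = top_k(query, INDEX, k=2)
```

The point of the design is that the query never has to share keywords with a document. Closeness in the vector space is what drives the ranking, which is why the quality of the embedding model matters more than the search code itself.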
It’s an exciting time to be in data science, but it’s crucial to have a strong foundation in mathematics and understand the algorithms you’re working with. As AI tools become more sophisticated, the ability to augment and improve them will be a valuable skill. Those who can create new algorithms or understand the intricacies of existing ones will be in high demand.
In this insightful session, Dr. Vikas Agrawal shared key insights for success in a data science career. From emphasizing problem comprehension to navigating pitfalls and embracing generative AI, the interview provides a roadmap. Aspiring data scientists are advised to build a robust foundation in mathematics and algorithms for a field in constant evolution.
Stay tuned to Leading with Data to catch up on the journeys of more pioneering AI and data science leaders in the industry. You can check out our upcoming Leading with Data sessions here!