Monitoring Production-grade Agentic RAG Pipelines

About

It's time to say goodbye to naive, vanilla RAG systems, where we could simply plug clean sample data into an LLM and query it. Moving from proof of concept to production means tuning several parameters, with performance being key to getting better results. In particular, search and retrieval systems need proper data preprocessing before documents are ingested into vector databases. Let us walk through a few of the building blocks of such an advanced RAG pipeline, one that can be deployed and scaled in real time.
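As a minimal sketch of that preprocessing-and-ingestion step, the flow below chunks raw text, embeds the chunks, and serves similarity queries. It assumes the sentence-transformers library; chunk_text, knowledge_base.txt, the parameter values, and the plain in-memory index (standing in for a real vector database) are illustrative choices, not fixed recommendations.

```python
# Sketch: preprocess raw text into overlapping chunks, embed them, and
# index the embeddings so a retriever can answer similarity queries.
# The in-memory matrix stands in for a production vector database.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks; overlap preserves cross-chunk context."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

model = SentenceTransformer("all-MiniLM-L6-v2")   # any embedding model works here

corpus = chunk_text(open("knowledge_base.txt").read())   # placeholder source file
index = model.encode(corpus, normalize_embeddings=True)  # (n_chunks, dim) matrix

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are closest to the query."""
    q = model.encode([query], normalize_embeddings=True)
    scores = (index @ q.T).ravel()  # cosine similarity, since vectors are normalized
    return [corpus[int(i)] for i in np.argsort(scores)[::-1][:k]]
```

Swapping the matrix for a managed vector store changes only the indexing and lookup lines; the chunking and embedding logic stays the same.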

Implementing robust, performant RAG systems is the industry's next big goal. Handling multiple operations while keeping latency low can be challenging, and AI agents have proven handy for automating routing tasks of this kind. Observability tools are the next step toward scalability, enabling LLM debugging at each step of such workflows: the stack trace supports app session handling and a deep dive into the inference flow and its outcomes.
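A minimal sketch of that step-level tracing follows: each pipeline stage is wrapped so every call records its name, latency, and status into a per-session trace. The traced decorator and the in-memory TRACE list are illustrative stand-ins for an actual observability backend such as OpenTelemetry or LangSmith, and the two pipeline steps are placeholders.

```python
# Sketch: wrap each pipeline stage so every call records a span (step name,
# latency, status) tied to a session ID. Real deployments would export these
# spans to an observability backend; the in-memory list is a stand-in.
import functools
import time
import uuid

TRACE: list[dict] = []          # one entry per executed pipeline step
SESSION_ID = str(uuid.uuid4())  # ties all spans of one request together

def traced(step_name: str):
    """Decorator that records a span for each call to the wrapped step."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            except Exception:
                status = "error"
                raise
            finally:
                TRACE.append({
                    "session": SESSION_ID,
                    "step": step_name,
                    "latency_s": round(time.perf_counter() - start, 4),
                    "status": status,
                })
        return inner
    return wrap

@traced("retrieve")
def retrieve_docs(query: str) -> list[str]:
    return ["doc snippet"]   # placeholder for a vector-store lookup

@traced("generate")
def generate_answer(query: str, docs: list[str]) -> str:
    return "answer"          # placeholder for the LLM call

generate_answer("What changed?", retrieve_docs("What changed?"))
print(TRACE)  # one span per step, in execution order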

This makes it possible to implement LLMs over large knowledge bases and augment them in distinctive ways. Managing and controlling your data lets you make informed decisions for business use cases across BFSI, legal, healthcare, and other similar domains.

Key Takeaways:

  • LLM observability, evaluation, and monitoring tools for stack traces, debugging, and more.
  • Hands-on implementation of an end-to-end agentic RAG pipeline.
  • LLM inference optimization techniques such as reducing cost and API calls, caching, and chaining (a caching sketch follows this list).
  • Efficient ways of pairing data and language frameworks with vector data stores in RAG systems.
  • A brief on grounding and hallucination as they relate to LLMs and generative AI.
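
To make the caching takeaway concrete, here is a minimal sketch of response caching that serves repeated prompts locally instead of re-billing the provider. call_llm and cached_completion are hypothetical names, and the plain dict stands in for a persistent cache such as Redis.

```python
# Sketch: serve identical prompts from a local cache so repeated requests
# skip the LLM API entirely, cutting both cost and latency.
import hashlib

_cache: dict[str, str] = {}  # stand-in for a persistent/shared cache

def call_llm(prompt: str) -> str:
    return f"model output for: {prompt}"  # placeholder for a real API request

def cached_completion(prompt: str) -> str:
    """Return a cached completion when the exact prompt was seen before."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:          # cache miss: pay for one API call
        _cache[key] = call_llm(prompt)
    return _cache[key]             # cache hit: zero marginal cost

cached_completion("Summarize the Q3 report.")  # first call hits the API
cached_completion("Summarize the Q3 report.")  # second call is served locally
```

Exact-match caching like this only helps with repeated prompts; semantic caching (keying on embedding similarity rather than a hash) is the usual next step.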
