From Theory to Practice: Training LLMs, Reinforcement Learning, and AI

23 August 2025 | 9:30 AM - 5:30 PM

About the workshop

In this hands-on workshop, participants will explore the cutting-edge world of Large Language Models (LLMs), Reinforcement Learning (RL), and building autonomous AI agents. By combining theory with coding examples, the session is designed to bridge the gap between theoretical concepts and real-world applications. By the end of the workshop, participants will have a solid understanding of how to build, train, and fine-tune an LLM for specific applications, as well as how to increase its utility with RAG and AI agents.

*Note: These are tentative details and are subject to change.

Instructor

Modules

In this module, we will review the structure and concepts behind Large Language Models (LLMs). Specifically, we'll focus on Decoder-Only Transformers, which serve as the backbone for generative AI models like ChatGPT and DeepSeek. We'll then review the mathematics required for LLMs and finish by coding a Decoder-Only Transformer from scratch in PyTorch.
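To give a feel for the from-scratch coding, here is a minimal sketch of causal (masked) self-attention, the core operation inside a decoder-only transformer. The class name, single-head setup, and dimensions are illustrative choices for this page, not the workshop's actual code.

```python
# A minimal sketch of masked (causal) self-attention, the building block of a
# decoder-only transformer. Names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedSelfAttention(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        # Linear maps that produce queries, keys, and values from token embeddings.
        self.W_q = nn.Linear(d_model, d_model, bias=False)
        self.W_k = nn.Linear(d_model, d_model, bias=False)
        self.W_v = nn.Linear(d_model, d_model, bias=False)
        self.d_model = d_model

    def forward(self, x):
        # x: (seq_len, d_model) token embeddings (plus position encodings).
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        # Scaled dot-product attention scores.
        scores = q @ k.transpose(-2, -1) / self.d_model ** 0.5
        # Causal mask: each token may only attend to itself and earlier tokens.
        mask = torch.tril(torch.ones(x.size(0), x.size(0), dtype=torch.bool))
        scores = scores.masked_fill(~mask, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

tokens = torch.randn(5, 64)                  # 5 tokens, embedding size 64
print(MaskedSelfAttention()(tokens).shape)   # torch.Size([5, 64])
```

Stacking layers like this, with feed-forward blocks and a final projection to the vocabulary, gives the text-generation backbone discussed in the module.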

In this module, we will learn the essential concepts of Reinforcement Learning (RL), including environments, rewards, and policies. We'll then code an example of RL that can make optimal decisions in an environment with unknown outcomes.
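As a taste of those concepts, the following minimal sketch shows an agent learning action values from rewards in an environment whose payout probabilities it does not know, using an epsilon-greedy policy. The three-action setup and all numbers are illustrative.

```python
# A minimal sketch of learning from rewards under unknown outcomes:
# an epsilon-greedy agent estimates the value of three actions whose
# payout probabilities it cannot see.
import random

true_reward_prob = [0.2, 0.5, 0.8]   # hidden from the agent (the "environment")
estimates = [0.0, 0.0, 0.0]          # the agent's value estimates
counts = [0, 0, 0]
epsilon = 0.1                        # exploration rate

for step in range(1000):
    # Explore occasionally, otherwise exploit the best estimate (the policy).
    if random.random() < epsilon:
        action = random.randrange(3)
    else:
        action = max(range(3), key=lambda a: estimates[a])
    # The environment returns a reward with an unknown probability.
    reward = 1.0 if random.random() < true_reward_prob[action] else 0.0
    counts[action] += 1
    # Incremental average update of the value estimate.
    estimates[action] += (reward - estimates[action]) / counts[action]

print([round(e, 2) for e in estimates])  # should approach [0.2, 0.5, 0.8]
```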

Neural networks trained with RL have become masters at playing games and can even drive cars. In this module, we will learn the details of how RL is applied to neural networks. Specifically, we'll learn the Policy Gradients algorithm for training a neural network with limited training data. We'll then code a neural network in PyTorch that is trained with Policy Gradients.
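The sketch below illustrates the Policy Gradients (REINFORCE) idea in PyTorch on a deliberately tiny problem: a small policy network learns to prefer the action with the higher expected reward by pushing up the log-probability of rewarded actions. The toy environment, network size, and learning rate are assumptions made purely for illustration.

```python
# A minimal sketch of Policy Gradients (REINFORCE) in PyTorch.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=0.05)

def environment(action):
    # Unknown to the agent: action 1 pays off more often than action 0.
    probs = [0.3, 0.8]
    return 1.0 if torch.rand(1).item() < probs[action] else 0.0

for episode in range(300):
    state = torch.ones(1)                     # dummy single-state environment
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    reward = environment(action.item())
    # REINFORCE: increase the log-probability of actions in proportion to reward.
    loss = -dist.log_prob(action) * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The probability mass should have shifted toward the better action.
print(torch.softmax(policy(torch.ones(1)), dim=-1))
```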

In this module, we will learn how Reinforcement Learning can cost-effectively fine-tune an LLM. Specifically, we'll learn how a relatively small amount of human feedback can allow LLMs to train themselves to generate useful and helpful responses to prompts. We'll then use Reinforcement Learning with Human Feedback (RLHF) to fine-tune the decoder-only transformer that we coded in the first module.
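A key ingredient in that recipe is a reward model fitted to pairwise human preferences, so a small amount of feedback can score an unlimited number of model responses. The following minimal sketch shows that step with a Bradley-Terry style loss; the random "response embeddings" and the linear reward model are placeholders for what an actual LLM-based setup would use.

```python
# A minimal sketch of the first step of RLHF: fit a reward model to pairwise
# human preferences so it can later replace per-response human labels during
# RL fine-tuning. All data here is a toy stand-in.
import torch
import torch.nn as nn
import torch.nn.functional as F

preferred = torch.randn(16, 32)   # embeddings of responses humans preferred (toy)
rejected = torch.randn(16, 32)    # embeddings of the responses they rejected (toy)

reward_model = nn.Linear(32, 1)   # maps a response embedding to a scalar reward
optimizer = torch.optim.Adam(reward_model.parameters(), lr=0.01)

for step in range(200):
    r_pref = reward_model(preferred)
    r_rej = reward_model(rejected)
    # Maximise the log-probability that the preferred response scores higher.
    loss = -F.logsigmoid(r_pref - r_rej).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained reward model then provides the reward signal for RL fine-tuning.
```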

This module dives into advanced alignment techniques for refining LLM behavior. We'll implement Proximal Policy Optimization (PPO) to stabilize RL fine-tuning, explore Direct Preference Optimization (DPO), a non-RL method that uses a KL-divergence constraint to keep outputs close to a reference model, and analyze how systems like DeepSeek use algorithms such as GRPO (Group Relative Policy Optimization) for efficiency.
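Of these methods, the DPO loss is the most compact to show in isolation. Below is a minimal sketch of it computed from toy per-response log-probabilities; in a real setup these would be summed token log-probabilities from the policy and a frozen reference model, and the beta value is only an example.

```python
# A minimal sketch of the Direct Preference Optimization (DPO) loss.
import torch
import torch.nn.functional as F

def dpo_loss(logp_pref, logp_rej, ref_logp_pref, ref_logp_rej, beta=0.1):
    # Implicit rewards are log-probability ratios against the reference model;
    # beta controls how far the policy may drift from the reference (KL control).
    pref_ratio = logp_pref - ref_logp_pref
    rej_ratio = logp_rej - ref_logp_rej
    return -F.logsigmoid(beta * (pref_ratio - rej_ratio)).mean()

# Toy log-probabilities for a batch of 4 preference pairs (illustrative values).
logp_pref = torch.tensor([-5.0, -6.0, -4.5, -7.0], requires_grad=True)
logp_rej = torch.tensor([-5.5, -5.8, -6.0, -6.5], requires_grad=True)
ref_logp_pref = torch.tensor([-5.2, -6.1, -5.0, -7.2])
ref_logp_rej = torch.tensor([-5.4, -5.9, -5.8, -6.4])

loss = dpo_loss(logp_pref, logp_rej, ref_logp_pref, ref_logp_rej)
loss.backward()   # gradients nudge the policy toward the preferred responses
print(loss.item())
```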

In this module, we will learn how to enhance LLMs with external knowledge using Retrieval-Augmented Generation (RAG). We'll implement semantic search with transformer-based embeddings, use a vector database for efficient k-nearest-neighbor (KNN) search, and refine the results with a reranking step. Finally, we'll build a RAG pipeline that combines retrieval from the database with generation of a response.
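The retrieval half of that pipeline can be sketched in a few lines, assuming the sentence-transformers library and an example embedding model name; a production system would store the vectors in a vector database and rerank the retrieved passages before handing them to the LLM.

```python
# A minimal sketch of the retrieval step in RAG: embed documents and a query
# with a transformer-based embedding model, then do nearest-neighbour search
# by cosine similarity. Model name and documents are illustrative.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")   # example embedding model

documents = [
    "Policy gradients train a network directly from rewards.",
    "Retrieval-Augmented Generation grounds an LLM in external documents.",
    "Decoder-only transformers generate text one token at a time.",
]
doc_vecs = model.encode(documents, normalize_embeddings=True)

query = "How can an LLM use outside knowledge?"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# Cosine similarity reduces to a dot product on normalized vectors.
scores = doc_vecs @ query_vec
top_k = np.argsort(-scores)[:2]                    # k nearest neighbours
context = "\n".join(documents[i] for i in top_k)
print(context)  # retrieved passages that would be prepended to the LLM prompt
```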

In this module, we will learn how Large Language Models can act as autonomous agents that plan, reason, and interact with tools. We’ll study architectures for breaking tasks into reasoning steps and explore how agents use memory (e.g., vector databases) and tools (APIs, code execution) to solve complex problems. We’ll then design a basic agent that chains LLM decisions into goal-driven workflows and analyze how it balances exploration vs. exploitation.
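The sketch below shows the bare shape of such an agent loop: a decision step (here a placeholder function standing in for an LLM call) picks a tool, the tool's output is written to memory, and the loop repeats until the agent decides it can answer. The tool names and decision logic are illustrative and not tied to any particular agent framework.

```python
# A minimal sketch of an agent loop that chains decisions into a goal-driven workflow.
def calculator(expression: str) -> str:
    return str(eval(expression))                    # toy tool: evaluate arithmetic

def search(query: str) -> str:
    return "The workshop runs on 23 August 2025."   # toy tool: canned lookup

TOOLS = {"calculator": calculator, "search": search}

def llm_decide(goal: str, memory: list) -> tuple:
    # Placeholder for a real LLM call that would return the next action,
    # either ("tool_name", tool_input) or ("finish", answer), given goal + memory.
    if not memory:
        return "search", goal
    return "finish", memory[-1]

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory = []                             # simple scratchpad; could be a vector DB
    for _ in range(max_steps):
        action, arg = llm_decide(goal, memory)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)    # execute the chosen tool
        memory.append(observation)          # feed the result into the next decision
    return "No answer within step budget."

print(run_agent("When does the workshop run?"))
```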

Prerequisites

Basic Python programming skills

