With the advancements in Artificial Intelligence, developing and deploying large language model (LLM) applications has become increasingly complex and demanding. To address these challenges, let’s explore LangSmith, a cutting-edge DevOps platform designed for developing, collaborating on, testing, deploying, and monitoring LLM applications. This article explores how to debug and test LLMs in LangSmith.
LangSmith is a comprehensive platform that streamlines the entire lifecycle of LLM application development, from ideation to production. It is tailored to the unique requirements of working with LLMs, which are inherently massive and computationally intensive. When LLM applications are deployed to production or to specific use cases, they need a platform to evaluate their performance, improve their speed, and trace their operational metrics.
As the adoption of LLMs soars, the need for a dedicated platform to manage their complexities has become clear. These models require continuous monitoring, optimization, and collaboration to remain effective and reliable in the real world. LangSmith addresses these needs with features for productionizing LLM applications, ensuring seamless deployment, efficient monitoring, and collaborative development.
LangSmith offers a comprehensive suite of features for bringing LLMs into real-world production. Let’s explore these features:
LangChain, a popular framework for building applications with large language models, simplifies the prototyping of LLM applications and agents. However, transitioning these applications to production can be unexpectedly challenging. Iterating on prompts, chains, and other components is essential for creating a high-quality product, and LangSmith streamlines this process by offering dedicated tools and features.
LangSmith addresses the critical needs of developing, deploying, and maintaining high-quality LLM applications in a production environment. With LangSmith, you can debug and test prompts and chains, trace LLM calls, monitor latency, token usage, and costs, manage evaluation datasets, and collaborate with your team on prompt versions.
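For example, a LangChain application can send traces to LangSmith simply by setting the standard tracing environment variables. The sketch below assumes you have a LangSmith API key and an OpenAI key available; the model choice is illustrative, and the project name mirrors the demo project used later in this article.

```python
import os

# Standard LangSmith tracing variables; the key and project name below are
# placeholders -- substitute your own values.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "Test-1-Demo"

from langchain_openai import ChatOpenAI

# Requires OPENAI_API_KEY to be set for the model call itself.
# Any traced LangChain call now shows up as a run in the Test-1-Demo project.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)
response = llm.invoke("Suggest three career options for a student who enjoys fine arts.")
print(response.content)
```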
In addition to its core features, LangSmith offers several powerful services tailored for LLM application development and deployment, including dataset management, prompt versioning, and performance monitoring.
With its comprehensive suite of features and services, LangSmith is poised to revolutionize the way LLM applications are developed, deployed, and maintained. By addressing the unique challenges of working with these powerful models, LangSmith empowers developers and organizations to unlock the full potential of LLMs, paving the way for a future where AI-driven applications become an integral part of our daily lives.
The LangSmith UI comprises four core components: Projects, Datasets, Prompts (including the Prompt Playground), and Annotations.
With its comprehensive features and robust architecture, LangSmith empowers developers to efficiently build, test, and refine LLM applications throughout their entire lifecycle. From leveraging the latest LLM models to incorporating human feedback and managing datasets, LangSmith provides a seamless and streamlined experience, enabling developers to unlock the full potential of these powerful AI technologies.
Upon signing up for LangSmith, you’ll find that a default project is already enabled and ready to explore. However, as you delve deeper into LLM application development, you’ll likely want to create custom projects tailored to your needs.
To embark on this journey, simply navigate to the “Create New Project” section within the LangSmith platform. Here, you’ll be prompted to provide a name for your project, which should be descriptive and representative of the project’s purpose or domain.
Additionally, LangSmith offers the option to include a detailed description of your project. This description can serve as a comprehensive overview, outlining the project’s objectives, intended use cases, or any other relevant information that will help you and your team members effectively collaborate and stay aligned throughout the development process.
One of LangSmith’s key features is its ability to incorporate datasets for evaluation and training purposes. When creating a new project, you’ll notice a dropdown menu labeled “Choose Default.” Initially, this menu may not display any available datasets. However, LangSmith provides a seamless way to add your custom datasets.
By clicking on the “Add Dataset” button, you can upload or import the dataset you wish to use for your project. This could be a collection of text files, structured data, or any other relevant data source that will be the foundation for evaluating and fine-tuning your LLM models.
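If you prefer to script this step, the langsmith Python SDK exposes the same operations. The sketch below uses a hypothetical dataset name and a single example drawn from the prompt we experiment with later; treat it as a minimal illustration rather than a required workflow.

```python
from langsmith import Client

client = Client()  # reads LANGCHAIN_API_KEY from the environment

# "career-counselor-eval" is a hypothetical dataset name.
dataset = client.create_dataset(
    dataset_name="career-counselor-eval",
    description="Student messages paired with the expected structured extraction",
)

# Each example pairs an input with the reference output used during evaluation.
client.create_examples(
    inputs=[{"message": "I am Shruti, in class 10, and I am interested in fine arts."}],
    outputs=[{"answer": '{student name: "Shruti", current level of studies: "class 10", '
                        'current grades: "", career: "fine arts"}'}],
    dataset_id=dataset.id,
)
```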
Furthermore, LangSmith allows you to include metadata with your project. Metadata can encompass a wide range of information, such as project tags, categories, or any other relevant details that will help you organize and manage your projects more effectively.
Once you’ve provided the necessary project details, including the name, description (if applicable), dataset, and metadata, you can submit your new project for creation. With just a few clicks, LangSmith will set up a dedicated workspace for your LLM application development with the tools and resources you need to bring your ideas to life.
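The same project setup can also be done programmatically. Here is a minimal sketch using the langsmith SDK, with a hypothetical project name and description:

```python
from langsmith import Client

client = Client()  # reads LANGCHAIN_API_KEY from the environment

# "career-counselor" is a hypothetical project name.
project = client.create_project(
    project_name="career-counselor",
    description="Extracts structured career-counseling details from student messages",
)
print(project.name, project.id)
```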
After creating your new project in LangSmith, you can access it by navigating to the “Projects” icon and sorting the list alphabetically by name.
Your newly created project will be visible. Simply click on its name or details to open the dedicated workspace tailored for LLM application development. Within this workspace, you’ll find all the necessary tools and resources to develop, test, and refine your LLM application.
As you delve into your new project within LangSmith, you’ll notice the “Test-1-Demo” section. This area provides a comprehensive overview of your project’s performance, including detailed information about prompt testing, LLM calls, input/output data, and latency metrics.
Initially, since you haven’t yet tested any prompts using the Prompt Playground or executed any Root Runs or LLM Calls, the sections for “All Runs,” “Input,” “Output,” and “All About Latency” may appear empty. However, this is where LangSmith’s analysis and filtering capabilities truly shine.
On the right-hand side, you’ll find the “Stats Total Tokens” section, which offers various filtering options to help you gain insights into your project’s performance. For instance, you can apply filters to identify whether there were any interruptions during the execution or to analyze the time taken to generate the output.
Let’s explore LangSmith’s default project to understand these filtering capabilities better. By navigating to the default project and accessing the “Test-1-Demo” section, you can observe real-world examples of how these filters can be applied and the insights they can provide.
The filtering options within LangSmith allow you to slice and dice the performance data. Moreover, they enable you to identify bottlenecks, optimize prompts, and fine-tune your LLM models for optimal efficiency and accuracy. Whether you’re interested in analyzing latency, token counts, or any other relevant metrics, LangSmith’s powerful filtering tools empower you to comprehensively understand your project’s performance, paving the way for continuous improvement and refinement.
You’ll find various options and filters to explore under the “Default” project in the “Test-1-Demo” section. One option lets you view data from the “Last 2 Days,” providing insights into recent performance metrics. You can also access the “LLM Calls” option, which offers detailed information about the interactions between your application and the LLMs employed, enabling you to optimize performance and resource utilization.
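These runs can also be queried from code. The sketch below assumes tracing has already produced runs in the default project and uses the langsmith SDK’s list_runs to approximate the “Last 2 Days” and “LLM Calls” views:

```python
from datetime import datetime, timedelta
from langsmith import Client

client = Client()

# Pull the LLM calls recorded in the "default" project over the last two days.
runs = client.list_runs(
    project_name="default",
    run_type="llm",
    start_time=datetime.now() - timedelta(days=2),
)

for run in runs:
    # Latency is derived here from the run timestamps.
    latency = (run.end_time - run.start_time).total_seconds() if run.end_time else None
    print(run.name, run.status, latency, run.total_tokens)
```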
To analyze your project’s performance, start by creating a prompt. Navigate to the left-hand icons and select the “Prompts” option, the last icon in the list, and create a new prompt with a descriptive name. Once you’ve created the prompt, open the “Prompt Playground.” In this area, you can input your prompt, execute it, and observe factors such as latency, outputs, and other performance metrics, giving you valuable insight into root runs, LLM calls, and overall efficiency.
Next, click on the “+prompt” button. You will find fields for a System Message and a Human Message. You can provide your OpenAI API key to use models like GPT-3.5 Turbo, or enter the respective API keys for other available models; several free models are also available to test.
Here’s a sample System Message and Human Message to experiment with and analyze using LangSmith:
System Message: You are a counselor who answers students’ general questions to help them with their career options. You need to extract information from the user’s message, including the student’s name, level of studies, current grades, and preferable career options.
Human Message: Good morning. I am Shruti, and I am very confused about what subjects to take in high school next semester. In class 10, I took mathematics majors and biology. I am also interested in arts as I am very good at fine arts. However, my grades in maths and biology were not very good. They went down by 0.7 CGPA from a 4 CGPA in class 9. The response should be formatted like this: {student name: “”, current level of studies: “”, current grades: “”, career: “”}
When you submit the prompt, you select the model and can adjust parameters like temperature to tune and improve its performance. After receiving the output, you can monitor the results for further performance enhancement.
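The same experiment can be reproduced outside the Playground with a few lines of LangChain code, which also records the run in LangSmith if the tracing variables from earlier are set. The model choice and temperature below are illustrative:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# The system and human messages mirror the sample above; literal braces in the
# expected output format are escaped as {{ }} inside the template.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a counselor who answers students' general questions to help them "
     "with their career options. Extract the student's name, level of studies, "
     "current grades, and preferable career options, and respond in the format "
     '{{student name: "", current level of studies: "", current grades: "", career: ""}}.'),
    ("human", "{message}"),
])

# A lower temperature keeps the structured output more deterministic.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.2)
chain = prompt | llm

result = chain.invoke({"message": "Good morning. I am Shruti, and I am very confused "
                                  "about what subjects to take in high school next semester."})
print(result.content)
```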
Return to the project icon to see an update regarding the prompt experimentation. Click on it to review and analyze the results.
When you select the prompt versions you have tested, you can review their detailed characteristics to refine and enhance the output responses.
You will see information such as the number of tokens used, latency, and associated costs. Additionally, you can apply filters on the right-side panel to identify failed prompts or those that took more than 10 seconds to generate. This allows you to experiment, conduct further analysis, and improve performance.
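Equivalent filters can be applied from the SDK as well. In the sketch below, the error flag and the gt(latency, "10s") filter string follow LangSmith’s trace-filtering conventions; the project name is the demo project from this article:

```python
from langsmith import Client

client = Client()

# Runs in this project that ended in an error.
failed_runs = client.list_runs(project_name="Test-1-Demo", error=True)

# Runs slower than 10 seconds, expressed in LangSmith's filter query language.
slow_runs = client.list_runs(project_name="Test-1-Demo", filter='gt(latency, "10s")')

for run in slow_runs:
    print(run.name, run.start_time, run.end_time)
```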
Using the WebUI provided by LangSmith, you can trace, evaluate, and monitor your prompt versions. You can create prompts and choose to keep them public for sharing or private. Additionally, you can experiment with annotations and datasets for benchmarking purposes.
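For example, a prompt can be pushed to the Hub from code and kept private or shared publicly. The repository handle below is hypothetical, and this sketch assumes the langchainhub package is installed:

```python
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a counselor who helps students with their career options."),
    ("human", "{message}"),
])

# "your-handle/career-counselor" is a hypothetical repository name;
# new_repo_is_public=False keeps the prompt private to your workspace.
hub.push("your-handle/career-counselor", prompt, new_repo_is_public=False)

# Anyone with access can later pull the same versioned prompt.
restored = hub.pull("your-handle/career-counselor")
```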
In conclusion, you can create a Retrieval-Augmented Generation (RAG) application with a vector database and integrate it seamlessly with LangChain and LangSmith. This integration allows for automated updates within LangSmith, enhancing the efficiency and effectiveness of your LLM development. Stay tuned for the next article to delve deeper into this process, where we will also explore advanced features and techniques to further optimize your LLM workflows.
Q. What is the difference between LangSmith and LangChain?
A. LangSmith is a DevOps platform designed for developing, testing, deploying, and monitoring large language model (LLM) applications. It offers tools for performance monitoring, dataset management, and collaborative development. LangChain, on the other hand, is a framework for building applications using LLMs, focusing on creating and managing prompts and chains. While LangChain aids in prototyping LLM applications, LangSmith supports their productionization and operational monitoring.
Q. Is LangSmith free to use?
A. LangSmith offers a free tier that provides access to its core features, allowing users to start developing, testing, and deploying LLM applications without initial cost. However, for advanced features, larger datasets, and more extensive usage, LangSmith may require a subscription plan or pay-as-you-go model.
Q. Can LangSmith be used without LangChain?
A. Yes, LangSmith can be used independently of LangChain.
Q. Can LangSmith be used locally?
A. Currently, LangSmith is primarily a cloud-based platform, providing a comprehensive suite of tools and services for LLM application development and deployment. While local usage is limited, LangSmith offers robust API and integration capabilities, allowing developers to manage aspects of their LLM applications locally while leveraging cloud resources for more intensive tasks such as monitoring and dataset management.