Efficient LLM Workflows with LangChain Expression Language

Ritika Gupta 01 Jul, 2024
10 min read

Introduction

The LLM world is advancing fast, and the next chapter in AI application development is here. LangChain Expression Language (LCEL) isn't just an upgrade; it's a game-changer. Initially known for proof-of-concepts, LangChain has rapidly evolved into a powerhouse Python library for LLM interactions. With the introduction of LCEL in August 2023, it's now easier than ever to turn ideas into robust, scalable applications. This blog dives deep into LCEL, demonstrating how it simplifies complex workflows and empowers developers to harness the full potential of AI. Whether you're new to LLM applications or a seasoned coder, LCEL promises to revolutionize how you build and deploy custom LLM chains.

In this article, we’ll learn what LCEL is, how it works, and the essentials of LCEL chains, pipes, and Runnables.

Learning Objectives

  • Understand the chaining operator (|) and how it functions.
  • Gain an in-depth insight into the usage of LCEL.
  • Learn to create a simple chain using LCEL.
  • Learn to create an advanced RAG application using LCEL.
  • Implement RunnableParallel, RunnablePassthrough, and RunnableLambda using LCEL.

This article was published as a part of the Data Science Blogathon.

What is LangChain Expression Language (LCEL)?

The LangChain Expression Language (LCEL) is a "minimalist" code layer for creating chains of LangChain components, built on some intriguing Python ideas. At its core it uses the pipe operator, much like Unix pipes, where the output of one function is passed as the input to the next.

LCEL comes with strong support for:

  • Superfast development of chains.
  • Advanced features such as streaming, async, parallel execution, and more.
  • Easy integration with LangSmith and LangServe.

LCEL Syntax

Using LCEL, we create our chain with pipe operators (|) rather than Chain objects.

Let us first refresh some concepts related to LLM chain creation. A basic LLM chain consists of the following components; there can be many variations, which we will learn later in the code examples.

  • LLM: An abstraction over the model used in LangChain to create completions, such as Claude or OpenAI GPT-3.5.
  • Prompt: The input the LLM object uses to pose questions to the LLM and specify its goals. It is basically a string template we define with placeholders for our variables.
  • Output Parser: A parser that defines how to extract the output from the response and display it as the final response.
  • Chain: A chain ties all the above components together. It is a series of calls to an LLM, or to any stage in the data processing pipeline.

How the Pipe (|) Operator Works

Let us understand how the pipe operator works by creating our own small pipe-friendly function.

When the Python interpreter sees the | operator between two objects (like a | b), it attempts to call the __or__ method of the left-hand object, passing the right-hand object as the argument. That means the pattern a | b is equivalent to calling a.__or__(b) directly, something we will verify in the code below.


Let us use this pipe operator to create our own Runnable class. It will consume a function and turn it into an object that can be chained with other functions using the | operator.

class Runnable:
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # called for `self | other`; build a new function that feeds
        # the output of self.func into `other`
        def chained_func(*args, **kwargs):
            return other(self.func(*args, **kwargs))
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)



Now let us use this Runnable class to chain two functions together: one doubles its input and the other adds one. The code below chains these two functions and runs them on the input 5.

def double(x):
    return 2 * x

def add_one(x):
    return x + 1

# wrap the functions with Runnable
runnable_double = Runnable(double)
runnable_add_one = Runnable(add_one)

# chain them by calling __or__ explicitly
chain = runnable_double.__or__(runnable_add_one)
chain(5)  # returns 11

# chain the runnable functions together with the pipe operator
double_then_add_one = runnable_double | runnable_add_one

# invoke the chain
result = double_then_add_one(5)
print(result)  # Output: 11

Let us understand the working of the above code step by step:

Creating Runnable Objects

  • Runnable(double): This creates a Runnable object that encapsulates the double function. Let's call this object runnable_double.
  • Runnable(add_one): Similarly, this creates runnable_add_one, which encapsulates the add_one function.

Chaining with the | Operator

runnable_double | runnable_add_one: This operation triggers the __or__ magic method (operator method) of runnable_double.

  • Inside __or__, a new function called chained_func is defined. This function chains together the two functions on which the | operator was called. It takes any arguments (*args, **kwargs) and does the following:
    • It calls runnable_double.func(*args, **kwargs) (which is essentially calling double with the given arguments) and passes the result to runnable_add_one.func (which calls add_one).
    • Finally, it returns the output of add_one in the return statement.
  • The __or__ method returns a new Runnable object (let's call it double_then_add_one) that stores this chained_func. Note that this chained function is what you get back whenever you write func1 | func2 or, equivalently, call __or__ on a Runnable object.

Calling the Chained Runnable Object

double_then_add_one(5): This calls the __call__ method of the double_then_add_one object.

  • The __call__ method in turn executes chained_func with the argument 5.
  • As explained in the previous step, chained_func calls double(5) (resulting in 10) and then add_one(10) (resulting in 11).
  • The final result, 11, is returned and assigned to the variable result.

In essence, the Runnable class and the overloaded | operator provide a mechanism to chain functions together, where the output of one function becomes the input of the next. This can lead to more readable and maintainable code when dealing with a series of function calls.
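Because each use of | returns another Runnable, chains compose to any length. As a quick sketch reusing the objects defined above:

# each | adds one more step to the pipeline
double_add_double = runnable_double | runnable_add_one | runnable_double
print(double_add_double(5))  # (5 * 2 + 1) * 2 = 22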

Simple LLM Chain Using LCEL

Now we will create a simple LLM chain using LCEL to see how it makes code more readable and intuitive. 

# Install Libraries
!pip install langchain_cohere langchain --quiet

Generate the Cohere API Key

We need to generate a free API key to use the Cohere LLM. Visit the Cohere website and log in using a Google or GitHub account. Once logged in, you will land on the Cohere dashboard page.


Click on the API Keys option. You will see that a free Trial API key has been generated.

### Setup Keys
import os

os.environ["COHERE_API_KEY"] = "YOUR API KEY"

Create the prompt, model, parser, and chain

from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_cohere import ChatCohere
from langchain.schema.output_parser import StrOutputParser

# LLM instance
llm = ChatCohere(model="command-r", temperature=0)

# Create prompt
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# Create output parser
output_parser = StrOutputParser()

# LCEL chain
chain = prompt | llm | output_parser

question = """
I have five apples. I throw two away. I eat one. How many apples do I have left?
"""
response = chain.invoke({"question": question})

print(response)

Runnable Interfaces in LangChain

When working with LCEL, we may need to modify the flow of values, or the values themselves, as they pass between components; for this, we can use runnables. We can understand how to use the Runnable classes provided by LangChain through a RAG example.

One point about LangChain Expression Language is that any two runnables can be "chained" together into sequences. The output of the previous runnable's .invoke() call is passed as input to the next runnable. This can be done using the pipe operator (|), or the more explicit .pipe() method, which does the same thing.
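As a small sketch, reusing the prompt, llm, and output_parser objects from the simple chain above, these two lines build the same sequence:

# pipe operator syntax
chain = prompt | llm | output_parser

# equivalent .pipe() syntax
chain = prompt.pipe(llm).pipe(output_parser)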

We shall learn about three types of Runnables (a quick standalone sketch follows this list):

  • RunnablePassthrough: Passes any input as-is to the next component in the chain.
  • RunnableParallel: Passes the input to several parallel paths simultaneously.
  • RunnableLambda: Converts any Python function into a runnable object that can then be used in a chain.
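Here is a minimal, LLM-free sketch of how the three compose (the lambdas are purely illustrative):

from langchain_core.runnables import (
    RunnableLambda,
    RunnableParallel,
    RunnablePassthrough,
)

# fan the same input out to several branches at once
mini_chain = RunnableParallel(
    original=RunnablePassthrough(),            # forwards the input unchanged
    doubled=RunnableLambda(lambda x: x * 2),   # plain functions made chainable
    squared=RunnableLambda(lambda x: x ** 2),
)

print(mini_chain.invoke(4))
# {'original': 4, 'doubled': 8, 'squared': 16}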

RAG Using RunnablePassthrough and RunnableParallel

The RAG workflow is straightforward: the question is sent in parallel to a retriever (which supplies the context) and to a passthrough (which forwards the question itself), the two results fill the prompt, the prompt goes to the LLM, and the response is parsed. Let us now build this RAG pipeline to understand the usage of the Runnable interfaces.


Installation of Packages

!pip install --quiet langchain langchain_cohere  langchain_community docarray

Define Vector Stores

We create two vector stores to demonstrate the use of RunnableParallel and RunnablePassthrough.

from langchain.embeddings import CohereEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

embedding = CohereEmbeddings(
    model="embed-english-light-v3.0",
)

vecstore_a = DocArrayInMemorySearch.from_texts(
    ["half the info will be here", "Zoozoo birthday is the 17th September"],
    embedding=embedding
)
vecstore_b = DocArrayInMemorySearch.from_texts(
    ["and half here", "Zoozoo was born in 1990"],
    embedding=embedding
)

Define Retriever and Chain 

Here, the input to chain.invoke is passed to the retrieval component, where it is simultaneously sent down two paths. One path goes to retriever_a, whose output is stored under the "context" key and passed to the next component in the chain. The RunnablePassthrough object is used as a passthrough that takes whatever input arrives at the current component (retrieval) and exposes it in the component's output via the "question" key. The input question is thus available to the prompt component under the "question" key.

from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough
)

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()

# LLM  Instance
llm = ChatCohere(model="command-r", temperature=0)

prompt_str = """Answer the question below using the context:

Context: {context}

Question: {question}

Answer: """
prompt = ChatPromptTemplate.from_template(prompt_str)

retrieval = RunnableParallel(
    {"context": retriever_a, "question": RunnablePassthrough()}
)

chain = retrieval | prompt | llm | output_parser

Invoke chain 

out = chain.invoke("when was Zoozoo born exact year?")
print(out)


Using both retrievers in parallel

We now pass the question to both retrievers in parallel to provide additional context in the prompt.

# Using both retrievers in parallel

prompt_str = """Answer the question below using the context:

Context:
{context_a}
{context_b}

Question: {question}

Answer: """
prompt = ChatPromptTemplate.from_template(prompt_str)

retrieval = RunnableParallel(
    {
        "context_a": retriever_a, "context_b": retriever_b,
        "question": RunnablePassthrough()
    }
)

chain = retrieval | prompt | llm | output_parser

Now invoke the chain:

out = chain.invoke("when was Zoozoo born exact date?")
print(out)

RunnableLambda

Now let us see an example of using RunnableLambda to wrap ordinary Python functions, similar to what we did earlier when exploring the | operator.

from langchain_core.runnables import RunnableLambda

def add_five(x):
    return x + 5

def multiply_by_two(x):
    return x * 2

# wrap the functions with RunnableLambda
add_five_runnable = RunnableLambda(add_five)
multiply_by_two_runnable = RunnableLambda(multiply_by_two)

# (3 + 5) * 2 = 16
chain = add_five_runnable | multiply_by_two_runnable
chain.invoke(3)

Custom Function in a Runnable Chain

We can use RunnableLambda to define our own custom functions and add them to an LLM chain.

The LLM response contains several attributes. We will create a custom function, extract_token, to display the token counts for the input question and the output response.

prompt_str = "You know 1 short line about {topic}?"
prompt = ChatPromptTemplate.from_template(prompt_str)


def extract_token(x):
    # Cohere responses store token usage under additional_kwargs['token_count']
    token_count = x.additional_kwargs['token_count']
    response = (
        f"{x.content}\n"
        f"Input Token Count: {token_count['input_tokens']}\n"
        f"Output Token Count: {token_count['output_tokens']}"
    )
    return response

get_token = RunnableLambda(extract_token)

chain = prompt | llm | get_token

Invoke the chain and inspect the result:

output = chain.invoke({"topic": "Artificial Intelligence"})
print(output)

Other Features of LCEL

LCEL has a number of other features, such as streaming, batch processing, and async execution.

  • .invoke(): Pass in a single input and receive the output, nothing more and nothing less.
  • .batch(): Supply a list of inputs to get multiple outputs; this is faster than calling invoke once per input because the parallelization is handled for you.
  • .stream(): Start printing the response before the entire response is complete.

The snippet below exercises all three:
prompt_str = "You know 1 short line about {topic}?"
prompt = ChatPromptTemplate.from_template(prompt_str)

chain = prompt | llm | output_parser

# ---------invoke--------- #
result_with_invoke = chain.invoke("AI")

# ---------batch--------- #
result_with_batch = chain.batch(["AI", "LLM", "Vector Database"])
print(result_with_batch)

# ---------stream--------- #
for chunk in chain.stream("Artificial Intelligence write 5 lines"):
  print(chunk, flush=True, end="")

Async Methods of LCEL

Your application’s frontend and backend are typically independent, which means that requests are made to the backend from the frontend. You may need to manage several requests on your backend at once if you have numerous users.

Since most of the code in LangChain is just waiting between API calls, we can leverage asynchronous code to improve API scalability. If you want to understand why this is important, I recommend reading the concurrent burgers story in the FastAPI documentation. There is no need to worry about the implementation, because async methods are already available if you use LCEL:

.ainvoke() / .abatch() / .astream(): asynchronous versions of invoke, batch, and stream.
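As a minimal sketch, reusing the chain = prompt | llm | output_parser defined in the previous snippet, the async variants are awaited like any coroutine (inside a notebook you would simply await main() instead of calling asyncio.run):

import asyncio

async def main():
    # run three requests concurrently instead of one after another
    results = await asyncio.gather(
        chain.ainvoke("AI"),
        chain.ainvoke("LLM"),
        chain.ainvoke("Vector Database"),
    )
    print(results)

asyncio.run(main())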

LangChain achieves these "out of the box" features through a unified interface called "Runnable".

Conclusion

LangChain Expression Language introduces a revolutionary approach to LLM application development in Python. Despite its unique syntax, LCEL offers a unified interface that streamlines productionization with built-in features like streaming, asynchronous processing, and dynamic configurations. Automatic parallelization improves performance by executing tasks concurrently. Furthermore, LCEL's composability empowers developers to effortlessly create and customize chains, ensuring code remains flexible and adaptable to changing requirements. Embracing LCEL promises not only streamlined development but also optimized execution, making it a compelling choice for modern LLM applications.

Key Takeaways

  • LangChain Expression Language (LCEL) introduces a minimalist code layer for creating chains of LangChain components.
  • The pipe operator in LCEL simplifies the creation of function chains by passing the output of one function directly to the next.
  • LCEL enables the creation of simple LLM chains by chaining prompts, LLM models, and output parsers.
  • Custom functions can be included in LLM chains to manipulate or analyze outputs, enhancing the flexibility of the development process.
  • Built-in integrations with LangSmith and LangServe further enhance the capabilities of LCEL, facilitating seamless deployment and management of LLM chains.

Frequently Asked Questions

Q1. How does LCEL improve application performance?

A. LCEL enables automatic parallelization of tasks, which enhances execution speed by running multiple operations concurrently.

Q2. What is the key benefit of using Runnable interfaces in LCEL?

A. Runnable interfaces allow developers to chain functions easily, improving code readability and maintainability.

Q3.  How does LCEL support asynchronous processing?

A. LCEL provides async methods like .ainvoke(), .abatch(), and .astream(), which handle multiple requests efficiently, enhancing API scalability.

Q4. What are the drawbacks of LCEL?

A. LCEL is not fully PEP-compliant Python style and is effectively a DSL (domain-specific language). There are also input/output dependencies between components: if we want to access intermediate outputs, we have to pass them all the way to the end of the chain.

Q5. Why should developers consider using LCEL in Python applications?

A. Developers should consider LCEL for its unified interface, composability, and advanced features, making it ideal for building scalable and efficient Python applications.


The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
