Efficient LLM Workflows with LangChain Expression Language

Ritika Gupta 01 Jul, 2024
10 min read

Introduction

The LLM world is advancing fast, and the next chapter in AI application development is here. LangChain Expression Language (LCEL) isn't just an upgrade; it's a game-changer. Initially known for proof-of-concepts, LangChain has rapidly evolved into a powerhouse Python library for LLM interactions. With the introduction of LCEL in August 2023, it's now easier than ever to turn ideas into robust, scalable applications. This blog dives deep into LCEL, demonstrating how it simplifies complex workflows and empowers developers to harness the full potential of AI. Whether you're new to LLM applications or a seasoned coder, LCEL promises to revolutionize how you build and deploy custom LLM chains.

In this article, we’ll learn what LCEL is, how it works, and the essentials of LCEL chains, pipes, and Runnables.

Learning Objectives

  • Understand the chaining operator (|) and how it functions.
  • Gain an in-depth insight into the usage of LCEL.
  • Learn to create a simple chain using LCEL.
  • Learn to create an advanced RAG application using LCEL.
  • Implement RunnableParallel, RunnablePassthrough, and RunnableLambda using LCEL.

This article was published as a part of the Data Science Blogathon.

What is LangChain Expression Language (LCEL)?

The LangChain Expression Language (LCEL) is a "minimalist" code layer for creating chains of LangChain components, built on some intriguing Python ideas. At its core it uses the pipe operator, much like Unix pipes, where the output of one function is passed as the input to the next.

LCEL comes with strong support for:

  • Superfast development of chains.
  • Advanced features such as streaming, async, parallel execution, and more.
  • Easy integration with LangSmith and LangServe.

LCEL Syntax

Using LCEL, we create our chain with pipe operators (|) rather than Chain objects.

Let us first refresh some concepts related to LLM chain creation. A basic LLM chain consists of the following components; there can be many variations, which we will learn later in the code examples.

  • LLM: An abstraction over the model used in LangChain to create completions, such as Claude or OpenAI GPT-3.5.
  • Prompt: The input the LLM object uses to pose questions to the LLM and specify its goals. It is basically a string template we define with placeholders for our variables.
  • Output Parser: A parser that defines how to extract the output from the response and display it as the final response.
  • Chain: A chain ties all the above components together. It is a series of calls to an LLM, or to any stage in the data processing pipeline.

How the Pipe (|) Operator Works

Let us understand how the pipe operator works by creating our own small pipe-friendly function.

When the Python interpreter sees the | operator between two objects (like a | b), it attempts to call the __or__ method of the left-hand object, passing the right-hand object as the argument. That means the pattern a | b is equivalent to calling a.__or__(b) directly, something we will verify in the code below.


Let us use this pipe operator to create our own Runnable class. It will consume a function and turn it into an object that can be chained with other functions using the | operator.

class Runnable:
    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # called for `self | other`; build a new function that feeds
        # the output of self.func into `other`
        def chained_func(*args, **kwargs):
            return other(self.func(*args, **kwargs))
        return Runnable(chained_func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)



Now let us use this Runnable class to chain two functions together: one doubles its input and the other adds one. The code below chains these two functions and runs them on the input 5.

def double(x):
    return 2 * x

def add_one(x):
    return x + 1

# wrap the functions with Runnable
runnable_double = Runnable(double)
runnable_add_one = Runnable(add_one)

# chain them by calling __or__ explicitly
chain = runnable_double.__or__(runnable_add_one)
chain(5)  # returns 11

# chain the runnable functions together with the pipe operator
double_then_add_one = runnable_double | runnable_add_one

# invoke the chain
result = double_then_add_one(5)
print(result)  # Output: 11

Let us understand the working of the above code step by step:

Creating Runnable Objects

  • Runnable(double): This creates a Runnable object that encapsulates the double function. Let's call this object runnable_double.
  • Runnable(add_one): Similarly, this creates runnable_add_one, which encapsulates the add_one function.

Chaining with the | Operator

runnable_double | runnable_add_one: This operation triggers the __or__ magic method (operator method) of runnable_double.

  • Inside __or__, a new function called chained_func is defined. This function chains together the two functions on which the | operator was called. It takes any arguments (*args, **kwargs) and does the following:
    • It calls runnable_double.func(*args, **kwargs) (which is essentially calling double with the given arguments) and passes the result to runnable_add_one.func (which calls add_one).
    • Finally, it returns the output of add_one in the return statement.
  • The __or__ method returns a new Runnable object (let's call it double_then_add_one) that stores this chained_func. Note that this chained function is what you get back whenever you write func1 | func2 or, equivalently, call __or__ on a Runnable object.

Calling the Chained Runnable Object

double_then_add_one(5): This calls the __call__ method of the double_then_add_one object.

  • The __call__ method in turn executes chained_func with the argument 5.
  • As explained in the previous step, chained_func calls double(5) (resulting in 10) and then add_one(10) (resulting in 11).
  • The final result, 11, is returned and assigned to the variable result.

In essence, the Runnable class and the overloaded | operator provide a mechanism to chain functions together, where the output of one function becomes the input of the next. This can lead to more readable and maintainable code when dealing with a series of function calls.
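Because each use of | returns another Runnable, chains compose to any length. As a quick sketch reusing the objects defined above:

# each | adds one more step to the pipeline
double_add_double = runnable_double | runnable_add_one | runnable_double
print(double_add_double(5))  # (5 * 2 + 1) * 2 = 22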

Simple LLM Chain Using LCEL

Now we will create a simple LLM chain using LCEL to see how it makes code more readable and intuitive. 

# Install Libraries
!pip install langchain_cohere langchain --quiet

Generate the Cohere API Key

We need to generate a free API key to use the Cohere LLM. Visit the Cohere website and log in using a Google or GitHub account. Once logged in, you will land on the Cohere dashboard page.


Click on the API Keys option. You will see that a free Trial API key has been generated.

### Setup Keys
import os

os.environ["COHERE_API_KEY"] = "YOUR API KEY"

Create the prompt, model, parser, and chain

from langchain_core.prompts import PromptTemplate, ChatPromptTemplate
from langchain_cohere import ChatCohere
from langchain.schema.output_parser import StrOutputParser

# LLM instance
llm = ChatCohere(model="command-r", temperature=0)

# Create prompt
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# Create output parser
output_parser = StrOutputParser()

# LCEL chain
chain = prompt | llm | output_parser

question = """
I have five apples. I throw two away. I eat one. How many apples do I have left?
"""
response = chain.invoke({"question": question})

print(response)

Runnable Interfaces in LangChain

When working with LCEL, we may need to modify the flow of values, or the values themselves, as they pass between components; for this, we can use runnables. We can understand how to use the Runnable classes provided by LangChain through a RAG example.

One point about LangChain Expression Language is that any two runnables can be "chained" together into sequences. The output of the previous runnable's .invoke() call is passed as input to the next runnable. This can be done using the pipe operator (|), or the more explicit .pipe() method, which does the same thing.
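As a small sketch, reusing the prompt, llm, and output_parser objects from the simple chain above, these two lines build the same sequence:

# pipe operator syntax
chain = prompt | llm | output_parser

# equivalent .pipe() syntax
chain = prompt.pipe(llm).pipe(output_parser)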

We shall learn about three types of Runnables (a quick standalone sketch follows this list):

  • RunnablePassthrough: Passes any input as-is to the next component in the chain.
  • RunnableParallel: Passes the input to several parallel paths simultaneously.
  • RunnableLambda: Converts any Python function into a runnable object that can then be used in a chain.
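Here is a minimal, LLM-free sketch of how the three compose (the lambdas are purely illustrative):

from langchain_core.runnables import (
    RunnableLambda,
    RunnableParallel,
    RunnablePassthrough,
)

# fan the same input out to several branches at once
mini_chain = RunnableParallel(
    original=RunnablePassthrough(),            # forwards the input unchanged
    doubled=RunnableLambda(lambda x: x * 2),   # plain functions made chainable
    squared=RunnableLambda(lambda x: x ** 2),
)

print(mini_chain.invoke(4))
# {'original': 4, 'doubled': 8, 'squared': 16}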

RAG Using RunnablePassthrough and RunnableParallel

The RAG workflow is straightforward: the question is sent in parallel to a retriever (which supplies the context) and to a passthrough (which forwards the question itself), the two results fill the prompt, the prompt goes to the LLM, and the response is parsed. Let us now build this RAG pipeline to understand the usage of the Runnable interfaces.


Installation of Packages

!pip install --quiet langchain langchain_cohere  langchain_community docarray

Define Vector Stores

We create two vector stores to demonstrate the use of RunnableParallel and RunnablePassthrough.

from langchain.embeddings import CohereEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

embedding = CohereEmbeddings(
    model="embed-english-light-v3.0",
)

vecstore_a = DocArrayInMemorySearch.from_texts(
    ["half the info will be here", "Zoozoo birthday is the 17th September"],
    embedding=embedding
)
vecstore_b = DocArrayInMemorySearch.from_texts(
    ["and half here", "Zoozoo was born in 1990"],
    embedding=embedding
)

Define Retriever and Chain 

Here, the input to chain.invoke is passed to the retrieval component, where it is simultaneously sent down two paths. One path goes to retriever_a, whose output is stored under the "context" key and passed to the next component in the chain. The RunnablePassthrough object is used as a passthrough that takes whatever input arrives at the current component (retrieval) and exposes it in the component's output via the "question" key. The input question is thus available to the prompt component under the "question" key.

from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough
)

retriever_a = vecstore_a.as_retriever()
retriever_b = vecstore_b.as_retriever()

# LLM  Instance
llm = ChatCohere(model="command-r", temperature=0)

prompt_str = """Answer the question below using the context:

Context: {context}

Question: {question}

Answer: """
prompt = ChatPromptTemplate.from_template(prompt_str)

retrieval = RunnableParallel(
    {"context": retriever_a, "question": RunnablePassthrough()}
)

chain = retrieval | prompt | llm | output_parser

Invoke chain 

out = chain.invoke("when was Zoozoo born exact year?")
print(out)


Using both retrievers in parallel

We now pass the question to both retrievers in parallel to provide additional context in the prompt.

# Using both retrievers in parallel

prompt_str = """Answer the question below using the context:

Context:
{context_a}
{context_b}

Question: {question}

Answer: """
prompt = ChatPromptTemplate.from_template(prompt_str)

retrieval = RunnableParallel(
    {
        "context_a": retriever_a, "context_b": retriever_b,
        "question": RunnablePassthrough()
    }
)

chain = retrieval | prompt | llm | output_parser

Now invoke the chain:

out = chain.invoke("when was Zoozoo born exact date?")
print(out)

RunnableLambda

Now let us see an example of using RunnableLambda to wrap ordinary Python functions, similar to what we did earlier when exploring the | operator.

from langchain_core.runnables import RunnableLambda

def add_five(x):
    return x + 5

def multiply_by_two(x):
    return x * 2

# wrap the functions with RunnableLambda
add_five_runnable = RunnableLambda(add_five)
multiply_by_two_runnable = RunnableLambda(multiply_by_two)

# (3 + 5) * 2 = 16
chain = add_five_runnable | multiply_by_two_runnable
chain.invoke(3)

Custom Function in a Runnable Chain

We can use RunnableLambda to define our own custom functions and add them to an LLM chain.

The LLM response contains several attributes. We will create a custom function, extract_token, to display the token counts for the input question and the output response.

prompt_str = "You know 1 short line about {topic}?"
prompt = ChatPromptTemplate.from_template(prompt_str)


def extract_token(x):
    # Cohere responses store token usage under additional_kwargs['token_count']
    token_count = x.additional_kwargs['token_count']
    response = (
        f"{x.content}\n"
        f"Input Token Count: {token_count['input_tokens']}\n"
        f"Output Token Count: {token_count['output_tokens']}"
    )
    return response

get_token = RunnableLambda(extract_token)

chain = prompt | llm | get_token

Invoke the chain and inspect the result:

output = chain.invoke({"topic": "Artificial Intelligence"})
print(output)

Other Features of LCEL

LCEL has a number of other features, such as streaming, batch processing, and async execution.

  • .invoke(): Pass in a single input and receive the output, nothing more and nothing less.
  • .batch(): Supply a list of inputs to get multiple outputs; this is faster than calling invoke once per input because the parallelization is handled for you.
  • .stream(): Start printing the response before the entire response is complete.

The snippet below exercises all three:
prompt_str = "You know 1 short line about {topic}?"
prompt = ChatPromptTemplate.from_template(prompt_str)

chain = prompt | llm | output_parser

# ---------invoke--------- #
result_with_invoke = chain.invoke("AI")

# ---------batch--------- #
result_with_batch = chain.batch(["AI", "LLM", "Vector Database"])
print(result_with_batch)

# ---------stream--------- #
for chunk in chain.stream("Artificial Intelligence write 5 lines"):
  print(chunk, flush=True, end="")

Async Methods of LCEL

Your application’s frontend and backend are typically independent, which means that requests are made to the backend from the frontend. You may need to manage several requests on your backend at once if you have numerous users.

Since most of the code in LangChain is just waiting between API calls, we can leverage asynchronous code to improve API scalability. If you want to understand why this is important, I recommend reading the concurrent burgers story in the FastAPI documentation. There is no need to worry about the implementation, because async methods are already available if you use LCEL:

.ainvoke() / .abatch() / .astream(): asynchronous versions of invoke, batch, and stream.
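As a minimal sketch, reusing the chain = prompt | llm | output_parser defined in the previous snippet, the async variants are awaited like any coroutine (inside a notebook you would simply await main() instead of calling asyncio.run):

import asyncio

async def main():
    # run three requests concurrently instead of one after another
    results = await asyncio.gather(
        chain.ainvoke("AI"),
        chain.ainvoke("LLM"),
        chain.ainvoke("Vector Database"),
    )
    print(results)

asyncio.run(main())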

LangChain achieves these "out of the box" features through a unified interface called "Runnable".

Conclusion

LangChain Expression Language introduces a revolutionary approach to LLM application development in Python. Despite its unique syntax, LCEL offers a unified interface that streamlines productionization with built-in features like streaming, asynchronous processing, and dynamic configurations. Automatic parallelization improves performance by executing tasks concurrently. Furthermore, LCEL's composability empowers developers to effortlessly create and customize chains, ensuring code remains flexible and adaptable to changing requirements. Embracing LCEL promises not only streamlined development but also optimized execution, making it a compelling choice for modern LLM applications.

Key Takeaways

  • LangChain Expression Language (LCEL) introduces a minimalist code layer for creating chains of LangChain components.
  • The pipe operator in LCEL simplifies the creation of function chains by passing the output of one function directly to the next.
  • LCEL enables the creation of simple LLM chains by chaining prompts, LLM models, and output parsers.
  • Custom functions can be included in LLM chains to manipulate or analyze outputs, enhancing the flexibility of the development process.
  • Built-in integrations with LangSmith and LangServe further enhance the capabilities of LCEL, facilitating seamless deployment and management of LLM chains.

Frequently Asked Questions

Q1. How does LCEL improve application performance?

A. LCEL enables automatic parallelization of tasks, which enhances execution speed by running multiple operations concurrently.

Q2. What is the key benefit of using Runnable interfaces in LCEL?

A. Runnable interfaces allow developers to chain functions easily, improving code readability and maintainability.

Q3.  How does LCEL support asynchronous processing?

A. LCEL provides async methods like .ainvoke(), .abatch(), and .astream(), which handle multiple requests efficiently, enhancing API scalability.

Q4. What are the drawbacks of LCEL?

A. LCEL is not fully PEP-compliant Python style and is effectively a DSL (domain-specific language). There are also input/output dependencies between components: if we want to access intermediate outputs, we have to pass them all the way to the end of the chain.

Q5. Why should developers consider using LCEL in Python applications?

A. Developers should consider LCEL for its unified interface, composability, and advanced features, making it ideal for building scalable and efficient Python applications.


The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
