LangChain-Kùzu Integration: Transforming Text into Graphs

Adarsh Balan Last Updated : 17 Jan, 2025

The langchain-kuzu integration package is now available on PyPI. It connects LangChain with Kùzu, an embedded graph database, so that you can transform unstructured text into structured graphs. Whether you're a data scientist, developer, or AI enthusiast, the integration covers three tasks that are otherwise tedious to wire together: LLM-based entity and relationship extraction, graph creation, and natural language querying. Let's walk through each of them with a small example.

Learning Objectives

  • Understand the capabilities of the LangChain-Kùzu integration for transforming unstructured text into structured graph databases.
  • Learn how to define graph schemas, including nodes and relationships, tailored to your specific data needs.
  • Master the process of creating, updating, and querying graphs using Kùzu and LangChain’s LLM-driven tools.
  • Explore natural language querying of graph databases with the KuzuQAChain for intuitive data insights.
  • Discover advanced features like dynamic schema updates, custom LLM pairing, and flexible data import options in Kùzu.

This article was published as a part of the Data Science Blogathon.

Quick Installation of langchain-kuzu

To get started, install the packages. In a Google Colab cell, prefix the command with !:

pip install -U langchain-kuzu langchain-openai langchain-experimental

This installation includes dependencies for LangChain and Kùzu, along with support for LLMs like OpenAI’s GPT models. If you prefer other LLM providers, you can install their respective Python packages supported by LangChain.
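Before moving on, it can help to confirm that the packages are importable in the current environment. The snippet below is a minimal sketch using only the standard library. The module names are the import names used later in this article, and `missing_modules` is a hypothetical helper, not part of any of the packages.

```python
from importlib.util import find_spec

# Import names used by the code in this article.
REQUIRED_MODULES = [
    "kuzu",
    "langchain_kuzu",
    "langchain_openai",
    "langchain_experimental",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported as missing, rerun the pip command above and restart the runtime.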

Why Choose LangChain-Kùzu for Your Projects?

If you work with unstructured text data and want to create graph-based representations, this package is designed for you.

Key features include:

  • Customizable Schemas: Define and extract specific entities and relationships from text data effortlessly.
  • Text-to-Graph Transformation: Leverage the power of LLMs to structure meaningful graphs from raw text.
  • Natural Language Querying: Query graphs intuitively using natural language, powered by the KuzuQAChain.
  • Seamless Integration: Quickly connect LangChain’s LLM capabilities with Kùzu for a unified workflow.

Let’s walk through a practical example to see this integration in action.

Creating a Graph from Unstructured Text

First create a Kùzu database on your local machine and connect to it:

import kuzu

db = kuzu.Database("test_db")
conn = kuzu.Connection(db)

Getting Started with LangChain-Kùzu

Kùzu’s integration with LangChain makes it convenient to create and update graphs from unstructured text, and also to query graphs via a Text2Cypher pipeline built on LangChain’s LLM chains. To begin, pass the database object created above to the KuzuGraph constructor.

from langchain_kuzu.graphs.kuzu_graph import KuzuGraph
graph = KuzuGraph(db, allow_dangerous_requests=True)

Imagine we want to transform the following text into a graph:

  • “Tim Cook is the CEO of Apple. Apple has its headquarters in California.”
text = "Tim Cook is the CEO of Apple. Apple has its headquarters in California."

Step 1: Define the Graph Schema

First, define the types of entities (nodes) and relationships you want to include.

# Define schema
allowed_nodes = ["Person", "Company", "Location"]
allowed_relationships = [
    ("Person", "IS_CEO_OF", "Company"),
    ("Company", "HAS_HEADQUARTERS_IN", "Location"),
]
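A schema like this is easy to get subtly wrong, for example by naming a node type in a relationship triple that is not listed in allowed_nodes. The following is a minimal sketch of a consistency check. `undefined_node_types` is a hypothetical helper written for this article, not part of LangChain or Kùzu.

```python
allowed_nodes = ["Person", "Company", "Location"]
allowed_relationships = [
    ("Person", "IS_CEO_OF", "Company"),
    ("Company", "HAS_HEADQUARTERS_IN", "Location"),
]


def undefined_node_types(nodes, relationships):
    """Return node types used in relationship triples but absent from nodes."""
    known = set(nodes)
    unknown = set()
    for source, _rel_type, target in relationships:
        for node_type in (source, target):
            if node_type not in known:
                unknown.add(node_type)
    return sorted(unknown)


problems = undefined_node_types(allowed_nodes, allowed_relationships)
if problems:
    print("Relationship triples reference undefined node types:", problems)
else:
    print("Schema is consistent.")
```

Running a check like this before calling the LLM avoids spending tokens on a schema that cannot be satisfied.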

Step 2: Transform Text into Graph Documents

Use the LLMGraphTransformer class to process the text into structured graph documents:

import os

from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

# Define the LLMGraphTransformer, restricted to the schema defined above.
# The API key is read from the OPENAI_API_KEY environment variable
# rather than being hard-coded.
llm_transformer = LLMGraphTransformer(
    llm=ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0,
        api_key=os.environ["OPENAI_API_KEY"],
    ),
    allowed_nodes=allowed_nodes,
    allowed_relationships=allowed_relationships,
)

documents = [Document(page_content=text)]
graph_documents = llm_transformer.convert_to_graph_documents(documents)

Step 3: Add Graph Documents to Kùzu

Load the graph documents into Kùzu for further use:

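from langchain_kuzu.graphs.kuzu_graph import KuzuGraph

# allow_dangerous_requests is an argument of the KuzuGraph constructor,
# not of add_graph_documents.
graph = KuzuGraph(db, allow_dangerous_requests=True)
graph.add_graph_documents(graph_documents, include_source=True)
graph_documents[:2]

Note: Pass allow_dangerous_requests=True to the KuzuGraph constructor. The flag acknowledges that LLM-generated Cypher can modify the database. If it is left at its default, expect an error asking you to set it.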

Output:

[GraphDocument(nodes=[Node(id='Tim Cook', type='Person', properties={}),
Node(id='Apple', type='Company', properties={}),
Node(id='California', type='Location', properties={})],
relationships=[Relationship(source=Node(id='Tim Cook', type='Person', properties={}),
target=Node(id='Apple', type='Company', properties={}), type='IS_CEO_OF', properties={}),
Relationship(source=Node(id='Apple', type='Company', properties={}),
target=Node(id='California', type='Location', properties={}),
type='HAS_HEADQUARTERS_IN', properties={})],
source=Document(metadata={}, page_content='Tim Cook is the CEO of Apple. Apple has its headquarters in California.'))]
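The nested repr above is hard to scan. One way to review what was extracted is to flatten each graph document into (source, relationship, target) triples. The sketch below relies only on the attributes visible in the output (`relationships`, `source`, `target`, `type`, `id`). `to_triples` is a hypothetical helper, and the stand-in objects exist only so the example runs without an LLM call.

```python
from types import SimpleNamespace


def to_triples(graph_document):
    """Flatten a graph document into (source id, relationship type, target id) tuples."""
    return [
        (rel.source.id, rel.type, rel.target.id)
        for rel in graph_document.relationships
    ]


# Stand-in objects mirroring the attributes shown in the output above.
tim = SimpleNamespace(id="Tim Cook", type="Person")
apple = SimpleNamespace(id="Apple", type="Company")
california = SimpleNamespace(id="California", type="Location")
example_doc = SimpleNamespace(
    nodes=[tim, apple, california],
    relationships=[
        SimpleNamespace(source=tim, target=apple, type="IS_CEO_OF"),
        SimpleNamespace(source=apple, target=california, type="HAS_HEADQUARTERS_IN"),
    ],
)

for triple in to_triples(example_doc):
    print(triple)
# ('Tim Cook', 'IS_CEO_OF', 'Apple')
# ('Apple', 'HAS_HEADQUARTERS_IN', 'California')
```

With real data, the same function can be called as to_triples(graph_documents[0]).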

Querying the Graph

With the KuzuQAChain, you can query the graph using natural language:

# Add the graph document to the graph
graph.add_graph_documents(
    graph_documents,
    include_source=True,
)

import os

from langchain_kuzu.chains.graph_qa.kuzu import KuzuQAChain

# Create the KuzuQAChain with verbosity enabled to see the generated Cypher queries.
# The API key is read from the OPENAI_API_KEY environment variable.
chain = KuzuQAChain.from_llm(
    llm=ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0.3,
        api_key=os.environ["OPENAI_API_KEY"],
    ),
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,
)

chain.invoke("Where is Apple headquartered?")

Output:

> Entering new KuzuQAChain chain...
Generated Cypher:
MATCH (c:Company {id: 'Apple'})-[:HAS_HEADQUARTERS_IN]->(l:Location) RETURN l
Full Context:
[{'l': {'_id': {'offset': 0, 'table': 1}, '_label': 'Location', 'id': 'California', 'type': 'entity'}}]

> Finished chain.
{'query': 'Where is Apple headquartered?',
 'result': 'Apple is headquartered in California.'}
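Because verbose=True prints the generated Cypher, the logged queries can be screened to confirm that the LLM is only reading from the graph. The sketch below is a conservative keyword check written for this article, not a feature of the chain. The list of write clauses is an assumption and is not an exhaustive Cypher grammar, and the chain executes its queries itself, so this is a review aid rather than an enforcement mechanism.

```python
import re

# Clauses that modify data or schema (assumed list, deliberately conservative).
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|SET|DROP|ALTER|COPY)\b", re.IGNORECASE
)


def looks_read_only(cypher):
    """Return True if no write-clause keyword appears in the query text."""
    return WRITE_CLAUSES.search(cypher) is None


generated = (
    "MATCH (c:Company {id: 'Apple'})-[:HAS_HEADQUARTERS_IN]->(l:Location) RETURN l"
)
print(looks_read_only(generated))  # True
print(looks_read_only("MATCH (n) DETACH DELETE n"))  # False
```

A keyword inside a string literal would also trigger the check, so a False result means "review this query", not "this query definitely writes".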

Unlocking Advanced Features

The LangChain-Kùzu integration offers several advanced features to enhance your workflows:

  • Dynamic Schema Updates: Automatically refresh schemas when the graph is updated.
  • Custom LLM Pairing: Use separate LLMs for Cypher generation and answer generation to optimize performance.
  • Comprehensive Graph Inspection: Easily inspect nodes, relationships, and schema with intuitive commands.
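
The features above can be sketched in a few lines. This is a hedged example rather than a definitive recipe: refresh_schema(), cypher_llm, and qa_llm are the names this article mentions, `get_schema` is assumed to be a property on the graph object as in other LangChain graph classes, the "gpt-4o" model choice is illustrative, and both helper functions are hypothetical. ChatOpenAI is left to pick up the API key from the OPENAI_API_KEY environment variable.

```python
def refreshed_schema(graph):
    """Refresh the graph's cached schema and return it as text."""
    graph.refresh_schema()
    # Assumption: get_schema is a property, not a method.
    return graph.get_schema


def build_two_llm_chain(graph, cypher_model="gpt-4o-mini", qa_model="gpt-4o"):
    """Create a KuzuQAChain with separate models for Cypher and for answers."""
    # Imported lazily so that defining this helper does not require the packages.
    from langchain_kuzu.chains.graph_qa.kuzu import KuzuQAChain
    from langchain_openai import ChatOpenAI

    return KuzuQAChain.from_llm(
        cypher_llm=ChatOpenAI(model=cypher_model, temperature=0),
        qa_llm=ChatOpenAI(model=qa_model, temperature=0),
        graph=graph,
        verbose=True,
        allow_dangerous_requests=True,
    )


# Usage, assuming `graph` is the KuzuGraph created earlier:
# print(refreshed_schema(graph))
# chain = build_two_llm_chain(graph)
# chain.invoke("Who is the CEO of Apple?")
```

Splitting the models this way lets a cheaper model handle query generation while a stronger one phrases the final answer, or the reverse, depending on where errors show up.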

Kùzu is a high-performance, embeddable graph database built for modern applications. Key highlights include:

  • Cypher Query Support: Declaratively query property graphs using Cypher.
  • Embedded Architecture: Run in-process without requiring server setup.
  • Flexible Data Import: Handle data from various formats like CSV, JSON, and relational databases.
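
As a rough illustration of the CSV import path, the sketch below writes a small CSV file and builds the Cypher statements that would load it. The City table, the file name, and all three helper functions are hypothetical, and the CREATE NODE TABLE and COPY ... FROM syntax is written from memory of Kùzu's Cypher dialect, so check it against the Kùzu documentation for your version.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    """Write rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def copy_statement(table, csv_path):
    """Build a COPY FROM statement for a CSV file that has a header row."""
    return f'COPY {table} FROM "{Path(csv_path).as_posix()}" (header=true)'


def import_cities(conn, csv_path):
    """Create a City node table and bulk-load it through a Kùzu connection."""
    conn.execute("CREATE NODE TABLE City(name STRING, PRIMARY KEY (name))")
    conn.execute(copy_statement("City", csv_path))


csv_path = Path(tempfile.mkdtemp()) / "cities.csv"
write_csv(csv_path, ["name"], [["Cupertino"], ["Waterloo"]])
print(copy_statement("City", csv_path))

# Usage, assuming `conn` is the kuzu.Connection created earlier:
# import_cities(conn, csv_path)
```

Bulk loading with COPY is generally preferable to issuing one CREATE statement per row when the data already lives in files.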

Explore more in the Kùzu documentation.

Getting Started with LangChain-Kùzu

To begin your journey:

  • Install the Package: Start with pip install langchain-kuzu.
  • Define Your Graph Schema: Tailor it to your specific needs.
  • Leverage LLMs: Use LangChain’s tools to create and query graphs effortlessly.

Visit the PyPI page for more detailed examples and updates. Don’t forget to star our repository on GitHub and share your feedback—your input drives our progress!

Conclusion

The langchain-kuzu integration redefines how you interact with unstructured data. Whether it’s transforming text into structured graphs or querying those graphs with natural language, this package unlocks powerful possibilities for AI-driven data insights. Try it today and discover a more intuitive way to work with graph data!

Key Takeaways

  • The LangChain-Kùzu integration simplifies transforming unstructured text into structured graph databases.
  • Define customizable graph schemas to extract meaningful entities and relationships tailored to your data.
  • Leverage LangChain’s LLMs for intuitive natural language querying and text-to-graph conversion.
  • Kùzu offers dynamic schema updates, seamless integration, and support for Cypher queries for enhanced workflows.
  • This integration empowers AI enthusiasts, developers, and data scientists to unlock powerful insights from graph data.

Frequently Asked Questions

Q1. How do I install the langchain-kuzu package?

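A. Run the command pip install langchain-kuzu. Make sure your Python version is one that LangChain supports; recent LangChain releases require Python 3.9 or later, so check the package's PyPI page for the exact requirement.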

Q2. What LLMs are supported by the package?

A. The package supports OpenAI’s GPT models and can be extended to other LLM providers supported by LangChain.

Q3. Can I use a custom schema for my graph?

A. Yes, you can define your own schema by specifying the nodes and relationships you want to extract from the text.

Q4. What should I do if my graph schema doesn’t update after adding documents?

A. The schema refreshes automatically when you invoke the chain. If it still looks stale, call the refresh_schema() method on the KuzuGraph object to update it manually.

Q5. Is it possible to use different LLMs for Cypher generation and answer generation?

A. Absolutely! You can configure separate LLMs for these tasks by passing the cypher_llm and qa_llm parameters when you create the KuzuQAChain.

Q6. What formats are supported for importing data into Kùzu?

A. Kùzu supports data from CSV, JSON, and relational databases, making it highly versatile.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Hi! I'm Adarsh, a Business Analytics graduate from ISB, currently deep into research and exploring new frontiers. I'm super passionate about data science, AI, and all the innovative ways they can transform industries. Whether it's building models, working on data pipelines, or diving into machine learning, I love experimenting with the latest tech. AI isn't just my interest, it's where I see the future heading, and I'm always excited to be a part of that journey!
