LangChain for experienced – memory and agents. part 2

Here

As noted in the previous part, agents are one of the most powerful tools of the LangChain framework. They allow you to connect to external resources, which greatly expands the capabilities of the language model.

In this part, we will move on to the more advanced features of agents and learn how to use them to work with your own database and modeling.

Create a database

To connect an agent to a database, you must first create the database itself. Let’s imagine that our bot customer is a small shop that sells printed t-shirts. Let’s say its base has the following structure:

Table shirts:

  • Fields:

    • id: Unique identifier for the T-shirt

    • type: T-shirt type (e.g. “Type 1”, “Type 2”, etc.)

    • color: T-shirt color

    • size: T-shirt size

    • quantity: Number of T-shirts in stock

    • print_id: Foreign key to table “prints”

    • price: T-shirt price including attributes

Table prints:

  • Fields:

    • id: Unique identifier for the print

    • name: The name of the print

    • image_url: URL of the print image

Table clients:

  • Fields:

    • id: Unique client identifier

    • name: Client name

    • address: Client’s address

    • contact_info: Customer contact information

Table orders :

  • Fields:

    • id: Unique order identifier

    • order_date: Order date

    • client_id: Foreign key to “clients” table

    • price: Total price of the order

The structure is not perfect, but it is better not to complicate it yet.

Now we can translate the idea into code. I will use SQLAlchemy. By using SQLAlchemy you can create objects that represent tables in a database and work with them through python. It simplifies interaction with databases and helps to avoid errors associated with writing SQL queries.

You may have a question: why bother when you can just download a ready-made database and test it. Indeed, this is an easier way. But for simplicity, we will pay with freedom and flexibility. In addition, LangChain tools work with alchemy, so it will be useful to try this framework in work.

Base creation code:

from sqlalchemy import Column, Integer, String, ForeignKey, Date, Numeric, Table  
from sqlalchemy.orm import relationship  
from sqlalchemy.orm import declarative_base  
  
from sqlalchemy import create_engine  
  
Base = declarative_base()  
  
  
class TShirt(Base):  
    __tablename__ = 'tshirts'  
    id = Column(Integer, primary_key=True)  
    type = Column(String)  # Тип футболки (например, "Type 1", "Type 2" и т.д.)  
    color = Column(String)   
    size = Column(String)  
    quantity = Column(Integer)  # Количество футболок в наличии  
    print_id = Column(Integer, ForeignKey('prints.id'))  # Внешний ключ на таблицу "Prints"  
    print = relationship("Print")  # Отношение между футболками и принтами  
    price = Column(Numeric(10, 2))  # Цена футболки с учетом атрибутов  
  
    def calculate_tshirt_price(self):  
        base_price = 500.0  
        type_multiplier = 1.2  
        color_multiplier = 1.3  
        size_multiplier = 1  
        if self.size in ['XL', 'XXL']:  
            size_multiplier = 1.5  
        price = base_price * type_multiplier * color_multiplier * size_multiplier  
        self.price = round(price, 2)  
  
  
class Print(Base):  
    __tablename__ = 'prints'  
    id = Column(Integer, primary_key=True)  
    name = Column(String)  # Название принта  
    image_url = Column(String)  # URL изображения принта  
  
  
class Client(Base):  
    __tablename__ = 'clients'  
    id = Column(Integer, primary_key=True)  
    name = Column(String)  # Имя клиента  
    address = Column(String)  # Адрес клиента  
    contact_info = Column(String)  # Контактная информация клиента  
    orders = relationship("Order", back_populates="client")  # Отношение  
  
  
order_tshirt_table = Table('order_tshirt', Base.metadata,  
                           Column('order_id', Integer, ForeignKey('orders.id'), primary_key=True),  
                           Column('tshirt_id', Integer, ForeignKey('tshirts.id'), primary_key=True)  
                           )  
  
  
class Order(Base):  
    __tablename__ = 'orders'  
    id = Column(Integer, primary_key=True)  
    order_date = Column(Date)  # Дата заказа  
    client_id = Column(Integer, ForeignKey('clients.id'))  # Внешний ключ на таблицу "Clients"  
    client = relationship("Client", back_populates="orders")  # Отношение между заказами и клиентами  
    tshirts = relationship("TShirt", secondary=order_tshirt_table)  
    quantity = Column(Integer)  
    price = Column(Numeric(10, 2))  # Общая стоимость заказа  
  
    def calculate_order_price(self):  
        total_price = 0.0  
        self.quantity = len(self.tshirts)  
        for tshirt in self.tshirts:  
            total_price += tshirt.price  
        self.price = round(total_price, 2) * self.quantity  
  
  
def create_tables(engine):  
    Base.metadata.create_all(engine)  
    print('Tables created successfully!')  
  
  
if __name__ == '__main__':  
    # Создаем экземпляр движка базы данных  
    engine = create_engine('sqlite:///tshirt_store.db')  
    print('Сreate engine')  
  
    # Создаем таблицы  
    create_tables(engine)

Filling the base

Okay, let’s populate the database with t-shirts. Here is a sample code that will help us sample any number of T-shirts using SQLAlchemy:

import random  
from random import randint  
  
from create_database import TShirt, Print  
from sqlalchemy import create_engine  
from sqlalchemy.orm import sessionmaker  
  
  
def random_color():  
    colors = ['Red', 'Blue', 'Green', 'Yellow', 'Black', 'White']  
    return random.choice(colors)  
  
  
def random_size():  
    sizes = ['S', 'M', 'L', 'XL', 'XXL']  
    return random.choice(sizes)  
  
  
def create_tshirt_samples(session, num_samples):  
    tshirt_samples = []  
    print_samples = []  
    for _ in range(num_samples):  
        tshirt = TShirt(  
            type=f'Type {randint(1, 5)}',  
            color=random_color(),  
            size=random_size(),  
            quantity=randint(1, num_samples // 5),  
            print_id=0,  # Заглушка для внешнего ключа, будет обновлено ниже  
            price=0.0  
        )  
        tshirt.calculate_tshirt_price()  # Вычисление цены футболки  
        tshirt_samples.append(tshirt)  
  
        print = Print(  
            name=f'Print_{randint(1, num_samples * 2)}',  
            image_url=f'https://example.com/print_{randint(1, 10)}.jpg'  
        )  
        print_samples.append(print)  
  
    session.add_all(print_samples)  
    session.flush()  # Получение сгенерированных ID принтов  
  
    for i, tshirt in enumerate(tshirt_samples):  
        tshirt.print_id = print_samples[i].id  
  
    session.add_all(tshirt_samples)  
    session.commit()  
  
  
if __name__ == '__main__':  
    # Создаем экземпляр движка базы данных  
    engine = create_engine('sqlite:///tshirt_store.db')  
    print('Сreate engine')  
  
    Session = sessionmaker(bind=engine)  
    session = Session()  
  
    create_tshirt_samples(session, 50)  
    print('Shirts successfully added into DB')  
  
    session.close()

If nothing falls, then you should have a sign with T-shirts:

Table with t-shirts

Table with t-shirts

The next step is to place orders. Let’s try to do it according to a certain functional dependence: for example, periodic with a small trend.

import random
from datetime import datetime, timedelta
from random import randint
from create_database import Order, Client, TShirt
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine, func

from math import sin, pi


def create_random_client(n):
    # Создание случайного клиента с использованием случайно сгенерированных имени, адреса и контактной информации
    name = f'Client_{randint(1, n)}'
    address = f'Address_{randint(1, n)}'
    contact_info = f'Contact_{randint(1, n)}'
    return Client(name=name, address=address, contact_info=contact_info)


def get_random_client(session, r_id=1000, client_proba=0.5):
    # Получение случайного клиента из базы данных
    client = session.query(Client).order_by(func.random()).first()
    if client is None or client_proba < 0.5:
        # Создание случайного клиента, если база данных клиентов пуста или вероятность меньше 0.5
        client = create_random_client(r_id)
        session.add(client)
        session.commit()
    return client


def get_random_tshirts(session, num_tshirts):
    # Получение случайного количества футболок из базы данных
    tshirts = session.query(TShirt).order_by(func.random()).limit(num_tshirts).all()
    return tshirts


def create_orders(
        session,
        start_date,
        end_date,
        order_function,
):
    current_date = start_date
    while current_date <= end_date:
        # Получение количества заказов на текущую дату с использованием функции order_function
        num_orders = order_function(current_date)
        for _ in range(num_orders):
            # Получение случайного клиента
            client = get_random_client(session, client_proba=random.random())
            # Создание экземпляра класса Order с указанной датой и клиентом
            order = Order(order_date=current_date, client=client)
            order.client = client
            # Получение случайных футболок
            tshirts = get_random_tshirts(session, randint(1, num_orders))
            order.tshirts = tshirts

            # Вычисление общей цены заказа
            order.calculate_order_price()

            session.add(order)

        current_date += timedelta(days=1)

    session.commit()


def order_function(date):  
    day_of_year = date.timetuple().tm_yday  
    amplitude = 10  # Амплитуда количества заказов  
    period = 7  # Период синусоиды в днях  
  
    angle = 2 * pi * (day_of_year % period) / period  
    num_orders = amplitude * (sin(angle) + 1)  
  
    return int(num_orders * day_of_year / 365)


if __name__ == '__main__':
    # Создаем экземпляр движка базы данных
    engine = create_engine('sqlite:///tshirt_store.db')
    print('Сreate engine')

    Session = sessionmaker(bind=engine)
    session = Session()

    start_date = datetime(2023, 5, 1)
    end_date = datetime(2023, 6, 30)
    
	create_orders(session, start_date, end_date, order_function)  
	
	print('Orders successfully added into DB')
	  
	session.close()

A table similar to this should appear:

Table with orders

Table with orders

Now we have the foundation for the agent to work.

Create an agent

For the convenience of further reproduction, I will write the code in snippets, and display the results as screenshots of jupyter notebook cells. Let me remind you of three sources, three components of an agent:

  1. language model

  2. Tool

  3. Agent body

import os

from langchain.sql_database import SQLDatabase
from langchain.chains import SQLDatabaseChain

from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool
from langchain.agents import initialize_agent

# Инициализируем языковую модель
llm = ChatOpenAI(
	openai_api_key=open_ai_api_key,
	model_name="gpt-3.5-turbo",
	temperature=0
)
# Создаем движок для работы с БД
engine = create_engine('sqlite:///tshirt_store.db')

# Регистрируем движок в langchain
db = SQLDatabase(engine)

# Создаем цепочку для работы с базой
sql_chain = SQLDatabaseChain(
	llm=llm,
	database=db,
	verbose=True
)

# Создаем на основе цепочки инструмент
db_tool = Tool(
    name="tshirt store db",
    func=sql_chain.run,
    description='Use for extaract data from database' 
)

# Ну и наконец создаем самого агента(пока без использования памяти)
db_agent = initialize_agent(
    agent="zero-shot-react-description", 
    tools=[db_tool], 
    llm=llm,
    verbose=True,
    max_iterations=5,
)

It’s time to start asking questions to our expert:

Order Quantity Request

Order Quantity Request

Okay, let’s try to ask a more difficult question:

Popular T-shirt Type Request

Popular T-shirt Type Request

It can be seen that the chain of reasoning is correct. The agent even correctly figured out how to link different tables to get the right answer.

In fact, a request to our database can be made without an agent, but directly through the chain sql_chain:

Request the number of orders through the chain

Request the number of orders through the chain

But now the chain failed with the second request:

Let’s see what our agent’s prompt template looks like:

Agent Prompt

Agent Prompt

And this is how the prompt for the chain looks like:

chain prompt

chain prompt

This example once again reminds us of the importance of prompts for the language model. If you want to learn the literacy of prompting, then Andrew Ng washed down a free wellwhich focuses on the basics but is still quite useful.

The agent seems to work perfectly, but breaking it turned out to be quite simple:

Fall of an agent

Fall of an agent

What to do? You can try to modify the prompt. However, there is another way – to use a more powerful tool. LangChain has SQLDatabaseToolkitunder the hood of this Swiss knife there are 4 tools at once:

  • QuerySQLDataBaseTool

  • InfoSQLDatabaseTool

  • ListSQLDatabaseTool

  • QueryCheckerTool

Let’s see if they can help us. But first we need to initialize the new agent:

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit

toolkit = SQLDatabaseToolkit(db=db, llm=llm)

db_agent_toolkit = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True
)

We ask the same question:

Using SQLDatabaseToolkit

Using SQLDatabaseToolkit

The advantages are obvious: firstly, the agent understands which tables are available to him; secondly, the request handler works, which does not allow the agent to fall if the wording is incorrect.

We have learned how to extract information from the database. Why don’t we teach the agent to build models based on this data? Initially, I planned to write my own tool for this task, but it turned out that there is already a ready-made implementation in the form PythonREPLTool. Well, let’s add this tool to our arsenal.

from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool

agent_model = create_python_agent(
    llm=llm,
    tool=PythonREPLTool(),
    verbose=True
)

Let’s try to predict orders for T-shirts:

Agent calls ml model

Agent calls ml model

Well, leather, impressed? Time elves people are gone, the time of agents has come. On this optimistic note, we can finish with the agents for now. In the final part, we will try to write our own tools for working with agents. Thank you for your attention!

I write about AI and NLP in telegram.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *