LangChain for experienced – memory and agents. part 2
As noted in the previous part, agents are one of the most powerful tools of the LangChain framework. They allow you to connect to external resources, which greatly expands the capabilities of the language model.
In this part, we will move on to the more advanced features of agents and learn how to use them to work with your own database and modeling.
Create a database
To connect an agent to a database, you must first create the database itself. Let’s imagine that our bot customer is a small shop that sells printed t-shirts. Let’s say its base has the following structure:
Table shirts:
Fields:
id: Unique identifier for the T-shirt
type: T-shirt type (e.g. “Type 1”, “Type 2”, etc.)
color: T-shirt color
size: T-shirt size
quantity: Number of T-shirts in stock
print_id: Foreign key to table “prints”
price: T-shirt price including attributes
Table prints:
Fields:
id: Unique identifier for the print
name: The name of the print
image_url: URL of the print image
Table clients:
Fields:
id: Unique client identifier
name: Client name
address: Client’s address
contact_info: Customer contact information
Table orders :
Fields:
id: Unique order identifier
order_date: Order date
client_id: Foreign key to “clients” table
price: Total price of the order
The structure is not perfect, but it is better not to complicate it yet.
Now we can translate the idea into code. I will use SQLAlchemy. By using SQLAlchemy you can create objects that represent tables in a database and work with them through python. It simplifies interaction with databases and helps to avoid errors associated with writing SQL queries.
You may have a question: why bother when you can just download a ready-made database and test it. Indeed, this is an easier way. But for simplicity, we will pay with freedom and flexibility. In addition, LangChain tools work with alchemy, so it will be useful to try this framework in work.
Base creation code:
from sqlalchemy import Column, Integer, String, ForeignKey, Date, Numeric, Table
from sqlalchemy.orm import relationship
from sqlalchemy.orm import declarative_base
from sqlalchemy import create_engine
Base = declarative_base()
class TShirt(Base):
__tablename__ = 'tshirts'
id = Column(Integer, primary_key=True)
type = Column(String) # Тип футболки (например, "Type 1", "Type 2" и т.д.)
color = Column(String)
size = Column(String)
quantity = Column(Integer) # Количество футболок в наличии
print_id = Column(Integer, ForeignKey('prints.id')) # Внешний ключ на таблицу "Prints"
print = relationship("Print") # Отношение между футболками и принтами
price = Column(Numeric(10, 2)) # Цена футболки с учетом атрибутов
def calculate_tshirt_price(self):
base_price = 500.0
type_multiplier = 1.2
color_multiplier = 1.3
size_multiplier = 1
if self.size in ['XL', 'XXL']:
size_multiplier = 1.5
price = base_price * type_multiplier * color_multiplier * size_multiplier
self.price = round(price, 2)
class Print(Base):
__tablename__ = 'prints'
id = Column(Integer, primary_key=True)
name = Column(String) # Название принта
image_url = Column(String) # URL изображения принта
class Client(Base):
__tablename__ = 'clients'
id = Column(Integer, primary_key=True)
name = Column(String) # Имя клиента
address = Column(String) # Адрес клиента
contact_info = Column(String) # Контактная информация клиента
orders = relationship("Order", back_populates="client") # Отношение
order_tshirt_table = Table('order_tshirt', Base.metadata,
Column('order_id', Integer, ForeignKey('orders.id'), primary_key=True),
Column('tshirt_id', Integer, ForeignKey('tshirts.id'), primary_key=True)
)
class Order(Base):
__tablename__ = 'orders'
id = Column(Integer, primary_key=True)
order_date = Column(Date) # Дата заказа
client_id = Column(Integer, ForeignKey('clients.id')) # Внешний ключ на таблицу "Clients"
client = relationship("Client", back_populates="orders") # Отношение между заказами и клиентами
tshirts = relationship("TShirt", secondary=order_tshirt_table)
quantity = Column(Integer)
price = Column(Numeric(10, 2)) # Общая стоимость заказа
def calculate_order_price(self):
total_price = 0.0
self.quantity = len(self.tshirts)
for tshirt in self.tshirts:
total_price += tshirt.price
self.price = round(total_price, 2) * self.quantity
def create_tables(engine):
Base.metadata.create_all(engine)
print('Tables created successfully!')
if __name__ == '__main__':
# Создаем экземпляр движка базы данных
engine = create_engine('sqlite:///tshirt_store.db')
print('Сreate engine')
# Создаем таблицы
create_tables(engine)
Filling the base
Okay, let’s populate the database with t-shirts. Here is a sample code that will help us sample any number of T-shirts using SQLAlchemy:
import random
from random import randint
from create_database import TShirt, Print
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
def random_color():
colors = ['Red', 'Blue', 'Green', 'Yellow', 'Black', 'White']
return random.choice(colors)
def random_size():
sizes = ['S', 'M', 'L', 'XL', 'XXL']
return random.choice(sizes)
def create_tshirt_samples(session, num_samples):
tshirt_samples = []
print_samples = []
for _ in range(num_samples):
tshirt = TShirt(
type=f'Type {randint(1, 5)}',
color=random_color(),
size=random_size(),
quantity=randint(1, num_samples // 5),
print_id=0, # Заглушка для внешнего ключа, будет обновлено ниже
price=0.0
)
tshirt.calculate_tshirt_price() # Вычисление цены футболки
tshirt_samples.append(tshirt)
print = Print(
name=f'Print_{randint(1, num_samples * 2)}',
image_url=f'https://example.com/print_{randint(1, 10)}.jpg'
)
print_samples.append(print)
session.add_all(print_samples)
session.flush() # Получение сгенерированных ID принтов
for i, tshirt in enumerate(tshirt_samples):
tshirt.print_id = print_samples[i].id
session.add_all(tshirt_samples)
session.commit()
if __name__ == '__main__':
# Создаем экземпляр движка базы данных
engine = create_engine('sqlite:///tshirt_store.db')
print('Сreate engine')
Session = sessionmaker(bind=engine)
session = Session()
create_tshirt_samples(session, 50)
print('Shirts successfully added into DB')
session.close()
If nothing falls, then you should have a sign with T-shirts:
The next step is to place orders. Let’s try to do it according to a certain functional dependence: for example, periodic with a small trend.
import random
from datetime import datetime, timedelta
from random import randint
from create_database import Order, Client, TShirt
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine, func
from math import sin, pi
def create_random_client(n):
# Создание случайного клиента с использованием случайно сгенерированных имени, адреса и контактной информации
name = f'Client_{randint(1, n)}'
address = f'Address_{randint(1, n)}'
contact_info = f'Contact_{randint(1, n)}'
return Client(name=name, address=address, contact_info=contact_info)
def get_random_client(session, r_id=1000, client_proba=0.5):
# Получение случайного клиента из базы данных
client = session.query(Client).order_by(func.random()).first()
if client is None or client_proba < 0.5:
# Создание случайного клиента, если база данных клиентов пуста или вероятность меньше 0.5
client = create_random_client(r_id)
session.add(client)
session.commit()
return client
def get_random_tshirts(session, num_tshirts):
# Получение случайного количества футболок из базы данных
tshirts = session.query(TShirt).order_by(func.random()).limit(num_tshirts).all()
return tshirts
def create_orders(
session,
start_date,
end_date,
order_function,
):
current_date = start_date
while current_date <= end_date:
# Получение количества заказов на текущую дату с использованием функции order_function
num_orders = order_function(current_date)
for _ in range(num_orders):
# Получение случайного клиента
client = get_random_client(session, client_proba=random.random())
# Создание экземпляра класса Order с указанной датой и клиентом
order = Order(order_date=current_date, client=client)
order.client = client
# Получение случайных футболок
tshirts = get_random_tshirts(session, randint(1, num_orders))
order.tshirts = tshirts
# Вычисление общей цены заказа
order.calculate_order_price()
session.add(order)
current_date += timedelta(days=1)
session.commit()
def order_function(date):
day_of_year = date.timetuple().tm_yday
amplitude = 10 # Амплитуда количества заказов
period = 7 # Период синусоиды в днях
angle = 2 * pi * (day_of_year % period) / period
num_orders = amplitude * (sin(angle) + 1)
return int(num_orders * day_of_year / 365)
if __name__ == '__main__':
# Создаем экземпляр движка базы данных
engine = create_engine('sqlite:///tshirt_store.db')
print('Сreate engine')
Session = sessionmaker(bind=engine)
session = Session()
start_date = datetime(2023, 5, 1)
end_date = datetime(2023, 6, 30)
create_orders(session, start_date, end_date, order_function)
print('Orders successfully added into DB')
session.close()
A table similar to this should appear:
Now we have the foundation for the agent to work.
Create an agent
For the convenience of further reproduction, I will write the code in snippets, and display the results as screenshots of jupyter notebook cells. Let me remind you of three sources, three components of an agent:
language model
Tool
Agent body
import os
from langchain.sql_database import SQLDatabase
from langchain.chains import SQLDatabaseChain
from langchain.chat_models import ChatOpenAI
from langchain.agents import Tool
from langchain.agents import initialize_agent
# Инициализируем языковую модель
llm = ChatOpenAI(
openai_api_key=open_ai_api_key,
model_name="gpt-3.5-turbo",
temperature=0
)
# Создаем движок для работы с БД
engine = create_engine('sqlite:///tshirt_store.db')
# Регистрируем движок в langchain
db = SQLDatabase(engine)
# Создаем цепочку для работы с базой
sql_chain = SQLDatabaseChain(
llm=llm,
database=db,
verbose=True
)
# Создаем на основе цепочки инструмент
db_tool = Tool(
name="tshirt store db",
func=sql_chain.run,
description='Use for extaract data from database'
)
# Ну и наконец создаем самого агента(пока без использования памяти)
db_agent = initialize_agent(
agent="zero-shot-react-description",
tools=[db_tool],
llm=llm,
verbose=True,
max_iterations=5,
)
It’s time to start asking questions to our expert:
Okay, let’s try to ask a more difficult question:
It can be seen that the chain of reasoning is correct. The agent even correctly figured out how to link different tables to get the right answer.
In fact, a request to our database can be made without an agent, but directly through the chain sql_chain
:
But now the chain failed with the second request:
Let’s see what our agent’s prompt template looks like:
And this is how the prompt for the chain looks like:
This example once again reminds us of the importance of prompts for the language model. If you want to learn the literacy of prompting, then Andrew Ng washed down a free wellwhich focuses on the basics but is still quite useful.
The agent seems to work perfectly, but breaking it turned out to be quite simple:
What to do? You can try to modify the prompt. However, there is another way – to use a more powerful tool. LangChain has SQLDatabaseToolkit
under the hood of this Swiss knife there are 4 tools at once:
QuerySQLDataBaseTool
InfoSQLDatabaseTool
ListSQLDatabaseTool
QueryCheckerTool
Let’s see if they can help us. But first we need to initialize the new agent:
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
db_agent_toolkit = create_sql_agent(
llm=llm,
toolkit=toolkit,
verbose=True
)
We ask the same question:
The advantages are obvious: firstly, the agent understands which tables are available to him; secondly, the request handler works, which does not allow the agent to fall if the wording is incorrect.
We have learned how to extract information from the database. Why don’t we teach the agent to build models based on this data? Initially, I planned to write my own tool for this task, but it turned out that there is already a ready-made implementation in the form PythonREPLTool
. Well, let’s add this tool to our arsenal.
from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
agent_model = create_python_agent(
llm=llm,
tool=PythonREPLTool(),
verbose=True
)
Let’s try to predict orders for T-shirts:
Well, leather, impressed? Time elves people are gone, the time of agents has come. On this optimistic note, we can finish with the agents for now. In the final part, we will try to write our own tools for working with agents. Thank you for your attention!
I write about AI and NLP in telegram.