Function Calls here and now for the smallest… resources

Introduction

You can skip the introduction if you know:

Why Open Source is Better than the OpenAI API

And for those who stayed: let's look at the pros and cons!

Pros:

  1. Free models.

    The image shows Khaby Lame

  2. Data management and privacy: Hosting models locally ensures your data never leaves your system.

  3. Having the source code available allows for better understanding and bug fixing.

  4. Possibility of fine-tuning for your needs.

  5. Community support with constant updates.

  6. No dependence on vendors and their policies.

Additionally, it is possible to use a locally deployed LLM together with the OpenAI API.
If, for example, your account runs out of funds or your code stops working with the OpenAI API for some other reason, a local model can come to the rescue, allowing your workflow to continue without interruption.
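As a minimal sketch of such a fallback (assuming an LM Studio server at http://localhost:1234/v1 with a Mistral model loaded; the endpoint, key, and model name are placeholders for your own setup):

from openai import OpenAI

# The primary client uses the OpenAI cloud (reads OPENAI_API_KEY from the environment);
# the fallback client points at a local OpenAI-compatible server such as LM Studio.
cloud = OpenAI()
local = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

def ask(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    try:
        resp = cloud.chat.completions.create(model="gpt-4o", messages=messages)
    except Exception:
        # Out of funds, network problems, etc.: fall back to the local model.
        resp = local.chat.completions.create(
            model="MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF",
            messages=messages,
        )
    return resp.choices[0].message.content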

Cons:

  1. The need to monitor and maintain the model's operability yourself.

  2. The need to deploy and configure the model yourself.

  3. The need to rent or purchase computing resources.

  4. Proprietary models usually have far more parameters and can be more “intelligent”.

  5. The need to have qualified specialists to manage and maintain the model.

Theoretical basis

What is Function Calling?

In the context of large language models, Function Calling refers to the ability of an LLM to determine, from a user request, which function to execute from an available set and which parameters to pass to it. Instead of generating a plain-text response, an LLM configured for function calling returns structured data, most often in JSON format. This structured data can then be used to execute predefined functions, such as retrieving data from data stores, fetching real-time data, or calling third-party APIs.
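For illustration, the structured part of such a response typically looks something like this (the field names follow the OpenAI-style chat API; the function and its arguments are a made-up example):

# A hypothetical assistant message asking the caller to run a function.
# Note that `arguments` is itself a JSON-encoded string, not a nested object.
{
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "arguments": "{\"location\": \"Minsk\", \"format\": \"celsius\"}",
            },
        }
    ],
}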

Differences between Function Calling and Tools

This question has been asked many times on the Internet. We think many people understand the differences, but it is worth clarifying.
Tools is a broader concept that includes Function Calling and much more. For example, Assistants created with the Assistants API in the OpenAI API support not only Function Calling, but also File Search (a built-in RAG tool for processing and searching files), Code Interpreter (which lets the model write and run Python code and process files and various data), as well as custom tools.
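As a rough sketch (using the OpenAI Assistants API, still in beta at the time of writing; the assistant name, instructions, and function schema below are placeholders), an Assistant can combine several built-in tools with a custom function:

from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Research helper",                        # placeholder name
    instructions="Answer questions using the attached files and tools.",
    model="gpt-4o",
    tools=[
        {"type": "code_interpreter"},               # built-in Python sandbox
        {"type": "file_search"},                    # built-in RAG over uploaded files
        {"type": "function", "function": {          # plus classic Function Calling
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        }},
    ],
)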

From theory to practice

Yes, as has already become clear, tools are a really cool and powerful instrument. However, someone has to use them. The OpenAI API has Assistants for this, but how do we pull it off with local LLMs? That is exactly why we are here!

The Model That Could: Mistral

Introducing our Employee of the Month: Mistral-7B-Instruct-v0.3:

Instead of Krabby Patties, it will cook up plenty of pet projects for you!

This model was released on May 22, 2024, so it may no longer be employee of the month, but we have only just started working with it, and for pet projects it is pretty great.

Improvements in Mistral-7B-v0.3

In Mistral-7B-v0.3, the following changes were made compared to Mistral-7B-v0.2:

  • Vocabulary expanded to 32,768 tokens

    Previously, in v0.2, the vocabulary size was 32,000 tokens. An increase of 768 tokens may not seem like much, but these additional tokens can now cover rare words, new jargon, or domain-specific terms (see the sketch after this list for a quick way to check the sizes).

    For reference: Llama 3 has a vocabulary of 128,256 tokens, while GPT-4o surpasses both of them, expanding the vocabulary to 200,019 tokens compared to 100,277 in GPT-4.

  • Tokenizer v3 supported

  • Function Calling is now supported

    The most interesting part, and the reason we are here, is the ability to call functions. The model can now be used to its full potential in different applications, performing tasks beyond just generating text.
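As a quick way to double-check the vocabulary sizes mentioned in the list above, one can load the tokenizers from Hugging Face (a sketch; the Mistral repositories are gated, so this assumes you have accepted the license and logged in via huggingface-cli):

from transformers import AutoTokenizer

# Compare the vocabulary sizes of v0.2 and v0.3 (expected: 32,000 vs 32,768).
for repo in ("mistralai/Mistral-7B-Instruct-v0.2",
             "mistralai/Mistral-7B-Instruct-v0.3"):
    tokenizer = AutoTokenizer.from_pretrained(repo)
    print(repo, len(tokenizer))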

Comparison with GPT-4o

If you are willing to pay, you can use the Mistral AI API. It is cheaper, and the text-generation quality is almost on par with OpenAI's GPT-4o.

Here is the pricing for GPT-4o and the Mistral models:

Pricing of GPT-4o

Pricing of models from Mistral AI

Well, what about quality? Mistral Large 2 is comparable to GPT-4o. Here is a table of model characteristics from this site:

Comparison of model characteristics

And also this one:

Comparison of the models on MMLU

So, actually, Mistral Large 2 is not bad at all: it has caught up with GPT-4o, and it is also cheaper to use.

Guess whose code is shown below: a tool description for the Mistral AI API or for the OpenAI API?

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        }
    }
]

Correct answer: it will work in both places. The folks at Mistral AI have made an API that is almost exactly the same as OpenAI's.
They are the same, Natasha!
Great, that means if you want to switch from the OpenAI API to the Mistral AI API, some of the code can stay the same.
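For instance, the very same tools list can be passed unchanged to the OpenAI Python client (a minimal sketch; the model name and user message are placeholders):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# `tools` is exactly the list defined above, with no modifications.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the current weather in Minsk, Belarus?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)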

But we came here for free Open Source, so what API are we talking about? That is the point: their API also works for free, against a model deployed in LM Studio or other similar tools.

Here is an example of code:

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

def get_current_weather(location, format):
    # The real function body is omitted; a stubbed result is returned
    return {
        "location": location,
        "temperature": 23,
        "format": format
    }


model="MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF", 
api_key = "lm-studio"
endpoint = "http://localhost:1234/"

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        }
    },
]

messages = [
    ChatMessage(role="user", content="What's is the current weather in Minsk Belarus?")
]

client = MistralClient(api_key=api_key, endpoint=endpoint)
response = client.chat(model=model, messages=messages, tools=tools, tool_choice="any")
print(response)

And here a question arises: the folks at Mistral AI provide both the models and the API, so who is going to buy their keys? Yes, not all of their models are openly available, and they also host them on their own servers, but we do not need a rocket to space. It all seems very tempting, but whether because there is no such thing as a free lunch, or because our hands are crooked, we still could not get function calls to work through their API.

And so we smoothly moved on to what worked great for us: crewAI.

Our savior

More non-working solutions!

At the moment, the following difficulties arise:

  • The number of models that natively support function calling is quite small (among them is the previously chosen Mistral-7B-v0.3).

  • Frameworks and platforms for deploying LLMs locally often do little to make this feature work correctly, even when it is nominally present.

In particular, some models from LocalAI offered the functionality we were interested in, but in practice they showed disappointing results: some were unable to extract the necessary parameters, or insisted on calling a function in situations where the context was clearly unrelated to it.

Using the AutoGen framework together with a Function-Calling-capable model deployed in LM Studio did not produce results either. The agents did not use the declared functions and instead made the model write the code itself. Naturally, that is no good.

Here is the repository we used as a starting point when trying to combine AutoGen and local LLMs to implement function calls.

The Greatness of CrewAI

In the end, replacing AutoGen with the similar crewAI framework, with the same Mistral-7B-Instruct-v0.3 model deployed in LM Studio, bore fruit.

The agent built on top of this LLM actually understood what functions it had and in what situations they should be used. Let us dwell on this in more detail and show, step by step, how to reproduce our results.

The entire system ran on a regular office laptop with 16 GB of RAM and an AMD Ryzen 7 4700U processor, without even a hint of such luxury as a GPU. Speed is not discussed in this article, but this is a definite plus, because almost anyone can launch their own small LLM with cool features for personal use.

Now to the implementation:

  • We used Anaconda to create a Python virtual environment and installed LM Studio on Windows. To get started with CrewAI, we only needed to install two dependencies: pip install crewai crewai_tools.

  • Let's move on to the code. The task before us is as follows:
    1. Load the model.
    2. Connect an agent to it.
    3. Describe the tool functions and inform the agent about their existence.
    4. Create a task.
    5. Start the entire system.

Creating a model object:

from langchain_community.chat_models.openai import ChatOpenAI

OPENAI_API_BASE_URL = "http://localhost:1234/v1"  # The address at which we reach the model served by LM Studio
OPENAI_API_KEY = "NA"  # Not validated by LM Studio in its default configuration
MODEL_NAME = "MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF"  # Optional; what matters is the actual model loaded in LM Studio

default_llm = ChatOpenAI(
    openai_api_base=OPENAI_API_BASE_URL,
    openai_api_key=OPENAI_API_KEY,
    model_name=MODEL_NAME,
)
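Before wiring up agents, it may be worth a quick smoke test that LM Studio is actually serving the model (the prompt here is arbitrary):

# If the endpoint is up and a model is loaded, this prints a short chat reply.
print(default_llm.invoke("Reply with one word: pong").content)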

Creating a tool function for an agent:

from crewai_tools import tool

# Type annotations and a docstring are strictly required according to the CrewAI documentation
# The docstring format is not formally fixed, but the more thoroughly you explain the function to the agent, the better
@tool("Summator") # The name under which the agent will refer to this tool
def find_sum(a: int, b: int) -> int: 
    """
    Function for finding the sum of two `int` numbers.
    Args:
        a (int): The first number.
        b (int): The second number.
    Returns:
        int: The sum of `a` and `b`.
    """
    return a + b

Creating an agent:

from crewai import Agent

general_agent = Agent(
    role="Assistant",
    goal="Answer Questions",
    backstory="",
    allow_delegation=False,
    verbose=True,
    llm=default_llm,
    tools=[find_sum], # This is where we hand the agent the functions it may use
)

All that remains is to create a task for the agent and launch Crew:

from crewai import Crew, Task

task = Task(
    description="12345 + 54321",
    expected_output="Int number",
    agent=general_agent,
)

crew = Crew(agents=[general_agent], tasks=[task], verbose=2)
result = crew.kickoff()
print(result)

Below are the results of running the agent with the described summation function:

As you can see, Function Call actually worked as expected: the model decided that it needed to call an external function, passed the parameters to the agent, and the agent, in turn, directly executed the corresponding code and gave the response to the model.

This is what we originally wanted, and even better! (Well, almost… read on.)

Now let's look at a slightly more interesting example by rewriting the function so that it takes a list of integers to sum. Models often have problems with math, so let's try to partially correct this shortcoming.

The rewritten tool:

@tool("Summator")

def find_sum(nums: List[int]) -> int:
    """
    Function for finding the sum of array of numbers.
    Args:
        nums (List[int]): Array of numbers to find their sum.
    Returns:
        int: The sum of all numbers in array.
    """
    return sum(nums)
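A task for the new tool can be created in exactly the same way as before; for example (the numbers here are arbitrary, purely for illustration):

task = Task(
    description="Find the sum of the numbers 3, 14, 15, 92, 65 and 35",
    expected_output="Int number",
    agent=general_agent,
)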

Below are the results of the query without any functions or tools, and then with the tool:

In the first case, the LLM itself, unfortunately, failed to cope with the task.

In the second case, it was quite capable of extracting the list of numbers and delegating the job of summing them to an external contractor. Simply excellent!

Okay, adding numbers is all well and good, but let's try a more interesting example: an analogue of the demo function from the OpenAI documentation for querying the local weather.

For the third time, here is the code of the function itself, using the Open-Meteo API:

@tool("Weather API")
def get_current_weather(location: str) -> str:
    """
    Function to get current weather in provided city
    Args:
        location (str): The target city.
    Returns:
        str: Weather info as JSON string if answer is valid or error message
    """
    CITY_LATLONG_URL = "https://geocoding-api.open-meteo.com/v1/search?name={}&count=1&language=en&format=json"
    url = CITY_LATLONG_URL.format(location)

    response = requests.get(url)
    if response.status_code == 200:
        try:
            data = response.json()["results"][0]
        except KeyError:
            return f"ERROR: Nonexisting city {location}"
        latitude = data["latitude"]
        longitude = data["longitude"]
    else:
        return f"ERROR: Failed to get city location {location}"
    
    WEATHER_CURRENT_URL = "https://api.open-meteo.com/v1/forecast?latitude={}&longitude={}&current=temperature_2m,relative_humidity_2m,apparent_temperature,rain,showers,snowfall,wind_gusts_10m,wind_direction_10m&wind_speed_unit=ms&timezone=auto"
    url = WEATHER_CURRENT_URL.format(latitude, longitude)
    
    response = requests.get(url)
    if response.status_code == 200:
        return str(response.json())
    else:
        return f"ERROR: Failed to get current weather in {location}"

We also had to slightly modify the agent's description and its task:

general_agent = Agent(
    role="Wheather Assistant",
    goal="Answer Questions and reformat data as human-readable form. Only use tools to get weather info, formatting is your own task",
    backstory="",
    allow_delegation=False,
    verbose=True,
    llm=default_llm,
    tools=[get_current_weather],
)

task = Task(
    description="What is the weather in Minsk now?",
    expected_output="Weather description",
    agent=general_agent,
)

After some time of waiting, the final result is before us:

Note that at the time of testing the function, the weather in Minsk was exactly like this, i.e. the Open-Meteo API works.

The agent worked correctly, the result was obtained, and everyone is happy: peace, friendship, chewing gum! However… there is a small fly in the ointment:

  • With three tools loaded into the agent at the same time, the model began to get confused and launched functions completely unrelated to the current request, with made-up parameters. It is quite possible that this can be mitigated by explaining in much more detail what each tool does and when it should be called.

  • The presence of tools makes the model lazy. The first time it received the weather data, the LLM decided to call a function to format the JSON into readable text, a function that did not even exist, instead of doing its main job itself. This was fixed by writing a clearer task description for the model.

Unlike the OpenAI API and classic Function Calling in general, crewAI agents also take on the processing of the model's response. Whereas in other cases we get back JSON with function parameters that has to be processed manually one way or another, CrewAI pushes all of this under the hood and calls the function itself.
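For comparison, handling the classic flow manually looks roughly like this (an OpenAI-style sketch; `response` is assumed to come from a chat completion called with `tools`, `messages` is the running conversation, and `get_current_weather` is the plain Python function defined earlier):

import json

# Functions the application is willing to execute, keyed by the name the model uses.
available_functions = {"get_current_weather": get_current_weather}

message = response.choices[0].message
for tool_call in message.tool_calls or []:
    func = available_functions[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    result = func(**args)
    # Return the result to the model as a `tool` message so it can compose the final answer.
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": tool_call.function.name,
        "content": json.dumps(result),
    })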

Conclusion and pitfalls

Where would we be without pitfalls?

This approach has both pros and cons.

Pros: even with 8GB of RAM you can run a quantized Mistral of 3-5GB and enjoy calling functions.

Cons: well… let's look under the hood. This is what a POST request to the model in LM Studio looked like:

{
  "messages": [
    {
      "content": "You are Wheather Assistant. \nYour personal goal is: Answer Questions and reformat data as human-readable form. Only use tools to get weather info, formatting is your own task\nYou ONLY have access to the following tools, and should NEVER make up tools that are not listed here:\n\nTool Name: Weather API(*args: Any, **kwargs: Any) -> Any\nTool Description: Weather API(location: 'string') -      Function to get current weather in provided city      Args:         location (str): The target city.      Returns:         str: Weather info as JSON string if answer is valid or error message      \nTool Arguments: {'location': {'title': 'Location', 'type': 'string'}}\n\nUse the following format:\n\nThought: you should always think about what to do\nAction: the action to take, only one name of [Weather API], just the name, exactly as it's written.\nAction Input: the input to the action, just a simple python dictionary, enclosed in curly braces, using \" to wrap keys and values.\nObservation: the result of the action\n\nOnce all necessary information is gathered:\n\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nCurrent Task: What is the weather in Minsk now?\n\nThis is the expect criteria for your final answer: Weather description \n you MUST return the actual complete content as the final answer, not a summary.\n\nBegin! This is VERY important to you, use the tools available and give your best Final Answer, your job depends on it!\n\nThought:\n",
      "role": "user"
    }
  ],
  "model": "MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF",
  "logprobs": false,
  "n": 1,
  "stop": [
    "\nObservation"
  ],
  "stream": true,
  "temperature": 0.7
}

And this is what the request to the model looked like in the non-working variant of calling a function via the Mistral AI API:

{
  "messages": [
    {
      "role": "user",
      "content": "What's is the current weather in Minsk Belarus?"
    }
  ],
  "model": [
    "MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF"
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "format": {
              "type": "string",
              "enum": [
                "celsius",
                "fahrenheit"
              ],
              "description": "The temperature unit to use. Infer this from the users location."
            }
          },
          "required": [
            "location",
            "format"
          ]
        }
      }
    }
  ],
  "stream": false,
  "tool_choice": "any"
}

In the case of crewAI, the tools are added to the main prompt, while in the case of the Mistral AI API they are passed as a separate parameter of the request. We therefore assume (but cannot confirm) that the model handled the multi-function case poorly precisely because the prompt became overloaded. In general, stuffing all the information about each function directly into the prompt looks a bit like a crutch, and not even quite like real Function Calling, but despite such pitfalls everything works pretty well: function parameters are extracted correctly and the functions themselves get called. What else do you need?

We continue to look for better ways to implement function calls, agent creation, and other cool features on top of Open Source solutions, so we will be glad to hear any comments or suggestions. In the meantime, we recognize that in the match between Open Source and OpenAI the second player is still winning, but the first is no longer an outsider.

Thank you all for reading this article!
