Tutorial: Building a Chat Agent with Function Calling

_{Last Updated:
January 30, 2025}

Level: Advanced
Time to complete: 20 minutes
Components Used: InMemoryDocumentStore, SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder, InMemoryEmbeddingRetriever, ChatPromptBuilder, OpenAIChatGenerator, ToolInvoker
Prerequisites: You must have an OpenAI API Key and be familiar with creating pipelines
Goal: After completing this tutorial, you will have learned how to build chat applications that demonstrate agent-like behavior using OpenAI’s function calling feature.

Overview

📚 Useful Sources:

OpenAI’s function calling connects large language models to external tools. By providing a tools list with functions and their specifications to the OpenAI API calls, you can easily build chat assistants that can answer questions by calling external APIs or extract structured information from text.

In this tutorial, you’ll learn how to convert your Haystack pipeline into a function-calling tool and how to implement applications using OpenAI’s Chat Completion API through OpenAIChatGenerator for agent-like behavior.

Setting up the Development Environment

Install Haystack and sentence-transformers using pip:

%%bash

pip install haystack-ai
pip install "sentence-transformers>=3.0.0"

Enable Telemetry

Knowing you’re using this tutorial helps us decide where to invest our efforts to build a better product but you can always opt out by commenting the following line. See Telemetry for more details.

from haystack.telemetry import tutorial_running

tutorial_running(40)

Save your OpenAI API key as an environment variable:

import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key:")

Learning about the OpenAIChatGenerator

OpenAIChatGenerator is a component that supports the function calling feature of OpenAI through Chat Completion API. In contrary to OpenAIGenerator, the way to communicate with OpenAIChatGenerator is through ChatMessage list. Read more about the difference between them in Generators vs Chat Generators.

To start working with the OpenAIChatGenerator, create a ChatMessage object with “SYSTEM” role using ChatMessage.from_system() and another ChatMessage with “USER” role using ChatMessage.from_user(). Then, pass this messages list to OpenAIChatGenerator and run:

from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator

messages = [
    ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
    ChatMessage.from_user("What's Natural Language Processing? Be brief."),
]

chat_generator = OpenAIChatGenerator(model="gpt-4o-mini")
chat_generator.run(messages=messages)

Basic Streaming

OpenAIChatGenerator supports streaming, provide a streaming_callback function and run the chat_generator again to see the difference.

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.utils import print_streaming_chunk

chat_generator = OpenAIChatGenerator(model="gpt-4o-mini", streaming_callback=print_streaming_chunk)
response = chat_generator.run(messages=messages)

Creating a Function Calling Tool from a Haystack Pipeline

To use the function calling of OpenAI, you need to introduce tools to your OpenAIChatGenerator.

For this example, you’ll use a Haystack RAG pipeline as one of your tools. Therefore, you need to index documents to a document store and then build a RAG pipeline on top of it.

Index Documents with a Pipeline

Create a pipeline to store the small example dataset in the InMemoryDocumentStore with their embeddings. You will use SentenceTransformersDocumentEmbedder to generate embeddings for your Documents and write them to the document store with the DocumentWriter.

After adding these components to your pipeline, connect them and run the pipeline.

If you’d like to learn about preprocessing files before you index them to your document store, follow the Preprocessing Different File Types tutorial.

from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

documents = [
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome."),
    Document(content="My name is Marta and I live in Madrid."),
    Document(content="My name is Harry and I live in London."),
]

document_store = InMemoryDocumentStore()

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(
    instance=SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"), name="doc_embedder"
)
indexing_pipeline.add_component(instance=DocumentWriter(document_store=document_store), name="doc_writer")

indexing_pipeline.connect("doc_embedder.documents", "doc_writer.documents")

indexing_pipeline.run({"doc_embedder": {"documents": documents}})

Build a RAG Pipeline

Build a basic retrieval augmented generative pipeline with SentenceTransformersTextEmbedder, InMemoryEmbeddingRetriever, ChatPromptBuilder and OpenAIChatGenerator.

For a step-by-step guide to create a RAG pipeline with Haystack, follow the Creating Your First QA Pipeline with Retrieval-Augmentation tutorial.

from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator

template = [
    ChatMessage.from_system(
        """
Answer the questions based on the given context.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}
Question: {{ question }}
Answer:
"""
    )
]
rag_pipe = Pipeline()
rag_pipe.add_component("embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"))
rag_pipe.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
rag_pipe.add_component("prompt_builder", ChatPromptBuilder(template=template))
rag_pipe.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))

rag_pipe.connect("embedder.embedding", "retriever.query_embedding")
rag_pipe.connect("retriever", "prompt_builder.documents")
rag_pipe.connect("prompt_builder.prompt", "llm.messages")

Run the Pipeline

Test this pipeline with a query and see if it works as expected before you start using it as a function calling tool.

query = "Where does Mark live?"
rag_pipe.run({"embedder": {"text": query}, "prompt_builder": {"question": query}})

Convert the Haystack Pipeline into a Tool

Wrap the rag_pipe.run call inside a function called rag_pipeline_func. This function should take a query as input and return the response generated by the LLM in the RAG pipeline you previously built. Next, initialize a Tool by following the steps in Tool Initialization; define parameters, set a name and description for the tool.

The Tool abstraction in Haystack enables seamless function calling/tool calling with multiple ChatGenerators. As of Haystack 2.9, this includes support for AnthropicChatGenerator, OllamaChatGenerator, and OpenAIChatGenerator, with more integrations expected in the future.

from haystack.tools import Tool


def rag_pipeline_func(query: str):
    result = rag_pipe.run({"embedder": {"text": query}, "prompt_builder": {"question": query}})
    return {"reply": result["llm"]["replies"][0].text}


parameters = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "The query to use in the search. Infer this from the user's message. It should be a question or a statement",
        }
    },
    "required": ["query"],
}

rag_pipeline_tool = Tool(
    name="rag_pipeline_tool",
    description="Get information about where people live",
    parameters=parameters,
    function=rag_pipeline_func,
)

Creating a Tool from Function

In addition to the rag_pipeline_tool, create a new tool called get_weather_tool to be used to get weather information of cities.

First, create a function that simulates an API call to an external weather service. Instead of passing parameters as JSON (like in the previous tool), use create_tool_from_function. This function requires additional details using the Annotated type to describe tool parameters. However, based on this information, create_tool_from_function can automatically infer the parameters and generate a JSON schema, so you don’t need to define parameters separately.

from typing import Annotated, Literal
from haystack.tools import create_tool_from_function

WEATHER_INFO = {
    "Berlin": {"weather": "mostly sunny", "temperature": 7, "unit": "celsius"},
    "Paris": {"weather": "mostly cloudy", "temperature": 8, "unit": "celsius"},
    "Rome": {"weather": "sunny", "temperature": 14, "unit": "celsius"},
    "Madrid": {"weather": "sunny", "temperature": 10, "unit": "celsius"},
    "London": {"weather": "cloudy", "temperature": 9, "unit": "celsius"},
}


def get_weather(
    city: Annotated[str, "the city for which to get the weather"] = "Berlin",
    unit: Annotated[Literal["Celsius", "Fahrenheit"], "the unit for the temperature"] = "Celsius",
):
    """A simple function to get the current weather for a location."""
    if city in WEATHER_INFO:
        return WEATHER_INFO[city]
    else:
        return {"weather": "sunny", "temperature": 21.8, "unit": "fahrenheit"}


weather_tool = create_tool_from_function(get_weather)

Running OpenAIChatGenerator with Tools

To use the tool calling feature, you need to pass the list of tools to OpenAIChatGenerator as tools.

Instruct the model to use provided tools with a system message and then provide a query that requires a tool call as a user message:

from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.generators.utils import print_streaming_chunk

user_messages = [
    ChatMessage.from_system(
        "Use the tool that you're provided with. Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    ),
    ChatMessage.from_user("Can you tell me where Mark lives?"),
]

chat_generator = OpenAIChatGenerator(model="gpt-4o-mini", streaming_callback=print_streaming_chunk)
response = chat_generator.run(messages=user_messages, tools=[rag_pipeline_tool, weather_tool])

As a response, you’ll get a ChatMessage with information about the tool name and arguments as a ToolCall object:

{'replies': [
    ChatMessage(
        _role=<ChatRole.ASSISTANT: 'assistant'>,
        _content=[TextContent(text=''), ToolCall(tool_name='rag_pipeline_tool', arguments={'query': 'Where does Mark live?'}, id='call_xHEPMFrkHEKsi7tFB8FItkFC')],
        _name=None,
        meta={'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'usage': {}}
        )
    ]
}

Next, use ToolInvoker to process ChatMessage object containing tool calls, invoke the corresponding tools and return the results as a list of ChatMessage.

from haystack.components.tools import ToolInvoker

tool_invoker = ToolInvoker(tools=[rag_pipeline_tool, weather_tool])

if response["replies"][0].tool_calls:
    tool_result_messages = tool_invoker.run(messages=response["replies"])["tool_messages"]
    print(f"tool result messages: {tool_result_messages}")

As the last step, run the OpenAIChatGenerator again with the tool call results and get the final answer.

# Pass all the messages to the ChatGenerator with the correct order
messages = user_messages + response["replies"] + tool_result_messages
final_replies = chat_generator.run(messages=messages, tools=[rag_pipeline_tool, weather_tool])["replies"]

Building the Chat Agent

As you notice above, OpenAI Chat Completions API does not call the tool; instead, the model generates JSON that you can use to call the tool in your code. That’s why, to build an end-to-end chat agent, you need to check if the OpenAI response is a tool_calls for every message. If so, you need to call the corresponding tool with the provided arguments and send the tool response back to OpenAI. Otherwise, append both user and messages to the messages list to have a regular conversation with the model. Let’s build an application that handles all cases.

To build a nice UI for your application, you can use Gradio that comes with a chat interface. Install gradio, run the code cell below and use the input box to interact with the chat application that has access to two tools you’ve created above.

Example queries you can try:

“What is the capital of Sweden?”: A basic query without any tool calls
“Can you tell me where Giorgio lives?”: A basic query with one tool call
“What’s the weather like in Berlin?”, “Is it sunny there?”: To see the messages are being recorded and sent
“What’s the weather like where Jean lives?”: To force two tool calls
“What’s the weather like today?”: To force OpenAI to ask more clarification

Keep in mind that OpenAI models can sometimes hallucinate answers or tools and might not work as expected.

%%bash

pip install -U gradio

import gradio as gr
import json

from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.tools import ToolInvoker

tool_invoker = ToolInvoker(tools=[rag_pipeline_tool, weather_tool])
chat_generator = OpenAIChatGenerator(model="gpt-4o-mini", tools=[rag_pipeline_tool, weather_tool])
response = None
messages = [
    ChatMessage.from_system(
        "Use the tool that you're provided with. Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous."
    )
]


def chatbot_with_tc(message, history):
    global messages
    messages.append(ChatMessage.from_user(message))
    response = chat_generator.run(messages=messages)

    while True:
        # if OpenAI response is a function call
        if response and response["replies"][0].tool_calls:
            tool_result_messages = tool_invoker.run(messages=response["replies"])["tool_messages"]
            messages = messages + response["replies"] + tool_result_messages
            response = chat_generator.run(messages=messages)

        # Regular Conversation
        else:
            messages.append(response["replies"][0])
            break
    return response["replies"][0].text


demo = gr.ChatInterface(
    fn=chatbot_with_tc,
    type="messages",
    examples=[
        "Can you tell me where Giorgio lives?",
        "What's the weather like in Madrid?",
        "Who lives in London?",
        "What's the weather like where Mark lives?",
    ],
    title="Ask me about weather or where people live!",
    theme=gr.themes.Ocean(),
)

## Uncomment the line below to launch the chat app with UI
# demo.launch()

What’s next

🎉 Congratulations! You’ve learned how to build chat applications that demonstrate agent-like behavior using OpenAI function calling and Haystack Pipelines.

If you liked this tutorial, there’s more to learn about Haystack:

To stay up to date on the latest Haystack developments, you can sign up for our newsletter or join Haystack discord community.

Thanks for reading!

Query Classification with TransformersTextRouter and TransformersZeroShotTextRouter

Evaluation of a QA System