Tutorial: Customizing Agent to Chat with Your Documents
Last Updated: January 15, 2025
This tutorial is based on Haystack 1.x (farm-haystack). If you’re using Haystack 2.x (haystack-ai), refer to the Haystack 2.x tutorials or Haystack Cookbook. For more information on Haystack 2.x, read the Haystack 2.0 announcement.
- Level: Advanced
- Time to complete: 20 minutes
- Nodes Used: BM25Retriever, PromptNode, Agent, and Memory
- Goal: After completing this tutorial, you will have learned how to customize an Agent to create a chat system for your documents.
- Prerequisites: An OpenAI API Key
Overview
The Agent class is designed to use a large language model (LLM) to make decisions and determine the best course of action to find the most suitable answers. It offers great flexibility with the ability to provide multiple Tools, Memory, and a custom prompt with a parameter resolver. This means you have full control over the agent’s prompt, allowing you to customize it according to your specific use case, whether it’s a personal assistant, a chatbot, or a multihop search system.
In this tutorial, you will learn how to set up the Agent with one tool and a summarized memory, as well as how to customize the prompt and resolve parameters within it. By the end of the tutorial, you will have a system ready to engage in conversations using your own documents.
Preparing the Colab Environment
Installing Haystack
To start, let’s install the latest release of Haystack with pip:
%%bash
pip install --upgrade pip
pip install farm-haystack[colab]
pip install "datasets>=2.6.1"
Enabling Telemetry
Knowing you’re using this tutorial helps us decide where to invest our efforts to build a better product, but you can always opt out by commenting out the following line. See Telemetry for more details.
from haystack.telemetry import tutorial_running
tutorial_running(25)
Creating a Generative QA Pipeline Tool
1) Initialize the DocumentStore and Write Documents
You’ll start creating a generative pipeline by initializing a DocumentStore, which will store the Documents to be chatted with.
As Documents, you will use the Wikipedia pages of the Seven Wonders of the Ancient World. These pages have been crawled, preprocessed, and uploaded to a Hugging Face Space: Seven Wonders.
To learn about the preprocessing step, check out Tutorial: Preprocessing Your Documents.
Initialize InMemoryDocumentStore and write Documents to the DocumentStore:
from datasets import load_dataset
from haystack.document_stores import InMemoryDocumentStore
dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents(dataset)
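The use_bm25=True flag prepares the store for keyword-based BM25 ranking, which the Retriever relies on later. As a rough illustration of the idea, here is a minimal plain-Python sketch of a textbook Okapi BM25 scorer. It does not use Haystack; the bm25_rank function, the toy documents, the whitespace tokenizer, and the k1 and b values are all assumptions made for this example, and Haystack's own tokenization and IDF variant may differ.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted from most to least relevant (toy BM25)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # A common non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


# Hypothetical toy documents, not the real dataset.
toy_docs = [
    "The Colossus of Rhodes was a statue of the Greek sun god Helios",
    "The Great Pyramid of Giza is the oldest of the seven wonders",
    "The Lighthouse of Alexandria guided ships into the harbour",
]
ranking = bm25_rank("statue of Helios", toy_docs)
print(ranking)
```

Rare query terms such as "statue" and "helios" contribute more to the score than "of", which appears in every toy document.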
2) Provide an OpenAI API Key
You need an API key to use OpenAI models for the generative pipeline and the Agent.
import os
from getpass import getpass
openai_api_key = os.getenv("OPENAI_API_KEY", None) or getpass("Enter OpenAI API key:")
3) Create a Generative QA Pipeline
A generative QA pipeline consists of a PromptNode and a Retriever. In this pipeline, the Retriever fetches the relevant Documents, and the PromptNode generates the answer using the retrieval-augmented generation (RAG) approach.
To learn about the details of a generative pipeline with RAG, check out Tutorial: Creating a Generative QA Pipeline with Retrieval-Augmentation.
from haystack.nodes import PromptNode, PromptTemplate, AnswerParser, BM25Retriever
from haystack.pipelines import Pipeline
retriever = BM25Retriever(document_store=document_store, top_k=3)
prompt_template = PromptTemplate(
    prompt="""
Answer the question truthfully based solely on the given documents. If the documents do not contain the answer to the question, say that answering is not possible given the available information. Your answer should be no longer than 50 words.
Documents:{join(documents)}
Question:{query}
Answer:
""",
    output_parser=AnswerParser(),
)
prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo-instruct", api_key=openai_api_key, default_prompt_template=prompt_template
)
generative_pipeline = Pipeline()
generative_pipeline.add_node(component=retriever, name="retriever", inputs=["Query"])
generative_pipeline.add_node(component=prompt_node, name="prompt_node", inputs=["retriever"])
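To make the data flow of this pipeline more concrete, the following plain-Python sketch mimics what happens between the Retriever and the PromptNode: the top-ranked documents are joined and substituted into the prompt together with the query. The build_rag_prompt function and the sample documents are hypothetical, and joining with a single space is an assumption about how {join(documents)} behaves by default.

```python
def build_rag_prompt(query, ranked_documents, top_k=3):
    """Fill a RAG-style prompt with the top_k retrieved documents (toy version)."""
    # Assumption: documents are joined with a single space.
    context = " ".join(ranked_documents[:top_k])
    return (
        "Answer the question truthfully based solely on the given documents.\n"
        f"Documents:{context}\n"
        f"Question:{query}\n"
        "Answer:"
    )


# Hypothetical retrieval result, already sorted by relevance.
ranked_docs = [
    "Document about the Colossus of Rhodes.",
    "Document about the Statue of Zeus at Olympia.",
    "Document about the Lighthouse of Alexandria.",
    "Document about the Hanging Gardens of Babylon.",
]
prompt = build_rag_prompt("What does Rhodes Statue look like?", ranked_docs)
print(prompt)
```

Only the first three documents reach the prompt here, mirroring top_k=3 on the Retriever. The LLM never sees the fourth one, which is why it can only answer from what was retrieved.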
Test out your pipeline:
from haystack.utils import print_answers
response = generative_pipeline.run("What does Rhodes Statue look like?")
print_answers(response, details="minimum")
To make sure that the generative pipeline uses only the provided Documents, test it out with an out-of-context query:
response = generative_pipeline.run("What does Taylor Swift look like?")
print_answers(response, details="minimum")
4) Define a Tool with the Generative QA Pipeline
Later, you will provide this Tool to your Agent.
from haystack.agents import Tool
search_tool = Tool(
    name="seven_wonders_search",
    pipeline_or_node=generative_pipeline,
    description="useful for when you need to answer questions about the seven wonders of the world",
    output_variable="answers",
)
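Conceptually, a Tool pairs a name and a description (which the LLM reads) with something that can be run (which the Agent calls). The sketch below illustrates that idea in plain Python. SimpleTool and SimpleToolsManager are hypothetical stand-ins written for this example, not Haystack classes, and the "name: description" rendering is an assumption about how tool listings typically appear in the prompt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SimpleTool:
    name: str
    description: str
    run: Callable[[str], str]  # takes the tool input, returns an observation


class SimpleToolsManager:
    def __init__(self, tools: List[SimpleTool]):
        self.tools: Dict[str, SimpleTool] = {tool.name: tool for tool in tools}

    def names_with_descriptions(self) -> str:
        # One line per tool, so the LLM can decide which one to call.
        return "\n".join(f"{t.name}: {t.description}" for t in self.tools.values())

    def run_tool(self, name: str, tool_input: str) -> str:
        if name not in self.tools:
            return f"Unknown tool: {name}"
        return self.tools[name].run(tool_input)


# A stand-in for the generative pipeline: it only echoes the question.
toy_search = SimpleTool(
    name="seven_wonders_search",
    description="useful for when you need to answer questions about the seven wonders of the world",
    run=lambda question: f"(pipeline answer for: {question})",
)
manager = SimpleToolsManager([toy_search])
print(manager.names_with_descriptions())
print(manager.run_tool("seven_wonders_search", "How tall was it?"))
```

This is why a precise description matters: it is the only information the LLM has when it decides whether a tool fits the question.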
Now that you have a working pipeline as a Tool, time to initialize an Agent!
Creating an Agent
If you are not familiar with the Agent class, you can go through Tutorial: Answering Multihop Questions with Agents before you continue.
1) Initialize PromptNode for the Agent
For your Agent to be most efficient, you need an LLM that can perform dynamic reasoning with ReAct prompts, which can be achieved by configuring the stop_words and temperature. Once set, the PromptNode will be ready to use.
from haystack.nodes import PromptNode
agent_prompt_node = PromptNode(
    "gpt-3.5-turbo",
    api_key=openai_api_key,
    max_length=256,
    stop_words=["Observation:"],
    model_kwargs={"temperature": 0.5},
)
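The stop word "Observation:" matters because in a ReAct loop the observation must come from the tool, not from the LLM. The following plain-Python sketch shows the effect of a stop word on a generated text. The truncate_at_stop_words helper and the sample text are hypothetical; with a hosted model the cut usually happens on the provider's side rather than in your code.

```python
def truncate_at_stop_words(text, stop_words):
    """Cut the text at the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        index = text.find(word)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()


# Hypothetical raw generation: the last line was invented by the model
# instead of being returned by a tool.
raw_generation = (
    "Thought: I should look this up.\n"
    "Tool: seven_wonders_search\n"
    "Tool Input: What did the Rhodes Statue look like?\n"
    "Observation: (text the model made up on its own)"
)
trimmed = truncate_at_stop_words(raw_generation, ["Observation:"])
print(trimmed)
```

After truncation, the Agent runs the selected tool and appends the real observation itself.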
2) Initialize Memory
Memory makes the interaction with the Agent more human-like. Haystack offers various memory options, one of which, ConversationSummaryMemory, creates a summary of every three exchanges. You will be using the philschmid/bart-large-cnn-samsum model, which is especially good at generating chat summaries.
To discover other memory options in Haystack, check out the blog post Good Listener: How Memory Enables Conversational Agents.
from haystack.agents.memory import ConversationSummaryMemory
from haystack.nodes import PromptNode
memory_prompt_node = PromptNode(
    "philschmid/bart-large-cnn-samsum", max_length=256, model_kwargs={"task_name": "text2text-generation"}
)
memory = ConversationSummaryMemory(memory_prompt_node, prompt_template="{chat_transcript}")
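To give an intuition for how a summarizing memory behaves, here is a toy plain-Python version that condenses the conversation every three exchanges. ToySummaryMemory and fake_summarizer are hypothetical names for this sketch; the real ConversationSummaryMemory calls the summarization model instead and may differ in its details.

```python
class ToySummaryMemory:
    """Keeps a running summary plus the exchanges not yet summarized."""

    def __init__(self, summarize, summary_frequency=3):
        self.summarize = summarize  # callable: transcript -> summary text
        self.summary_frequency = summary_frequency
        self.summary = ""
        self.pending = []

    def save(self, human, ai):
        self.pending.append(f"Human: {human}\nAI: {ai}")
        if len(self.pending) == self.summary_frequency:
            transcript = "\n".join(self.pending)
            # Fold the new summary into the running one and clear the buffer.
            self.summary = (self.summary + " " + self.summarize(transcript)).strip()
            self.pending = []

    def load(self):
        # What would be injected into the {memory} slot of the prompt.
        return "\n".join([self.summary] + self.pending).strip()


def fake_summarizer(transcript):
    # Stand-in for a summarization model: it only counts the exchanges.
    return f"[summary of {transcript.count('Human:')} exchanges]"


toy_memory = ToySummaryMemory(summarize=fake_summarizer, summary_frequency=3)
for number in range(1, 5):
    toy_memory.save(f"question {number}", f"answer {number}")
print(toy_memory.load())
```

After four exchanges, the first three are represented by a single summary line and only the fourth is kept verbatim, which keeps the prompt short as the conversation grows.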
3) Define the Prompt
To have an Agent with chat ability, you need to use a prompt similar to conversational-agent. This prompt is especially useful when you want to inject the memory into the prompt.
agent_prompt = """
In the following conversation, a human user interacts with an AI Agent. The human user poses questions, and the AI Agent goes through several steps to provide well-informed answers.
The AI Agent must use the available tools to find the up-to-date information. The final answer to the question should be truthfully based solely on the output of the tools. The AI Agent should ignore its knowledge when answering the questions.
The AI Agent has access to these tools:
{tool_names_with_descriptions}
The following is the previous conversation between a human and The AI Agent:
{memory}
AI Agent responses must start with one of the following:
Thought: [the AI Agent's reasoning process]
Tool: [tool names] (on a new line) Tool Input: [input as a question for the selected tool WITHOUT quotation marks and on a new line] (These must always be provided together and on separate lines.)
Observation: [tool's result]
Final Answer: [final answer to the human user's question]
When selecting a tool, the AI Agent must provide both the "Tool:" and "Tool Input:" pair in the same response, but on separate lines.
The AI Agent should not ask the human user for additional information, clarification, or context.
If the AI Agent cannot find a specific answer after exhausting available tools and approaches, it answers with Final Answer: inconclusive
Question: {query}
Thought:
{transcript}
"""
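The prompt above asks the LLM to answer in a fixed format ("Tool:" followed by "Tool Input:", or "Final Answer:") so that the response can be parsed mechanically. The sketch below shows one way such a response could be interpreted with regular expressions. The patterns and the parse_agent_response function are assumptions made for illustration and are not Haystack's own parsing logic.

```python
import re

# "Tool:" and "Tool Input:" must appear on consecutive lines, as the prompt demands.
TOOL_PATTERN = re.compile(r"Tool:\s*(\w+)\s*\n\s*Tool Input:\s*(.+)")
FINAL_PATTERN = re.compile(r"Final Answer\s*:\s*(.*)", re.DOTALL)


def parse_agent_response(text):
    """Classify a response as a final answer, a tool call, or unparsed text."""
    final = FINAL_PATTERN.search(text)
    if final:
        return ("final_answer", final.group(1).strip())
    tool = TOOL_PATTERN.search(text)
    if tool:
        return ("tool", tool.group(1), tool.group(2).strip())
    return ("unparsed", text.strip())


tool_call = parse_agent_response(
    "Thought: I need to look this up.\n"
    "Tool: seven_wonders_search\n"
    "Tool Input: When did the Colossus of Rhodes collapse?"
)
final = parse_agent_response("Thought: The tools did not help.\nFinal Answer: inconclusive")
print(tool_call)
print(final)
```

A response that matches neither pattern ends up as "unparsed", which is the situation the strict formatting instructions in the prompt are meant to prevent.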
4) Initialize Prompt Parameter Resolver
Keep in mind that for common use cases of Agent such as simple chatbots or ReAct-based Agents, there are pre-defined resolver functions in Haystack that you can import and start using without defining one from scratch. Here is the full list of all resolver functions. Below is an example of how you may customize it yourself.
prompt_parameters_resolver is a callback function that returns a dictionary of parameters which will resolve the variables in the prompt_template. This function customizes the Agent’s behaviour by taking inputs from internal Agent classes and allowing you to preprocess them before returning prompt parameters.
Currently, resolver functions can access the query from the user, AgentStep, and Agent. The Agent class is required to get all memory-related attributes, or when you need tool information in the prompt, which is essential for ReAct-based Agents to determine their next action. The AgentStep class is useful when you need the transcript of an Agent’s internal monologue for each iteration, or when you need to know which iteration the Agent is currently at.
These are the parameters you need to define in the resolver function for the prompt above:
- query: User input
- tool_names_with_descriptions: Names of all tools and their descriptions
- transcript: Transcript of the ReAct-based Agent’s iterative process that generates thought, action, and observation
- memory: History of the previous conversation
Now, initialize the resolver function with these parameters using the Agent, AgentStep and query:
from haystack.agents import AgentStep, Agent


def resolver_function(query, agent, agent_step):
    # Map each placeholder in agent_prompt to its current value.
    return {
        "query": query,
        "tool_names_with_descriptions": agent.tm.get_tool_names_with_descriptions(),
        "transcript": agent_step.transcript,
        "memory": agent.memory.load(),
    }
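To see what a resolver like this produces without calling any model, you can feed it lightweight stand-ins for the Agent and AgentStep objects. In the sketch below, fake_agent, fake_step, and the shortened template are hypothetical and only expose the attributes the resolver reads. Substituting with str.format is an assumption made for illustration, since Haystack fills the prompt through its own template machinery.

```python
from types import SimpleNamespace


def resolver_function(query, agent, agent_step):
    # Same shape as the resolver in this tutorial, repeated so the sketch runs alone.
    return {
        "query": query,
        "tool_names_with_descriptions": agent.tm.get_tool_names_with_descriptions(),
        "transcript": agent_step.transcript,
        "memory": agent.memory.load(),
    }


# Stand-ins that mimic only the attributes used above (not real Haystack objects).
fake_agent = SimpleNamespace(
    tm=SimpleNamespace(
        get_tool_names_with_descriptions=lambda: "seven_wonders_search: answers questions about the seven wonders"
    ),
    memory=SimpleNamespace(load=lambda: "Human: What did Rhodes Statue look like?"),
)
fake_step = SimpleNamespace(transcript="")

# A shortened version of the prompt with the same four placeholders.
short_template = (
    "Tools:\n{tool_names_with_descriptions}\n"
    "Previous conversation:\n{memory}\n"
    "Question: {query}\n"
    "Thought:\n{transcript}"
)
params = resolver_function("When did it collapse?", fake_agent, fake_step)
filled_prompt = short_template.format(**params)
print(filled_prompt)
```

Every placeholder in the template needs a matching key in the returned dictionary. A missing key would leave a variable unresolved, or raise a KeyError in this simplified version.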
5) Create the Agent
Bring together all the components and create the Agent:
from haystack.agents.base import Agent, ToolsManager
conversational_agent = Agent(
    agent_prompt_node,
    prompt_template=agent_prompt,
    prompt_parameters_resolver=resolver_function,
    memory=memory,
    tools_manager=ToolsManager([search_tool]),
)
6) Start Chatting!
conversational_agent.run("What did Rhodes Statue look like?")
conversational_agent.run("When did it collapse?")
conversational_agent.run("How tall was it?")
conversational_agent.run("How long did it stand?")
Congratulations! You’ve customized an Agent to chat with your documents!