Simple Composable Memory¶
Note: This memory example is deprecated. Prefer the newer and more flexible `Memory` class instead; see the latest documentation.
In this notebook we demonstrate how to inject multiple memory sources into an agent. Specifically, we use `SimpleComposableMemory`, which is composed of a `primary_memory` and potentially several secondary memory sources (`secondary_memory_sources`). The key difference is that `primary_memory` serves as the agent's main chat buffer, while any messages retrieved from `secondary_memory_sources` are injected only into the system prompt message.
Multiple memory sources are useful, for example, when you want a longer-term memory such as `VectorMemory` in addition to the default `ChatMemoryBuffer`. As you will see in this notebook, with `SimpleComposableMemory` you can effectively "load" the desired long-term memory messages into the primary memory (i.e., the `ChatMemoryBuffer`).
How does `SimpleComposableMemory` work?¶
We begin with the basic usage of `SimpleComposableMemory`. Here we construct a `VectorMemory` alongside a default `ChatMemoryBuffer`. The `VectorMemory` acts as our secondary memory source, while the `ChatMemoryBuffer` is the primary one. To instantiate a `SimpleComposableMemory` object, we supply a `primary_memory` and, optionally, a list of `secondary_memory_sources`.
from llama_index.core.memory import (
VectorMemory,
SimpleComposableMemory,
ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding
vector_memory = VectorMemory.from_defaults(
vector_store=None, # leave as None to use default in-memory vector store
embed_model=OpenAIEmbedding(),
retriever_kwargs={"similarity_top_k": 1},
)
# let's set some initial messages in our secondary vector memory
msgs = [
ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
ChatMessage.from_str("Bob likes burgers.", "user"),
ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)
chat_memory_buffer = ChatMemoryBuffer.from_defaults()
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=chat_memory_buffer,
secondary_memory_sources=[vector_memory],
)
composable_memory.primary_memory
ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))
composable_memory.secondary_memory_sources
[VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]
`put()` messages into memory¶
Since `SimpleComposableMemory` is itself a subclass of `BaseMemory`, we add messages to it the same way as for other memory modules. Note that for `SimpleComposableMemory`, calling `.put()` effectively calls `.put()` on all memory sources. In other words, the message is added to both the `primary` source and the `secondary` sources.
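This fan-out behavior can be sketched in plain Python. The classes below are simplified, illustrative stand-ins, not the actual LlamaIndex implementations:

```python
# Illustrative sketch (not the real LlamaIndex classes): a composable
# memory forwards .put() to its primary source AND every secondary source.
class ListMemory:
    """A stand-in memory source that just appends messages to a list."""

    def __init__(self):
        self.messages = []

    def put(self, message):
        self.messages.append(message)


class ComposableMemorySketch:
    def __init__(self, primary, secondary_sources):
        self.primary = primary
        self.secondary_sources = secondary_sources

    def put(self, message):
        # fan out: the message lands in the primary and all secondaries
        self.primary.put(message)
        for source in self.secondary_sources:
            source.put(message)


primary = ListMemory()
secondary = ListMemory()
memory = ComposableMemorySketch(primary, [secondary])
memory.put("Jerry likes juice.")
print(primary.messages)    # ['Jerry likes juice.']
print(secondary.messages)  # ['Jerry likes juice.']
```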
msgs = [
ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
ChatMessage.from_str("Jerry likes juice.", "user"),
]
# load into all memory source modules
for m in msgs:
composable_memory.put(m)
`get()` messages from memory¶
When `.get()` is called, we similarly execute the `.get()` methods of the `primary` memory as well as of all the `secondary` sources. This leaves us with a sequence of message lists that we must "compose" into a sensible, single set of messages (to pass downstream to our agent). Special care is required here to ensure that the final sequence of messages is both sensible and conforms to the chat APIs of the LLM provider.
For `SimpleComposableMemory`, we inject the messages retrieved from the `secondary` sources into the system message of the `primary` memory. The rest of the message history of the `primary` source is left intact, and this composition is what is ultimately returned.
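The injection step can be sketched in pure Python. This is a simplified model: the template text is patterned after the library's output, but the real implementation operates on `ChatMessage` objects and its formatting may differ:

```python
# Simplified sketch of composing retrieved secondary messages into the
# primary memory's system message. Messages are modeled as (role, text)
# tuples for illustration; the real library uses ChatMessage objects.
def compose(primary_msgs, secondary_msgs):
    retrieved = "\n".join(
        f"\t{role.upper()}: {text}" for role, text in secondary_msgs
    )
    block = (
        "Below are a set of relevant dialogues retrieved from "
        "potentially several memory sources:\n\n"
        "=====Relevant messages from memory source 1=====\n\n"
        f"{retrieved}\n\n"
        "=====End of relevant messages from memory source 1====="
    )
    system_role, system_text = primary_msgs[0]
    # augment the system message; leave the rest of the history intact
    merged = [(system_role, f"{system_text}\n\n{block}")]
    return merged + primary_msgs[1:]


primary = [
    ("system", "You are a REALLY helpful assistant."),
    ("user", "Jerry likes juice."),
]
secondary = [
    ("user", "Bob likes burgers."),
    ("assistant", "Indeed, Bob likes apples."),
]
result = compose(primary, secondary)
print(result[0][1])
```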
msgs = composable_memory.get("What does Bob like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
# see the memory injected into the system message of the primary memory
print(msgs[0])
system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Bob likes burgers.
	ASSISTANT: Indeed, Bob likes apples.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.
Successive calls to `get()`¶
Successive calls to `get()` simply replace the loaded `secondary` memory messages in the system prompt.
msgs = composable_memory.get("What does Alice like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
# see the memory injected into the system message of the primary memory
print(msgs[0])
system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Alice likes apples.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.
What if `get()` retrieves `secondary` messages that already exist in `primary` memory?¶
If a message retrieved from `secondary` memory already exists in the `primary` memory, then that redundant secondary message is not added to the system message. In the example below, the message "Jerry likes juice." was `put` into all memory sources, so the system message is left unchanged.
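The redundancy check can be sketched as a simple filter. This is an illustrative model only; the real library compares full `ChatMessage` objects, while plain strings stand in for messages here:

```python
# Sketch of the redundancy filter: secondary messages already present in
# the primary memory are dropped before being injected into the system
# message. Strings stand in for ChatMessage objects for illustration.
def filter_redundant(secondary_msgs, primary_msgs):
    primary_set = set(primary_msgs)
    return [m for m in secondary_msgs if m not in primary_set]


primary = ["You are a REALLY helpful assistant.", "Jerry likes juice."]
retrieved = ["Jerry likes juice."]
remaining = filter_redundant(retrieved, primary)
print(remaining)  # [] -> nothing to inject, system message unchanged
```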
msgs = composable_memory.get("What does Jerry like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
How to `reset` memory¶
Similar to the other methods `put()` and `get()`, calling `reset()` executes `reset()` on both the `primary` and the `secondary` memory sources. If you want to reset only the `primary`, then you should call its `reset()` method directly instead.
`reset()` only the primary memory¶
composable_memory.primary_memory.reset()
composable_memory.primary_memory.get()
[]
composable_memory.secondary_memory_sources[0].get("What does Alice like?")
[ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]
`reset()` all memory sources¶
composable_memory.reset()
composable_memory.primary_memory.get()
[]
composable_memory.secondary_memory_sources[0].get("What does Alice like?")
[]
Use `SimpleComposableMemory` with an agent¶
Here we use a `SimpleComposableMemory` with an agent and demonstrate how a secondary, long-term memory source can be used to carry the messages of one agent's conversation over into the conversation of another agent session.
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent
import nest_asyncio
nest_asyncio.apply()
Define our memory modules¶
vector_memory = VectorMemory.from_defaults(
vector_store=None, # leave as None to use default in-memory vector store
embed_model=OpenAIEmbedding(),
retriever_kwargs={"similarity_top_k": 2},
)
chat_memory_buffer = ChatMemoryBuffer.from_defaults()
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=chat_memory_buffer,
secondary_memory_sources=[vector_memory],
)
Define our agent¶
def multiply(a: int, b: int) -> int:
"""Multiply two integers and return the resulting integer"""
return a * b
def mystery(a: int, b: int) -> int:
"""Mystery function on two numbers"""
return a**2 - b**2
multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool],
llm=llm,
memory=composable_memory,
verbose=True,
)
Execute some function calls¶
When `.chat()` is called, the messages are put into the composable memory, which we know from the previous section means they are put into both the `primary` source and the `secondary` sources.
response = agent.chat("What is the mystery function on 5 and 6?")
Added user message to memory: What is the mystery function on 5 and 6?
=== Calling Function ===
Calling function: mystery with args: {"a": 5, "b": 6}
=== Function Output ===
-11
=== LLM Response ===
The mystery function on 5 and 6 returns -11.
response = agent.chat("What happens if you multiply 2 and 3?")
Added user message to memory: What happens if you multiply 2 and 3?
=== Calling Function ===
Calling function: multiply with args: {"a": 2, "b": 3}
=== Function Output ===
6
=== LLM Response ===
If you multiply 2 and 3, the result is 6.
New agent sessions¶
Now that we've added messages to our `vector_memory`, we can see the effect of supplying this memory to a new agent session versus not supplying it. Specifically, we ask the new agents to "recall" the outputs of the function calls rather than recompute them.
An agent without our past memory¶
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_without_memory = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool], llm=llm, verbose=True
)
response = agent_without_memory.chat(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.
An agent with our past memory¶
We see that the agent without access to our past memory is unable to complete the task. For this next agent, we do indeed pass in our previous long-term memory (i.e., `vector_memory`). Note that we even use a fresh `ChatMemoryBuffer`, meaning this agent has no `chat_history`. Despite this, it is still able to retrieve the required past conversation from our long-term memory.
llm = OpenAI(model="gpt-3.5-turbo-0613")
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=ChatMemoryBuffer.from_defaults(),
secondary_memory_sources=[
vector_memory.copy(
deep=True
) # using a copy here for illustration purposes
# later will use original vector_memory again
],
)
agent_with_memory = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool],
llm=llm,
memory=composable_memory,
verbose=True,
)
agent_with_memory.chat_history # an empty chat history
[]
response = agent_with_memory.chat(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
The output of the mystery function on 5 and 6 is -11.
response = agent_with_memory.chat(
"What was the output of the multiply function on 2 and 3 again? Don't recompute."
)
Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
=== LLM Response ===
The output of the multiply function on 2 and 3 is 6.
agent_with_memory.chat_history
[ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]
What happens under the hood with `.chat(user_input)`¶
Under the hood, a `.chat(user_input)` call effectively invokes the memory's `.get()` method with `user_input` as the argument. As we learned in the previous section, this ultimately returns the composition of the `primary` memory and all the `secondary` memory sources. These composed messages are what gets passed to the LLM's chat API as the chat history.
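This `.chat()` loop can be sketched in plain Python. The sketch below is a simplified mental model, not the agent's actual implementation, and it mocks the LLM with a lambda:

```python
# Simplified model of an agent's .chat(user_input) over a memory:
# 1) put the user message, 2) get() the composed history,
# 3) call the LLM on it, 4) put the assistant reply back.
class SketchMemory:
    def __init__(self):
        self.history = []

    def put(self, msg):
        self.history.append(msg)

    def get(self, query):
        # the real SimpleComposableMemory would additionally inject
        # retrieved secondary messages into the system message here
        return list(self.history)


def chat(memory, user_input, llm):
    memory.put(("user", user_input))
    history = memory.get(user_input)   # composed messages sent to the LLM
    reply = llm(history)
    memory.put(("assistant", reply))
    return reply


mock_llm = lambda history: f"(reply to {len(history)} messages)"
mem = SketchMemory()
reply = chat(mem, "What is 2 * 3?", mock_llm)
print(reply)  # (reply to 1 messages)
```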
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=ChatMemoryBuffer.from_defaults(),
secondary_memory_sources=[
vector_memory.copy(
deep=True
) # copy for illustrative purposes to explain what
# happened under the hood from previous subsection
],
)
agent_with_memory = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool],
    llm=llm,
    memory=composable_memory,
)
agent_with_memory.memory.get(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: What is the mystery function on 5 and 6?\n\tASSISTANT: None\n\tTOOL: -11\n\tASSISTANT: The mystery function on 5 and 6 returns -11.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={})]
print(
agent_with_memory.memory.get(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)[0]
)
system: You are a helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: What is the mystery function on 5 and 6?
	ASSISTANT: None
	TOOL: -11
	ASSISTANT: The mystery function on 5 and 6 returns -11.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.