Agent Workflow + Research Assistant using AgentQL
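In this tutorial, we will use an AgentWorkflow to build a research assistant OpenAI agent with several tools, including AgentQL's browser tools, Playwright's tools, and the DuckDuckGoSearch tool. The agent performs a web search to find resources related to a research topic, interacts with those resources, and extracts key metadata from them (e.g., title, author, publication details, and abstract).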
%pip install llama-index
%pip install llama-index-tools-agentql
%pip install llama-index-tools-playwright
%pip install llama-index-tools-duckduckgo
!playwright install
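
Before moving on, it can help to confirm that the installs above actually succeeded in the current kernel. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and the distribution names are simply the ones passed to pip above.

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed in this environment.

    Hypothetical helper for illustration; not part of LlamaIndex or AgentQL.
    """
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises PackageNotFoundError if absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


required = [
    "llama-index",
    "llama-index-tools-agentql",
    "llama-index-tools-playwright",
    "llama-index-tools-duckduckgo",
]
print("Missing packages:", missing_packages(required))
```

An empty list means every package was found. This check does not cover the browser binaries downloaded by playwright install.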
Store your OPENAI_API_KEY and AGENTQL_API_KEY keys in Google Colab's secrets.
import os
from google.colab import userdata
os.environ["AGENTQL_API_KEY"] = userdata.get("AGENTQL_API_KEY")
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
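
The cell above only works inside Colab, because google.colab is not importable elsewhere. As a minimal sketch for other environments, assuming the keys may already be set as environment variables, the helper below reads a key from os.environ and prompts for it only when it is absent. The name ensure_api_key is hypothetical and not part of any library used in this tutorial.

```python
import os
from getpass import getpass


def ensure_api_key(name: str) -> str:
    """Return the key stored in os.environ[name], prompting for it if absent.

    Hypothetical helper for illustration; not part of LlamaIndex or AgentQL.
    """
    value = os.environ.get(name)
    if not value:
        # getpass hides the typed value so the key does not appear in cell output.
        value = getpass(f"Enter {name}: ")
        os.environ[name] = value
    return value


# Example usage (uncomment to run interactively):
# ensure_api_key("AGENTQL_API_KEY")
# ensure_api_key("OPENAI_API_KEY")
```

Prompting instead of hard-coding keeps the secrets out of the notebook file itself.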
Let's start by enabling async for the notebook, since an online environment like Google Colab only supports the asynchronous version of AgentQL.
import nest_asyncio
nest_asyncio.apply()
Create an async_browser instance and select the Playwright tools you want to use.
from llama_index.tools.playwright.base import PlaywrightToolSpec

# Launch a headless browser that the Playwright and AgentQL tools will share.
async_browser = await PlaywrightToolSpec.create_async_playwright_browser(
    headless=True
)

playwright_tool = PlaywrightToolSpec(async_browser=async_browser)
playwright_tool_list = playwright_tool.to_tool_list()

# Keep only the tools the agent needs for navigation.
playwright_agent_tool_list = [
    tool
    for tool in playwright_tool_list
    if tool.metadata.name in ["click", "get_current_page", "navigate_to"]
]
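
One caveat with filtering by name: a misspelled or renamed tool is silently dropped and the agent simply ends up with fewer tools. The sketch below shows one way to make that failure visible. It assumes only that each tool exposes metadata.name, as in the cell above. The names select_tools, FakeTool, ToolMetadata, and some_other_tool are hypothetical stand-ins so the example runs without a browser.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ToolMetadata:
    name: str


@dataclass
class FakeTool:
    """Stand-in exposing the same .metadata.name attribute used above."""

    metadata: ToolMetadata


def select_tools(tools: Iterable, wanted: Iterable[str]) -> List:
    """Keep tools whose metadata.name is in `wanted`; raise on unknown names."""
    tools = list(tools)
    wanted = list(wanted)
    available = {tool.metadata.name for tool in tools}
    missing = [name for name in wanted if name not in available]
    if missing:
        raise ValueError(
            f"Unknown tool name(s): {missing}; available: {sorted(available)}"
        )
    wanted_set = set(wanted)
    return [tool for tool in tools if tool.metadata.name in wanted_set]


demo_tools = [
    FakeTool(ToolMetadata(name))
    for name in ["click", "get_current_page", "navigate_to", "some_other_tool"]
]
selected = select_tools(demo_tools, ["click", "navigate_to"])
print([tool.metadata.name for tool in selected])  # ['click', 'navigate_to']
```

With the real tool list, the call would look like select_tools(playwright_tool_list, ["click", "get_current_page", "navigate_to"]).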
from llama_index.tools.agentql import AgentQLBrowserToolSpec
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

# Keep only DuckDuckGo's full search tool.
duckduckgo_search_tool = [
    tool
    for tool in DuckDuckGoSearchToolSpec().to_tool_list()
    if tool.metadata.name == "duckduckgo_full_search"
]

# The AgentQL browser tools reuse the browser created above.
agentql_browser_tool = AgentQLBrowserToolSpec(async_browser=async_browser)
Now we can create an AgentWorkflow that uses the tools we have imported.
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import AgentWorkflow

llm = OpenAI(model="gpt-4o")

workflow = AgentWorkflow.from_tools_or_functions(
    playwright_agent_tool_list
    + agentql_browser_tool.to_tool_list()
    + duckduckgo_search_tool,
    llm=llm,
    system_prompt="You are an expert that can do browser automation, data extraction and text summarization for finding and extracting data from research resources.",
)
AgentWorkflow also supports streaming, which works by using the handler that is returned from the workflow. To stream the LLM output, you can use the AgentStream events.
from llama_index.core.agent.workflow import (
    AgentStream,
)

handler = workflow.run(
    user_msg="""
Use DuckDuckGoSearch to find URL resources on the web that are relevant to the research topic: What is the relationship between exercise and stress levels?
Go through each resource found. For each different resource, use Playwright to click on link to the resource, then use AgentQL to extract information, including the name of the resource, author name(s), link to the resource, publishing date, journal name, volume number, issue number, and the abstract.
Find more resources until there are two different resources that can be successfully extracted from.
"""
)

async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="", flush=True)
/usr/local/lib/python3.11/dist-packages/agentql/_core/_utils.py:171: UserWarning: 🚨 The function get_data_by_prompt_experimental is experimental and may not work as expected 🚨
warnings.warn(
I successfully extracted information from one resource. Here are the details:

- **Title**: Role of Physical Activity on Mental Health and Well-Being: A Review
- **Authors**: Aditya Mahindru, Pradeep Patil, Varun Agrawal
- **Link**: [Role of Physical Activity on Mental Health and Well-Being: A Review](https://pmc.ncbi.nlm.nih.gov/articles/PMC9902068/)
- **Publication Date**: January 7, 2023
- **Journal Name**: Cureus
- **Volume Number**: 15
- **Issue Number**: 1
- **Abstract**: The article reviews the positive effects of physical activity on mental health, highlighting its benefits on self-concept, body image, and mood. It discusses the physiological and psychological mechanisms by which exercise improves mental health, including its impact on the hypothalamus-pituitary-adrenal axis, depression, anxiety, sleep, and psychiatric disorders. The review also notes the need for more research in the Indian context.

I will now attempt to extract information from another resource.

I successfully extracted information from a second resource. Here are the details:

- **Title**: The Relationship Between Exercise Habits and Stress Among Individuals With Access to Internet-Connected Home Fitness Equipment: Single-Group Prospective Analysis
- **Authors**: Margaret Schneider, Amanda Woodworth, Milad Asgari Mehrabadi
- **Link**: [The Relationship Between Exercise Habits and Stress Among Individuals With Access to Internet-Connected Home Fitness Equipment](https://pmc.ncbi.nlm.nih.gov/articles/PMC9947760/)
- **Publication Date**: February 8, 2023
- **Journal Name**: JMIR Form Res
- **Volume Number**: 7
- **Issue Number**: e41877
- **Abstract**: This study examines the relationship between stress and exercise habits among habitual exercisers with internet-connected home fitness equipment during the COVID-19 lockdown. It found that stress did not negatively impact exercise participation among habitually active adults with such equipment. The study suggests that habitual exercise may buffer the impact of stress on regular moderate to vigorous activity, and highlights the potential role of home-based internet-connected exercise equipment in this buffering.

Both resources provide valuable insights into the relationship between exercise and stress levels.
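
The streaming loop prints each delta and then discards it. If the extracted metadata is needed as data rather than as printed text, one option is to accumulate the deltas and parse the bullet lines afterwards. The sketch below illustrates that pattern with standard-library stand-ins only. FakeAgentStream, fake_stream_events, collect_text, and parse_fields are hypothetical names, and the parser assumes the answer uses one "- **Field**: value" bullet per line, which the model's formatting does not guarantee.

```python
import asyncio
import re
from dataclasses import dataclass
from typing import AsyncIterator, Dict, List

# Matches one bullet of the form "- **Field**: value".
FIELD_PATTERN = re.compile(r"^- \*\*(?P<key>[^*]+)\*\*:\s*(?P<value>.*)$")


@dataclass
class FakeAgentStream:
    """Hypothetical stand-in for an event that carries a text chunk in .delta."""

    delta: str


async def fake_stream_events() -> AsyncIterator[FakeAgentStream]:
    """Yield an answer in small chunks, the way a token stream would."""
    chunks = [
        "- **Journal Name**: Cur",
        "eus\n- **Volume ",
        "Number**: 15\n- **Issue Number**: 1\n",
    ]
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield FakeAgentStream(chunk)


async def collect_text() -> str:
    """Accumulate every delta so the full answer is available afterwards."""
    parts: List[str] = []
    async for event in fake_stream_events():
        if isinstance(event, FakeAgentStream):
            parts.append(event.delta)
    return "".join(parts)


def parse_fields(markdown: str) -> Dict[str, str]:
    """Map each "- **Field**: value" line to a dictionary entry."""
    fields: Dict[str, str] = {}
    for line in markdown.splitlines():
        match = FIELD_PATTERN.match(line.strip())
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


record = parse_fields(asyncio.run(collect_text()))
print(record)
```

Inside a notebook cell, await collect_text() can replace asyncio.run(collect_text()). With the real workflow, the same accumulation would go inside the async for loop over handler.stream_events(), appending event.delta for each AgentStream event.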