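Building a Browser Agent with AgentQL¶
AgentQL tools provide web interaction and structured data extraction from any web page, driven by either an AgentQL query or a natural language prompt. AgentQL works across multiple languages and web pages, and queries do not break as pages change over time.
This tutorial shows you how to:
- Create a browser agent with AgentQL tools and LlamaIndex
- Navigate the internet with AgentQL tools
- Scrape content from the internet with AgentQL tools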
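Overview¶
AgentQL provides three function tools. The first does not require a browser and relies on the REST API:
- extract_web_data_with_rest_api: extracts structured data as JSON from the web page at a given URL, using either an AgentQL query or a natural language description of the data.
The other two tools must be used with a Playwright browser or with a remote browser instance reached through the Chrome DevTools Protocol (CDP):
- extract_web_data_from_browser: extracts structured data as JSON from the active web page in the browser, using either an AgentQL query or a natural language description.
- get_web_element_from_browser: finds a web element on the active web page in the browser from a natural language description and returns its CSS selector for further interaction.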
Tool features¶
| Tool | Web data extraction | Web element extraction | Use with local browser |
|---|---|---|---|
| extract_web_data_with_rest_api | ✅ | ❌ | ❌ |
| extract_web_data_from_browser | ✅ | ❌ | ✅ |
| get_web_element_from_browser | ❌ | ✅ | ✅ |
Setup¶
%pip install llama-index-tools-agentql llama-index-tools-playwright llama-index
Credentials¶
To use the AgentQL tools, get your own API key from the AgentQL Dev Portal and set the AGENTQL_API_KEY environment variable:
import os
os.environ["AGENTQL_API_KEY"] = "YOUR_AGENTQL_API_KEY"
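Hard-coding a key in a notebook makes it easy to commit by accident. As a minimal sketch (the helper name `ensure_api_key` is hypothetical and not part of AgentQL or LlamaIndex), the key can be read from the environment and requested interactively only when it is missing:

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(name: str, ask: Callable[[str], str] = getpass) -> str:
    """Return the value of environment variable `name`, asking once if unset.

    `ask` is injectable so the helper can be used without a terminal;
    by default it prompts without echoing the key.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        value = ask(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        # Store it so libraries that read the variable later can find it.
        os.environ[name] = value
    return value
```

Calling `ensure_api_key("AGENTQL_API_KEY")` before creating the tools leaves an already exported key untouched and prompts otherwise.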
Set up the Playwright browser and AgentQL tools¶
To run this notebook, install the Playwright browsers and configure the asyncio loop for Jupyter Notebook.
!playwright install
# This import is required only for jupyter notebooks, since they have their own eventloop
import nest_asyncio
nest_asyncio.apply()
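Instantiation¶
AgentQLRestAPIToolSpec¶
AgentQLRestAPIToolSpec provides the extract_web_data_with_rest_api function tool.
You can instantiate AgentQLRestAPIToolSpec with the following parameters:
- timeout: the number of seconds to wait for a request before timing out. Increase it if data extraction times out. Defaults to 900.
- is_stealth_mode_enabled: whether to enable experimental anti-bot evasion strategies. This feature may not work for all websites at all times, and data extraction may take longer to complete with it enabled. Defaults to False.
- wait_for: the number of seconds to wait for the page to load before extracting data. Defaults to 0.
- is_scroll_to_bottom_enabled: whether to scroll to the bottom of the page before extracting data. Defaults to False.
- mode: "standard" uses deep data analysis, while "fast" trades some depth of analysis for speed and is adequate for most use cases. Learn more about the modes in the AgentQL guide. Defaults to "fast".
- is_screenshot_enabled: whether to take a screenshot before extracting data. The screenshot is returned in "metadata" as a Base64 string. Defaults to False.
AgentQLRestAPIToolSpec uses the AgentQL REST API. For more details about the parameters, read the API reference documentation.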
from llama_index.tools.agentql import AgentQLRestAPIToolSpec
agentql_rest_api_tool = AgentQLRestAPIToolSpec()
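When several options are overridden at once, a misspelled keyword is easy to miss. The sketch below collects the defaults listed above in one place and rejects unknown option names before they reach the constructor. The helper `rest_api_options` and the constant `REST_API_DEFAULTS` are hypothetical names introduced for illustration, and the check on `mode` assumes that "fast" and "standard" are the only accepted values, since those are the two this page describes.

```python
from typing import Any, Dict

# Defaults copied from the parameter list for AgentQLRestAPIToolSpec.
REST_API_DEFAULTS: Dict[str, Any] = {
    "timeout": 900,
    "is_stealth_mode_enabled": False,
    "wait_for": 0,
    "is_scroll_to_bottom_enabled": False,
    "mode": "fast",
    "is_screenshot_enabled": False,
}


def rest_api_options(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = sorted(set(overrides) - set(REST_API_DEFAULTS))
    if unknown:
        raise TypeError(f"Unknown option(s): {', '.join(unknown)}")
    options = {**REST_API_DEFAULTS, **overrides}
    # Assumption: only the two modes described on this page are valid.
    if options["mode"] not in ("fast", "standard"):
        raise ValueError("mode must be 'fast' or 'standard'")
    return options
```

For a slow, lazily loaded page, the result could then be passed on as `AgentQLRestAPIToolSpec(**rest_api_options(wait_for=5, is_scroll_to_bottom_enabled=True))`.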
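AgentQLBrowserToolSpec¶
AgentQLBrowserToolSpec provides two tools: extract_web_data_from_browser and get_web_element_from_browser.
This tool spec can be instantiated with the following parameters:
- async_browser: an async Playwright browser instance.
- timeout_for_data: the number of seconds to wait for an extract-data request before timing out. Defaults to 900.
- timeout_for_element: the number of seconds to wait for a get-element request before timing out. Defaults to 900.
- wait_for_network_idle: whether to wait until the network is fully idle before executing. Defaults to True.
- include_hidden_for_data: whether to take visually hidden elements on the page into account when extracting data. Defaults to True.
- include_hidden_for_element: whether to take visually hidden elements on the page into account when getting an element. Defaults to False.
- mode: "standard" uses deep data analysis, while "fast" trades some depth of analysis for speed and is adequate for most use cases. Learn more about the modes in the AgentQL guide. Defaults to "fast".
AgentQLBrowserToolSpec uses the AgentQL SDK. You can find more details about the parameters and functions in the SDK API reference.
Note: To instantiate AgentQLBrowserToolSpec, you need to provide a browser instance. You can create one with the create_async_playwright_browser utility method from LlamaIndex's Playwright ToolSpec.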
from llama_index.tools.playwright.base import PlaywrightToolSpec
from llama_index.tools.agentql import AgentQLBrowserToolSpec
async_browser = await PlaywrightToolSpec.create_async_playwright_browser()
agentql_browser_tool = AgentQLBrowserToolSpec(async_browser=async_browser)
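Invoking the AgentQL tools¶
extract_web_data_with_rest_api¶
This tool uses AgentQL's REST API under the hood and sends the URL of a publicly available web page to AgentQL's endpoint. It does not work with private pages or logged-in sessions; use extract_web_data_from_browser for those cases.
- url: the URL of the web page you want to extract data from.
- query: the AgentQL query to execute. Use this if you want to extract data in a structure you define. Learn how to write an AgentQL query in the docs.
- prompt: a natural language description of the data to extract from the page. AgentQL infers the structure of the data from your prompt.
Note: You must define either query or prompt to use AgentQL.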
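The requirement that one of `query` or `prompt` be defined can be checked before any network call is made. This is a minimal sketch with a hypothetical helper, `extraction_args`. It also rejects the case where both are given, which is a choice made here to keep calls unambiguous and not documented AgentQL behavior.

```python
from typing import Dict, Optional


def extraction_args(
    query: Optional[str] = None, prompt: Optional[str] = None
) -> Dict[str, str]:
    """Return the keyword arguments for exactly one of `query` or `prompt`."""
    has_query = bool(query and query.strip())
    has_prompt = bool(prompt and prompt.strip())
    # Equal flags mean either both or neither were supplied.
    if has_query == has_prompt:
        raise ValueError("Provide exactly one of `query` or `prompt`")
    return {"query": query} if has_query else {"prompt": prompt}
```

The returned dictionary can be unpacked into a tool call, for example `extract_web_data_with_rest_api(url=..., **extraction_args(prompt="the blog posts with title and url"))`.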
# You can invoke the tool with either a query or a prompt
# await agentql_rest_api_tool.extract_web_data_with_rest_api(
# url="https://www.agentql.com/blog",
# prompt="the blog posts with title, url, author and publication date",
# )
await agentql_rest_api_tool.extract_web_data_with_rest_api(
url="https://www.agentql.com/blog",
query="{ posts[] { title url author date }}",
)
{'data': {'posts': [{'title': 'AgentQL MCP Server: Structured Web Data for Claude, Cursor, Windsurf, and more', 'url': 'https://www.agentql.com/blog/2025-mcp-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 12, 2025'}, {'title': 'Dify + AgentQL: Build AI Apps with Live Web Data, No Code Needed', 'url': 'https://www.agentql.com/blog/2025-dify-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 11, 2025'}, {'title': 'Zapier + AgentQL: No-Code Web Data for Smarter Workflows', 'url': 'https://www.agentql.com/blog/2025-zapier-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 10, 2025'}, {'title': 'Something is coming.', 'url': 'https://www.agentql.com/blog/2025-iw-teaser', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 7, 2025'}, {'title': 'Automated web application testing with AI and Playwright', 'url': 'https://www.agentql.com/blog/2025-automated-testing-web-ai-playwright', 'author': 'Vladimir de Turckheim', 'date': 'Feb 26, 2025'}]}, 'metadata': {'request_id': '5a43ab86-f68b-4470-bca9-ab51a791041a', 'generated_query': None, 'screenshot': None}}
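The REST tool returns a dictionary with the extracted records under `data` and request details under `metadata`, as the output above shows. A short sketch of consuming that shape follows; the function `posts_by_author` is a hypothetical helper, and `sample` is an abbreviated copy of the output above rather than a fresh API response.

```python
from typing import Any, Dict, List


def posts_by_author(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group post titles by author from an extract_web_data_with_rest_api result."""
    grouped: Dict[str, List[str]] = {}
    posts = (result.get("data") or {}).get("posts", [])
    for post in posts:
        author = post.get("author") or "unknown"
        grouped.setdefault(author, []).append(post["title"])
    return grouped


# Abbreviated copy of the result printed above.
sample = {
    "data": {
        "posts": [
            {
                "title": "Zapier + AgentQL: No-Code Web Data for Smarter Workflows",
                "url": "https://www.agentql.com/blog/2025-zapier-integration",
                "author": "Rachel-Lee Nabors",
                "date": "Mar 10, 2025",
            },
            {
                "title": "Something is coming.",
                "url": "https://www.agentql.com/blog/2025-iw-teaser",
                "author": "Rachel-Lee Nabors",
                "date": "Mar 7, 2025",
            },
            {
                "title": "Automated web application testing with AI and Playwright",
                "url": "https://www.agentql.com/blog/2025-automated-testing-web-ai-playwright",
                "author": "Vladimir de Turckheim",
                "date": "Feb 26, 2025",
            },
        ]
    },
    "metadata": {"request_id": "5a43ab86-f68b-4470-bca9-ab51a791041a"},
}

for author, titles in posts_by_author(sample).items():
    print(author, len(titles))
```

The keys inside each post (`title`, `url`, `author`, `date`) mirror the fields named in the query, so changing the query changes what the helper can rely on.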
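Stealth mode¶
AgentQL provides experimental anti-bot evasion strategies to avoid detection by anti-bot services.
Note: Stealth mode is experimental and may not work for all websites at all times. Data extraction may take longer to complete than in non-stealth mode.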
# agentql_rest_api_tool = AgentQLRestAPIToolSpec(is_stealth_mode_enabled=True)
await agentql_rest_api_tool.extract_web_data_with_rest_api(
url="https://www.patagonia.com/shop/web-specials/womens",
query="{ items[] { name price}}",
)
{'data': {'items': [{'name': "W's Recycled Down Sweater™ Parka - Pitch Blue (PIBL) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Shelter Brown (SHBN) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Pine Needle Green (PNGR) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Burnished Red (BURR) (28460)", 'price': 178.99}, {'name': "W's Nano Puff® Jacket - Burnished Red (BURR) (84217)", 'price': 118.99}, {'name': "W's Nano Puff® Jacket - Pine Needle Green (PNGR) (84217)", 'price': 118.99}, {'name': "W's Powder Town Jacket - Vivid Apricot (VAPC) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Pine Needle Green (PNGR) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Dulse Mauve (DLMA) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Smolder Blue w/Dulse Mauve (SBMA) (31635)", 'price': 208.99}, {'name': "W's Powder Town Pants - Pine Needle Green (PNGR) (31645)", 'price': 148.99}, {'name': "W's Powder Town Pants - Thermal Blue (TMBL) (31645)", 'price': 173.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Dulse Mauve (DLMA) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Synched Flight Small: Natural (SYNL) (25455)", 'price': 96.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Thermal Blue (TMBL) (25455)", 'price': 82.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Across Oceans: Pitch Blue (ASPH) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Terra Pink (TRPI) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Small Currents: Natural (SCNL) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Nickel w/Vivid Apricot (NLVA) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Echo Purple (ECPU) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® 
Snap-T® Pullover - Oatmeal Heather w/Vessel Blue (OHVL) (25455)", 'price': 68.99}, {'name': "W's Down Sweater™ - Seabird Grey (SBDY) (84684)", 'price': 166.99}, {'name': "W's Pine Bank 3-in-1 Parka - Shelter Brown (SHBN) (21025)", 'price': 273.99}, {'name': "W's Pine Bank 3-in-1 Parka - Pitch Blue (PIBL) (21025)", 'price': 328.99}, {'name': "W's Pine Bank 3-in-1 Parka - Burnished Red (BURR) (21025)", 'price': 273.99}, {'name': "W's Pine Bank 3-in-1 Parka - Pine Needle Green (PNGR) (21025)", 'price': 273.99}, {'name': "W's SnowDrifter Jacket - Vessel Blue (VSLB) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Dulse Mauve (DLMA) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Vivid Apricot (VAPC) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Thermal Blue (TMBL) (30071)", 'price': 268.99}, {'name': "W's Re-Tool Half-Snap Pullover - Burnished Red (BURR) (26465)", 'price': 78.99}, {'name': "W's Re-Tool Half-Snap Pullover - Vessel Blue (VSLB) (26465)", 'price': 94.99}, {'name': "W's Re-Tool Half-Snap Pullover - Dulse Mauve (DLMA) (26465)", 'price': 78.99}, {'name': "W's Re-Tool Half-Snap Pullover - Shelter Brown (SHBN) (26465)", 'price': 78.99}, {'name': "W's Insulated Storm Shift Jacket - Dulse Mauve (DLMA) (31835)", 'price': 383.99}, {'name': "W's Insulated Storm Shift Jacket - Pine Needle Green (PNGR) (31835)", 'price': 328.99}, {'name': "W's SnowDrifter Bibs - Black (BLK) (30081)", 'price': 238.99}, {'name': "W's SnowDrifter Bibs - Smolder Blue (SMDB) (30081)", 'price': 278.99}, {'name': "W's SnowDrifter Bibs - Dulse Mauve (DLMA) (30081)", 'price': 238.99}, {'name': "W's SnowDrifter Bibs - Pine Needle Green (PNGR) (30081)", 'price': 238.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Chevron Cable: Natural (CHNL) (51025)", 'price': 73.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Only Earth: Beeswax Tan (OETN) (51025)", 'price': 103.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - 
Snowdrift: Thermal Blue (SDTL) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Ridge: Pine Needle Green (RPNG) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Chevron Cable: Madder Red (CHMR) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Smolder Blue (SMDB) (51025)", 'price': 73.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Fireside: Shelter Brown (FISN) (51025)", 'price': 73.99}, {'name': "W's Micro D® Joggers - Synched Flight Small: Natural (SYNL) (22020)", 'price': 48.99}, {'name': "W's Micro D® Joggers - Endless Blue (ENLB) (22020)", 'price': 58.99}, {'name': "W's Micro D® Joggers - Small Currents: Natural (SCNL) (22020)", 'price': 48.99}, {'name': "W's Better Sweater® 1/4-Zip - Stormy Mauve (STMA) (25618)", 'price': 68.99}, {'name': "W's Better Sweater® 1/4-Zip - Dulse Mauve (DLMA) (25618)", 'price': 82.99}, {'name': "W's Better Sweater® 1/4-Zip - Torrey Pine Green (TPGN) (25618)", 'price': 82.99}, {'name': "W's Better Sweater® 1/4-Zip - Nouveau Green (NUVG) (25618)", 'price': 68.99}, {'name': "W's Better Sweater® 1/4-Zip - Raptor Brown (RPBN) (25618)", 'price': 68.99}, {'name': "W's Insulated Powder Town Pants - Black (BLK) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Smolder Blue (SMDB) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Dulse Mauve (DLMA) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Vivid Apricot (VAPC) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Across Oceans: Smolder Blue (ASBE) (31185)", 'price': 160.99}, {'name': 'Atom Sling 8L - Vessel Blue (VSLB) (48262)', 'price': 44.99}, {'name': 'Atom Sling 8L - Buckhorn Green (BUGR) (48262)', 'price': 44.99}, {'name': 'Atom Sling 8L - Dulse Mauve (DLMA) (48262)', 'price': 44.99}, {'name': "W's Classic Retro-X® Jacket - Natural w/Smolder Blue (NTSB) (23074)", 'price': 136.99}, {'name': "W's 
Classic Retro-X® Jacket - Nest Brown w/Dulse Mauve (NBDU) (23074)", 'price': 113.99}, {'name': "W's Classic Retro-X® Jacket - Small Currents: Natural (SCNL) (23074)", 'price': 113.99}, {'name': "W's Los Gatos 1/4-Zip - Salt Grey (SGRY) (25236)", 'price': 53.99}, {'name': "W's Los Gatos 1/4-Zip - Dulse Mauve (DLMA) (25236)", 'price': 64.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Nest Brown (NESB) (75100)", 'price': 68.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Pitch Blue (PIBL) (75100)", 'price': 68.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Beeswax Tan (BWX) (75100)", 'price': 68.99}, {'name': "W's Synchilla® Jacket - Oatmeal Heather w/Natural (OTNL) (22955)", 'price': 88.99}, {'name': "W's Synchilla® Jacket - Black (BLK) (22955)", 'price': 73.99}, {'name': "W's Synchilla® Jacket - Pitch Blue (PIBL) (22955)", 'price': 73.99}, {'name': "W's Synchilla® Jacket - Beeswax Tan (BWX) (22955)", 'price': 73.99}, {'name': "W's Insulated Powder Town Jacket - Vivid Apricot (VAPC) (31200)", 'price': 238.99}, {'name': "W's Insulated Powder Town Jacket - Black (BLK) (31200)", 'price': 278.99}, {'name': "W's Insulated Powder Town Jacket - Across Oceans: Smolder Blue (ASBE) (31200)", 'price': 238.99}, {'name': "W's Powder Town Bibs - Smolder Blue (SMDB) (31650)", 'price': 178.99}, {'name': "W's Powder Town Bibs - Dulse Mauve (DLMA) (31650)", 'price': 208.99}, {'name': "W's Powder Town Bibs - Pine Needle Green (PNGR) (31650)", 'price': 178.99}, {'name': "W's Powder Town Bibs - Seabird Grey (SBDY) (31650)", 'price': 178.99}, {'name': "W's Retro Pile Marsupial - Thermal Blue (TMBL) (22835)", 'price': 73.99}, {'name': "W's Retro Pile Marsupial - Shroom Taupe (STPE) (22835)", 'price': 88.99}, {'name': "W's Retro Pile Marsupial - Shelter Brown (SHBN) (22835)", 'price': 73.99}, {'name': "W's Cord Fjord Coat - Dulse Mauve (DLMA) (26881)", 'price': 163.99}, {'name': "W's Cord Fjord Coat - Shelter Brown (SHBN) (26881)", 'price': 163.99}, {'name': 
"W's Regenerative Organic Certified® Cotton Essential Top - Thermal Blue (TMBL) (42171)", 'price': 41.99}, {'name': "W's Regenerative Organic Certified® Cotton Essential Top - Pine Needle Green (PNGR) (42171)", 'price': 41.99}, {'name': "W's Lonesome Mesa Long Coat - Pitch Blue (PIBL) (26655)", 'price': 148.99}, {'name': "W's Lonesome Mesa Long Coat - Pine Needle Green (PNGR) (26655)", 'price': 148.99}]}, 'metadata': {'request_id': '0016c761-92c1-47b5-9b8f-f71f9727d58d', 'generated_query': None, 'screenshot': None}}
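extract_web_data_from_browser¶
- query: the AgentQL query to execute. Use this if you want to extract data in a structure you define. Learn how to write an AgentQL query in the docs.
- prompt: a natural language description of the data to extract from the page. AgentQL infers the structure of the data from your prompt.
Note: You must define either query or prompt to use AgentQL.
To extract data, you must first navigate to a web page with the navigate_to tool from LlamaIndex's Playwright tool spec.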
playwright_tool = PlaywrightToolSpec(async_browser=async_browser)
await playwright_tool.navigate_to("https://www.agentql.com/blog")
# You can invoke the tool with either a query or a prompt
# await agentql_browser_tool.extract_web_data_from_browser(
# query="{ posts[] { title url }}",
# )
await agentql_browser_tool.extract_web_data_from_browser(
prompt="the blog posts with title and url",
)
/Users/jisonz/Library/Caches/pypoetry/virtualenvs/llama-index-AJEGkUS0-py3.13/lib/python3.13/site-packages/agentql/_core/_utils.py:167: UserWarning: 🚨 The function get_data_by_prompt_experimental is experimental and may not work as expected 🚨
warnings.warn(
{'blog_post': [{'title': 'AgentQL MCP Server: Structured Web Data for Claude, Cursor, Windsurf, and more', 'url': 'https://www.agentql.com/blog/2025-mcp-integration'}, {'title': 'Dify + AgentQL: Build AI Apps with Live Web Data, No Code Needed', 'url': 'https://www.agentql.com/blog/2025-dify-integration'}, {'title': 'Zapier + AgentQL: No-Code Web Data for Smarter Workflows', 'url': 'https://www.agentql.com/blog/2025-zapier-integration'}, {'title': 'Something is coming.', 'url': 'https://www.agentql.com/blog/2025-iw-teaser'}, {'title': 'Automated web application testing with AI and Playwright', 'url': 'https://www.agentql.com/blog/2025-automated-testing-web-ai-playwright'}]}
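In the outputs shown on this page, the browser tool returns the extracted dictionary directly, while the REST tool wraps it in `data` and `metadata`. With a prompt, the top-level key (`blog_post` here) is inferred by AgentQL and may differ between runs. The sketch below normalizes both shapes without relying on that key. The helper `extract_records` is hypothetical, and treating a result as a REST envelope whenever it has both `data` and `metadata` is an assumption based only on these sample outputs.

```python
from typing import Any, Dict, List


def extract_records(result: Dict[str, Any]) -> List[Any]:
    """Return the first list found in an AgentQL result, whatever its key."""
    payload = result
    # Assumption: a dict with both keys is the REST envelope seen earlier.
    if isinstance(result.get("data"), dict) and "metadata" in result:
        payload = result["data"]
    for value in payload.values():
        if isinstance(value, list):
            return value
    return []


browser_result = {
    "blog_post": [
        {
            "title": "Something is coming.",
            "url": "https://www.agentql.com/blog/2025-iw-teaser",
        }
    ]
}
print(extract_records(browser_result)[0]["title"])
```

When a stable key matters, passing a `query` instead of a `prompt` fixes the structure up front, as the commented-out call above does.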
get_web_element_from_browser¶
- prompt: a natural language description of the web element to find on the page.
await playwright_tool.navigate_to("https://www.agentql.com/blog")
print(await playwright_tool.get_current_page())
next_page_button = await agentql_browser_tool.get_web_element_from_browser(
prompt="The next page navigation button",
)
next_page_button
https://www.agentql.com/blog
"[tf623_id='1111']"
Click the element and check the URL again:
await playwright_tool.click(next_page_button)
"Clicked element '[tf623_id='1111']'"
print(await playwright_tool.get_current_page())
https://www.agentql.com/blog/page/2
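The find, click, and check-URL steps above can be repeated to walk through several pages. The following is a sketch only: `follow_next_links` is a hypothetical helper that takes the three operations as async callables, so it does not depend on any particular library. It assumes that a falsy selector, or a URL that does not change after the click, means there is no further page; the tools themselves may signal that differently.

```python
from typing import Any, Awaitable, Callable, List


async def follow_next_links(
    find_element: Callable[[str], Awaitable[Any]],
    click: Callable[[Any], Awaitable[Any]],
    current_url: Callable[[], Awaitable[str]],
    max_clicks: int = 3,
    prompt: str = "The next page navigation button",
) -> List[str]:
    """Click the described element repeatedly and return the URLs visited."""
    visited = [await current_url()]
    for _ in range(max_clicks):
        selector = await find_element(prompt)
        if not selector:
            break  # assumption: nothing found means the last page was reached
        await click(selector)
        url = await current_url()
        if url == visited[-1]:
            break  # the click did not navigate anywhere new
        visited.append(url)
    return visited
```

In this notebook the callables could be `lambda p: agentql_browser_tool.get_web_element_from_browser(prompt=p)`, `playwright_tool.click`, and `playwright_tool.get_current_page`, with the call itself awaited in a cell.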
Using AgentQL tools with an agent¶
First, you need an OpenAI API key.
# set your openai key, if using openai
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
from llama_index.core.agent import FunctionCallingAgent
from llama_index.llms.openai import OpenAI
# We add playwright's click, get_current_page, and navigate_to tools to the agent along with agentql tools
playwright_tool = PlaywrightToolSpec(async_browser=async_browser)
playwright_tool_list = playwright_tool.to_tool_list()
playwright_agent_tool_list = [
tool
for tool in playwright_tool_list
if tool.metadata.name in ["click", "get_current_page", "navigate_to"]
]
agent = FunctionCallingAgent.from_tools(
playwright_agent_tool_list + agentql_browser_tool.to_tool_list(),
llm=OpenAI(model="gpt-4o"),
)
print(
agent.chat(
"""
Navigate to https://blog.samaltman.com/archive,
Find blog posts titled "What I wish someone had told me", click on the link,
Extract the blog text and number of views.
"""
)
)
I have extracted the blog post titled "What I wish someone had told me" along with the number of views. Here are the details: **Blog Text:** > Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started. Cohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time. It is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people. Incentives are superpowers; set them carefully. Concentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think. Communicate clearly and concisely. Fight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together. Outcomes are what count; don’t let good process excuse bad results. Spend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence. Superstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization. Fast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. Plans should be measured in decades, execution should be measured in weeks. Don’t fight the business equivalent of the laws of physics. Inspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk. Scale often has surprising emergent properties. Compounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale. Get back up and keep going. 
Working with great people is one of the best parts of life. **Number of Views:** 531,222
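The list comprehension that picks `click`, `get_current_page`, and `navigate_to` silently returns fewer tools if a name is misspelled or a tool is renamed. As a hedged alternative, the hypothetical helper below selects tools by name and fails loudly when one is missing. It relies only on each tool exposing `metadata.name`, as the code above already does.

```python
from typing import Any, Iterable, List, Sequence


def select_tools(tools: Iterable[Any], names: Sequence[str]) -> List[Any]:
    """Return the tools whose metadata.name is in `names`, in that order."""
    by_name = {tool.metadata.name: tool for tool in tools}
    missing = [name for name in names if name not in by_name]
    if missing:
        raise KeyError(f"Tools not found: {missing}")
    return [by_name[name] for name in names]
```

With the objects from this notebook, the call would be `select_tools(playwright_tool.to_tool_list(), ["click", "get_current_page", "navigate_to"])`.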
Using the Playwright tools in an agent workflow¶
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.agent.workflow import (
AgentInput,
AgentOutput,
ToolCall,
ToolCallResult,
AgentStream,
)
playwright_tool_list = playwright_tool.to_tool_list()
playwright_agent_tool_list = [
tool
for tool in playwright_tool_list
if tool.metadata.name in ["click", "get_current_page", "navigate_to"]
]
llm = OpenAI(model="gpt-4o")
workflow = AgentWorkflow.from_tools_or_functions(
playwright_agent_tool_list + agentql_browser_tool.to_tool_list(),
llm=llm,
system_prompt="You are a helpful assistant that can do browser automation, data extraction and text summarization",
)
handler = workflow.run(
user_msg="""
Navigate to https://blog.samaltman.com/archive,
Find blog posts titled "What I wish someone had told me", click on the link,
Detect if the webpage has navigated to the blog post,
then extract the blog text and number of views.
"""
)
async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="", flush=True)
    elif isinstance(event, ToolCallResult):
        print(event.tool_name)  # the tool name
        print(event.tool_kwargs)  # the tool kwargs
        print(event.tool_output)  # the tool output
navigate_to {'url': 'https://blog.samaltman.com/archive'} Navigating to https://blog.samaltman.com/archive returned status code 200 get_web_element_from_browser {'prompt': "blog post titled 'What I wish someone had told me'"} [tf623_id='1849'] click {'selector': "[tf623_id='1849']"} Clicked element '[tf623_id='1849']' get_current_page {} https://blog.samaltman.com/what-i-wish-someone-had-told-me extract_web_data_from_browser {'prompt': 'Extract the blog text and number of views from the page.'} {'blog_post_text': 'Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started.\nCohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time.\nIt is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people.\nIncentives are superpowers; set them carefully.\nConcentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think.\nCommunicate clearly and concisely.\nFight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together.\nOutcomes are what count; don’t let good process excuse bad results.\nSpend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence.\nSuperstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization.\nFast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. 
Plans should be measured in decades, execution should be measured in weeks.\nDon’t fight the business equivalent of the laws of physics.\nInspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk.\nScale often has surprising emergent properties.\nCompounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale.\nGet back up and keep going.\nWorking with great people is one of the best parts of life.', 'views_count': 531223} I have navigated to the blog post titled "What I Wish Someone Had Told Me" and extracted the following information: **Blog Text:** Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started. Cohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time. It is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people. Incentives are superpowers; set them carefully. Concentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think. Communicate clearly and concisely. Fight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together. Outcomes are what count; don’t let good process excuse bad results. Spend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence. Superstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization. 
Fast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. Plans should be measured in decades, execution should be measured in weeks. Don’t fight the business equivalent of the laws of physics. Inspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk. Scale often has surprising emergent properties. Compounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale. Get back up and keep going. Working with great people is one of the best parts of life. **Number of Views:** 531,223
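Tool outputs such as the extracted blog text can run to thousands of characters, which makes the streamed log hard to scan. A minimal sketch of a one-line summary per tool call follows; `summarize_tool_call` is a hypothetical helper and the 200-character limit is arbitrary.

```python
from typing import Any, Mapping


def summarize_tool_call(
    name: str, kwargs: Mapping[str, Any], output: Any, max_chars: int = 200
) -> str:
    """Format a tool call on one line, truncating long outputs."""
    # Collapse newlines and repeated spaces so the summary stays on one line.
    text = " ".join(str(output).split())
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + "..."
    return f"{name}({dict(kwargs)!r}) -> {text}"


print(summarize_tool_call("click", {"selector": "[tf623_id='1849']"}, "Clicked element"))
```

Inside the `ToolCallResult` branch of the loop above, the three print calls could be replaced with `print(summarize_tool_call(event.tool_name, event.tool_kwargs, event.tool_output))`.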