Vertex AI Search Retriever
This notebook walks you through setting up a Retriever that fetches data from a Vertex AI Search data store.
Prerequisites
- Set up a Google Cloud project
- Set up a Vertex AI Search data store
- Enable the Vertex AI API
Install the library
%pip install llama-index-retrievers-vertexai-search
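Restart the current runtime
To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do so by running the cell below, which restarts the current kernel.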
# Colab only
# Automatically restart kernel after installs so that your environment can access the new packages
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
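Authenticate your notebook environment (Colab only)
If you are running this notebook on Google Colab, you need to authenticate your environment. To do so, run the cell below. If you are using Vertex AI Workbench, this step is not required.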
# Colab only
import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()
# If you're using a JupyterLab instance, uncomment and run the command below.
#!gcloud auth login
from llama_index.retrievers.vertexai_search import VertexAISearchRetriever

# Note: the module name uses an underscore: vertexai_search
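Set Google Cloud project information and initialize the Vertex AI SDK
To get started with Vertex AI, you must have an existing Google Cloud project and enable the Vertex AI API.
Learn more about setting up a project and a development environment.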
PROJECT_ID = "{your project id}" # @param {type:"string"}
LOCATION = "us-central1" # @param {type:"string"}
import vertexai
vertexai.init(project=PROJECT_ID, location=LOCATION)
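The cell above ships with a "{your project id}" placeholder, which is easy to forget to replace. A minimal sketch of a guard is shown below. It assumes you are willing to fall back to an environment variable; the helper name `resolve_setting` is hypothetical and not part of the Vertex AI SDK or LlamaIndex, and the `GOOGLE_CLOUD_PROJECT` variable name is just a commonly used convention.

```python
import os


def resolve_setting(explicit_value: str, env_var: str) -> str:
    """Return explicit_value unless it is empty or still a "{...}" placeholder.

    Hypothetical helper for illustration only. When the explicit value is
    unusable, the environment variable named by env_var is tried instead.
    """
    value = explicit_value.strip()
    is_placeholder = not value or (value.startswith("{") and value.endswith("}"))
    if not is_placeholder:
        return value

    fallback = os.environ.get(env_var, "").strip()
    if not fallback:
        raise ValueError(
            f"Replace the placeholder {explicit_value!r} or set the {env_var} environment variable."
        )
    return fallback


# An explicit, non-placeholder value is returned unchanged.
print(resolve_setting("my-project", "GOOGLE_CLOUD_PROJECT"))
```

One possible use is `PROJECT_ID = resolve_setting(PROJECT_ID, "GOOGLE_CLOUD_PROJECT")` before calling `vertexai.init`, so that a forgotten placeholder fails early with a readable message.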
Test a structured data store
DATA_STORE_ID = "{your id}" # @param {type:"string"}
LOCATION_ID = "global"
struct_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=1,
)
query = "harry potter"
retrieved_results = struct_retriever.retrieve(query)
print(retrieved_results[0])
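Printing only the first result hides the rest of the ranking. The sketch below shows one way to list the top few hits. It assumes each result exposes a `score` attribute and a `get_content()` method, as LlamaIndex's `NodeWithScore` does; the `FakeResult` class is a stand-in so the example runs without a data store, and `summarize_results` is a hypothetical helper, not part of the retriever package.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional


@dataclass
class FakeResult:
    """Stand-in for a retrieved node; only mimics `score` and `get_content()`."""

    text: str
    score: Optional[float] = None

    def get_content(self) -> str:
        return self.text


def summarize_results(results: Iterable[Any], limit: int = 3, width: int = 80) -> List[str]:
    """Return one short line per result: rank, score, and a trimmed snippet."""
    lines = []
    for rank, item in enumerate(results, start=1):
        if rank > limit:
            break
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        # Collapse runs of whitespace so multi-line content fits on one line.
        snippet = " ".join(item.get_content().split())[:width]
        lines.append(f"{rank}. score={score} {snippet}")
    return lines


demo = [
    FakeResult("Harry   Potter and\nthe Philosopher's Stone", 0.91234),
    FakeResult("A result without a score"),
]
for line in summarize_results(demo):
    print(line)
```

With a live retriever you would pass `retrieved_results` in place of `demo`. The score may be missing for some results, which is why the helper treats `None` separately.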
Test an unstructured data store
DATA_STORE_ID = "{your id}"
LOCATION_ID = "global"
unstruct_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=0,
)
query = "alphabet 2018 earning"
retrieved_results2 = unstruct_retriever.retrieve(query)
print(retrieved_results2[0])
Test a website data store
DATA_STORE_ID = "{your id}"
LOCATION_ID = "global"
website_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=2,
)
query = "what's diamaxol"
retrieved_results3 = website_retriever.retrieve(query)
print(retrieved_results3[0])
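The three sections above differ only in the integer passed as `engine_data_type`: 0 for unstructured, 1 for structured, and 2 for website data stores. As a readability aid, the sketch below names those values. The `EngineDataType` enum and the `retriever_kwargs` helper are hypothetical and not part of `llama-index-retrievers-vertexai-search`; the numeric values are taken from the cells in this notebook.

```python
from enum import IntEnum
from typing import Any, Dict


class EngineDataType(IntEnum):
    """Named versions of the integers used in this notebook (illustrative only)."""

    UNSTRUCTURED = 0
    STRUCTURED = 1
    WEBSITE = 2


def retriever_kwargs(
    project_id: str,
    data_store_id: str,
    data_type: EngineDataType,
    location_id: str = "global",
) -> Dict[str, Any]:
    """Build the keyword arguments used by the retriever cells above."""
    return {
        "project_id": project_id,
        "data_store_id": data_store_id,
        "location_id": location_id,
        # Pass a plain int, matching the values used in the notebook cells.
        "engine_data_type": int(data_type),
    }


print(retriever_kwargs("my-project", "my-data-store", EngineDataType.WEBSITE))
```

The resulting dictionary could then be unpacked into the constructor, for example `VertexAISearchRetriever(**retriever_kwargs(PROJECT_ID, DATA_STORE_ID, EngineDataType.WEBSITE))`.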
Use in a query engine
# import modules needed
from llama_index.core import Settings
from llama_index.llms.vertex import Vertex
from llama_index.embeddings.vertex import VertexTextEmbedding
vertex_gemini = Vertex(
    model="gemini-1.5-pro",
    temperature=0,
    context_window=100000,
    additional_kwargs={},
)
# setup the index/query llm
Settings.llm = vertex_gemini
from llama_index.core.query_engine import RetrieverQueryEngine
query_engine = RetrieverQueryEngine.from_args(struct_retriever)
response = query_engine.query("Tell me about harry potter")
print(str(response))