IBM watsonx.ai

WatsonxEmbeddings is a wrapper for IBM watsonx.ai embedding models.

This example shows how to communicate with watsonx.ai embedding models using the LlamaIndex embeddings API.

Setup

Install the llama-index-embeddings-ibm package.

!pip install -qU llama-index-embeddings-ibm

The cell below defines the credentials required to work with watsonx Embeddings.

Action: Provide your IBM Cloud user API key. For details, see Managing user API keys.

import os
from getpass import getpass

watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key
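
When the notebook is re-run, prompting for the key every time can get tedious. The following is a minimal sketch, assuming you want to reuse a key that is already present in the environment. `ensure_env` is a hypothetical helper written for this example; it is not part of any IBM or LlamaIndex package.

```python
import os
from getpass import getpass


def ensure_env(name, prompt=getpass):
    """Return os.environ[name], prompting only if it is missing or empty.

    Hypothetical helper for illustration; `prompt` can be swapped out
    (for example in tests) for any callable that takes a message string.
    """
    value = os.environ.get(name)
    if not value:
        value = prompt(f"{name}: ")
        os.environ[name] = value
    return value


# Usage (commented out so the block does not prompt when pasted as-is):
# watsonx_api_key = ensure_env("WATSONX_APIKEY")
```

Keeping the prompt function as a parameter makes the helper easy to exercise without an interactive terminal.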

Additionally, you can pass other secrets as environment variables.

import os

os.environ["WATSONX_URL"] = "your service instance url"
os.environ["WATSONX_TOKEN"] = "your token for accessing the CPD cluster"
os.environ["WATSONX_PASSWORD"] = "your password for accessing the CPD cluster"
os.environ["WATSONX_USERNAME"] = "your username for accessing the CPD cluster"
os.environ[
    "WATSONX_INSTANCE_ID"
] = "your instance_id for accessing the CPD cluster"
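
Before initializing the model, it can help to see which of these variables are actually set, without printing their values. The following is a rough sketch; `missing_vars` is a hypothetical helper, and which combination of variables you need depends on your deployment (IBM Cloud or Cloud Pak for Data), so an entry in the output is not necessarily an error.

```python
import os

# Variable names taken from the cells above.
WATSONX_VARS = [
    "WATSONX_APIKEY",
    "WATSONX_URL",
    "WATSONX_TOKEN",
    "WATSONX_PASSWORD",
    "WATSONX_USERNAME",
    "WATSONX_INSTANCE_ID",
]


def missing_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Only names are printed, never the secret values themselves.
print("Not set:", missing_vars(WATSONX_VARS))
```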

Load the model

You may need to adjust the embedding parameters for different tasks.

truncate_input_tokens = 3

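Initialize the WatsonxEmbeddings class with the previously set parameters.

Note:

To provide context for the API call, you must pass a project_id or space_id. To get your project or space ID, open your project or space, go to the Manage tab, and click General. For more information, see: Project documentation or Deployment space documentation.
Depending on the region of your provisioned service instance, use one of the URLs listed in watsonx.ai API Authentication.

In this example, we use the project_id and the Dallas URL.

You need to specify the model_id that will be used for inferencing. You can find the list of all available models in Supported foundation models.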

from llama_index.embeddings.ibm import WatsonxEmbeddings

watsonx_embedding = WatsonxEmbeddings(
    model_id="ibm/slate-125m-english-rtrvr",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="PASTE YOUR PROJECT_ID HERE",
    truncate_input_tokens=truncate_input_tokens,
)
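
The example above uses a project_id; the note earlier says a space_id can be passed instead. The following is a minimal sketch of selecting one or the other before building the model. `scope_kwargs` is a hypothetical helper, and rejecting the case where both are given is a design choice made for this example, not a documented rule of the library.

```python
def scope_kwargs(project_id=None, space_id=None):
    """Return {'project_id': ...} or {'space_id': ...}.

    Hypothetical helper: it insists on exactly one of the two IDs so that
    the intended scope is unambiguous.
    """
    if (project_id is None) == (space_id is None):
        raise ValueError("Pass exactly one of project_id or space_id.")
    if project_id is not None:
        return {"project_id": project_id}
    return {"space_id": space_id}


# Usage with the class imported above (commented out; needs real credentials):
# watsonx_embedding = WatsonxEmbeddings(
#     model_id="ibm/slate-125m-english-rtrvr",
#     url="https://us-south.ml.cloud.ibm.com",
#     truncate_input_tokens=truncate_input_tokens,
#     **scope_kwargs(space_id="PASTE YOUR SPACE_ID HERE"),
# )
```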

Alternatively, you can use Cloud Pak for Data credentials. For details, see watsonx.ai software setup.

watsonx_embedding = WatsonxEmbeddings(
    model_id="ibm/slate-125m-english-rtrvr",
    url="PASTE YOUR URL HERE",
    username="PASTE YOUR USERNAME HERE",
    password="PASTE YOUR PASSWORD HERE",
    instance_id="openshift",
    version="4.8",
    project_id="PASTE YOUR PROJECT_ID HERE",
    truncate_input_tokens=truncate_input_tokens,
)

Usage

Embed a query

query = "Example query."

query_result = watsonx_embedding.get_query_embedding(query)
print(query_result[:5])

[-0.05538924, 0.05161056, 0.01207759, 0.0017501727, -0.017691258]
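
The result is a plain list of floats, so it can be inspected with ordinary Python. The following is a small sketch for checking a vector's length and Euclidean norm; `l2_norm` is a hypothetical helper, and the sample data is only the five values printed above, not a full embedding.

```python
import math


def l2_norm(vector):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vector))


# Stand-in data: the five values printed above, not a complete embedding.
sample = [-0.05538924, 0.05161056, 0.01207759, 0.0017501727, -0.017691258]

# With a real result, pass query_result instead of sample.
print(len(sample), l2_norm(sample))
```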

Embed a list of texts

texts = ["This is a content of one document", "This is another document"]

doc_result = watsonx_embedding.get_text_embedding_batch(texts)
print(doc_result[0][:5])

[0.009447167, -0.024981938, -0.02601326, -0.04048393, -0.05780444]
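
A common next step is to compare a query embedding against the document embeddings. The following is a minimal sketch using cosine similarity in pure Python. `cosine_similarity` and `rank_by_similarity` are hypothetical helpers written for this example, and the toy vectors stand in for real embeddings so that the block runs without a service connection.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors used as placeholders for query_result and doc_result.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [1.0, 0.1]]
print(rank_by_similarity(toy_query, toy_docs))
```

With real data, the same call would take query_result and doc_result from the cells above, and the returned indices refer to positions in texts.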