IBM watsonx.ai
WatsonxLLM is a wrapper for IBM watsonx.ai foundation models.

These examples show how to use the LlamaIndex LLMs API to communicate with watsonx.ai models.
Setup

Install the llama-index-llms-ibm package.
!pip install -qU llama-index-llms-ibm
The cell below defines the credentials required to work with watsonx Foundation Model inferencing.

Action: Provide the IBM Cloud user API key. For details, see Managing user API keys.
import os
from getpass import getpass

# Prompt for the API key without echoing it, then expose it to the integration.
watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key
Additionally, you can pass other secrets as environment variables:
import os
os.environ["WATSONX_URL"] = "your service instance url"
os.environ["WATSONX_TOKEN"] = "your token for accessing the CPD cluster"
os.environ["WATSONX_PASSWORD"] = "your password for accessing the CPD cluster"
os.environ["WATSONX_USERNAME"] = "your username for accessing the CPD cluster"
os.environ[
"WATSONX_INSTANCE_ID"
] = "your instance_id for accessing the CPD cluster"
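These variables are read when the client is constructed. As a minimal sketch (an assumption about the integration's credential resolution, not something this notebook verifies), initializing WatsonxLLM later without explicit credential arguments could look like this:

from llama_index.llms.ibm import WatsonxLLM

# Assumption: with WATSONX_URL and WATSONX_APIKEY exported above, the url
# and apikey arguments can be omitted and are resolved from the environment.
watsonx_llm = WatsonxLLM(
    model_id="ibm/granite-13b-instruct-v2",
    project_id="PASTE YOUR PROJECT_ID HERE",
)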
Load the model

You might need to adjust model parameters for different models or tasks. For details, refer to Available MetaNames.
temperature = 0.5
max_new_tokens = 50
additional_params = {
"decoding_method": "sample",
"min_new_tokens": 1,
"top_k": 50,
"top_p": 1,
}
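If you prefer symbolic names over raw string keys, the underlying ibm-watsonx-ai SDK exposes them as MetaNames attributes. A sketch assuming ibm_watsonx_ai is importable (it is the SDK this integration wraps) and that these attribute names exist:

from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams

# Equivalent parameter dict built from the SDK's attribute names
# (each attribute holds the corresponding string key).
additional_params = {
    GenParams.DECODING_METHOD: "sample",
    GenParams.MIN_NEW_TOKENS: 1,
    GenParams.TOP_K: 50,
    GenParams.TOP_P: 1,
}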
Initialize the WatsonxLLM class with the previously set parameters.

Note:

- To provide context for the API call, you must pass a project_id or space_id. To get your project or space ID, open your project or space, go to the Manage tab, and click General. For more information, see Project documentation or Deployment space documentation.
- Depending on the region of your provisioned service instance, use one of the URLs listed in watsonx.ai API Authentication.

In this example, we'll use the project_id and the Dallas URL.

You need to specify the model_id that will be used for inferencing. You can find a list of all the available models in Supported foundation models.
from llama_index.llms.ibm import WatsonxLLM
watsonx_llm = WatsonxLLM(
model_id="ibm/granite-13b-instruct-v2",
url="https://us-south.ml.cloud.ibm.com",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Alternatively, you can use Cloud Pak for Data credentials. For details, see watsonx.ai software setup.
watsonx_llm = WatsonxLLM(
model_id="ibm/granite-13b-instruct-v2",
url="PASTE YOUR URL HERE",
username="PASTE YOUR USERNAME HERE",
password="PASTE YOUR PASSWORD HERE",
instance_id="openshift",
version="4.8",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Instead of model_id, you can also pass the deployment_id of a previously tuned model. The entire model tuning workflow is described in Working with TuneExperiment and PromptTuner.
watsonx_llm = WatsonxLLM(
deployment_id="PASTE YOUR DEPLOYMENT_ID HERE",
url="https://us-south.ml.cloud.ibm.com",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Create a Completion

Call the model directly with a string prompt:
response = watsonx_llm.complete("What is a Generative AI?")
print(response)
A generative AI is a computer program that can create new text, images, or other types of content. These programs are trained on large datasets of existing content, and they use that data to generate new content that is similar to the training data.
From the CompletionResponse, you can also retrieve the raw response returned by the service:
print(response.raw)
{'model_id': 'ibm/granite-13b-instruct-v2', 'created_at': '2024-05-20T07:11:57.984Z', 'results': [{'generated_text': 'A generative AI is a computer program that can create new text, images, or other types of content. These programs are trained on large datasets of existing content, and they use that data to generate new content that is similar to the training data.', 'generated_token_count': 50, 'input_token_count': 7, 'stop_reason': 'max_tokens', 'seed': 494448017}]}
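Because raw is a plain dict, individual fields can be pulled out directly; the key names below are taken from the sample payload printed above:

# Inspect token accounting and the stop reason from the raw payload.
result = response.raw["results"][0]
print(result["generated_token_count"], result["stop_reason"])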
You can also call the model with a prompt template:
from llama_index.core import PromptTemplate
template = "What is {object} and how does it work?"
prompt_template = PromptTemplate(template=template)
prompt = prompt_template.format(object="a loan")
response = watsonx_llm.complete(prompt)
print(response)
A loan is a sum of money that is borrowed to buy something, such as a house or a car. The borrower must repay the loan plus interest. The interest is a fee charged for using the money. The interest rate is the amount of
Calling chat with a list of messages

Create chat completions by providing a list of messages:
from llama_index.core.llms import ChatMessage
messages = [
ChatMessage(role="system", content="You are an AI assistant"),
ChatMessage(role="user", content="Who are you?"),
]
response = watsonx_llm.chat(
messages, max_new_tokens=20, decoding_method="greedy"
)
print(response)
assistant: I am an AI assistant.
Note that here we changed the max_new_tokens parameter to 20 and the decoding_method parameter to greedy.
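The same per-call overrides should also apply to complete(); a sketch assuming complete() forwards these keyword arguments to the service just as chat() did above:

# Assumption: complete() accepts the same per-call generation overrides
# (max_new_tokens, decoding_method) as chat().
response = watsonx_llm.complete(
    "What is a Generative AI?", max_new_tokens=20, decoding_method="greedy"
)
print(response)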
Streaming the model output

To stream the model's response:
for chunk in watsonx_llm.stream_complete(
"Describe your favorite city and why it is your favorite."
):
print(chunk.delta, end="")
I like New York because it is the city of dreams. You can achieve anything you want here.
Similarly, to stream chat completions, use the following code:
messages = [
ChatMessage(role="system", content="You are an AI assistant"),
ChatMessage(role="user", content="Who are you?"),
]
for chunk in watsonx_llm.stream_chat(messages):
print(chunk.delta, end="")
I am an AI assistant.
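The LlamaIndex LLM interface also defines async variants of these methods; a sketch assuming WatsonxLLM picks up acomplete() from the base class (the method name comes from the generic LlamaIndex interface, not from this notebook):

import asyncio

# Assumption: acomplete() is provided by the LlamaIndex LLM base class.
async def main():
    response = await watsonx_llm.acomplete("What is a Generative AI?")
    print(response)

asyncio.run(main())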