Interacting with LLMs deployed on Amazon SageMaker Endpoints using LlamaIndex¶
An Amazon SageMaker endpoint is a fully managed resource that enables the deployment of machine learning models, specifically LLMs (Large Language Models), for making predictions on new data.
This notebook demonstrates how to interact with LLM endpoints using SageMakerLLM, unlocking additional LlamaIndex features. It is therefore assumed that an LLM has already been deployed on a SageMaker endpoint.
Setup¶
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-sagemaker-endpoint
! pip install llama-index
You have to specify the endpoint name to interact with it.
ENDPOINT_NAME = "<-YOUR-ENDPOINT-NAME->"
You have to provide credentials to connect to the endpoint. You can either:

- Use an AWS profile by specifying the profile_name parameter; if not specified, the default credential profile will be used.
- Pass the credentials as parameters (aws_access_key_id, aws_secret_access_key, aws_session_token, region_name).

For more details, check this link.
With credentials:
from llama_index.llms.sagemaker_endpoint import SageMakerLLM
AWS_ACCESS_KEY_ID = "<-YOUR-AWS-ACCESS-KEY-ID->"
AWS_SECRET_ACCESS_KEY = "<-YOUR-AWS-SECRET-ACCESS-KEY->"
AWS_SESSION_TOKEN = "<-YOUR-AWS-SESSION-TOKEN->"
REGION_NAME = "<-YOUR-ENDPOINT-REGION-NAME->"
llm = SageMakerLLM(
endpoint_name=ENDPOINT_NAME,
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
aws_session_token=AWS_SESSION_TOKEN,
region_name=REGION_NAME,
)
With an AWS profile:
from llama_index.llms.sagemaker_endpoint import SageMakerLLM
ENDPOINT_NAME = "<-YOUR-ENDPOINT-NAME->"
PROFILE_NAME = "<-YOUR-PROFILE-NAME->"
llm = SageMakerLLM(
endpoint_name=ENDPOINT_NAME, profile_name=PROFILE_NAME
) # Omit the profile name to use the default profile
Basic Usage¶
Call complete with a prompt¶
resp = llm.complete(
"Paul Graham is ", formatted=True
) # formatted=True to avoid adding system prompt
print(resp)
66 years old (birthdate: September 4, 1951). He is a British-American computer scientist, programmer, and entrepreneur who is known for his work in the fields of artificial intelligence, machine learning, and computer vision. He is a professor emeritus at Stanford University and a researcher at the Stanford Artificial Intelligence Lab (SAIL). Graham has made significant contributions to the field of computer science, including the development of the concept of "n-grams," which are sequences of n items that occur together in a dataset. He has also worked on the development of machine learning algorithms and has written extensively on the topic of machine learning. Graham has received numerous awards for his work, including the Association for Computing Machinery (ACM) A.M. Turing Award, the IEEE Neural Networks Pioneer Award, and the IJCAI Award
Call chat with a list of messages¶
from llama_index.core.llms import ChatMessage
messages = [
ChatMessage(
role="system", content="You are a pirate with a colorful personality"
),
ChatMessage(role="user", content="What is your name"),
]
resp = llm.chat(messages)
print(resp)
assistant: Arrrr, shiver me timbers! *adjusts eye patch* Me name be Cap'n Blackbeak, the most feared and infamous pirate on the seven seas! *winks* *ahem* But enough about me, matey. What be bringin' ye to these fair waters? Are ye here to plunder some booty, or just to share a pint o' grog with a salty old sea dog like meself? *chuckles*
Streaming¶
Using stream_complete endpoint¶
resp = llm.stream_complete("Paul Graham is ", formatted=True)
for r in resp:
print(r.delta)
64 today. He’s a computer sci ist, entrepreneur, and writer, best known for his work in the fields of artificial intelligence, machine learning, and computer graphics. Graham was born in 1956 in Boston, Massachusetts. He earned his Bachelor’s degree in Computer Science from Harvard University in 1978 and his PhD in Computer Science from the University of California, Berkeley in 1982. Graham’s early work focused on the development of the first computer graphics systems that could generate photorealistic images. In the 1980s, he became interested in the field of artificial intelligence and machine learning, and he co-founded a number of companies to explore these areas, including Viaweb, which was one of the first commercial web hosting services. Graham is also a prolific writer and has published a number of influential essays on topics such as the nature
Using stream_chat endpoint¶
from llama_index.core.llms import ChatMessage
messages = [
ChatMessage(
role="system", content="You are a pirate with a colorful personality"
),
ChatMessage(role="user", content="What is your name"),
]
resp = llm.stream_chat(messages)
for r in resp:
print(r.delta, end="")
ARRGH! *adjusts eye patch* Me hearty? *winks* Me name be Captain Blackbeak, the most feared and infamous pirate to ever sail the seven seas! *chuckles* Or, at least, that's what me matey mates tell me. *winks* So, what be bringin' ye to these waters, matey? Are ye here to plunder some booty or just to hear me tales of the high seas? *grins* Either way, I be ready to share me treasure with ye! *winks* Just don't be tellin' any landlubbers about me hidden caches o' gold, or ye might be walkin' the plank, savvy? *winks*
Configure Model¶
SageMakerLLM is an abstraction for interacting with different LLMs deployed in Amazon SageMaker. All the default parameters are compatible with the Llama 2 model. Therefore, if you are using a different model, you will probably need to set the following parameters:

- messages_to_prompt: a callable that accepts a list of ChatMessage objects and, if not specified in the messages, a system prompt. It should return a string containing the messages in a format that the endpoint's LLM understands.
- completion_to_prompt: a callable that accepts a completion string together with a system prompt and returns a string in a format that the endpoint's LLM understands.
- content_handler: a class that inherits from llama_index.llms.sagemaker_llm_endpoint_utils.BaseIOHandler and implements the following methods: serialize_input, deserialize_output, deserialize_streaming_output, and remove_prefix.
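As an illustration, a custom messages_to_prompt callable for a Llama 2 style endpoint might look like the sketch below. This is not the library's built-in implementation; the Message dataclass is a hypothetical stand-in for llama_index.core.llms.ChatMessage so the example is self-contained, and the exact prompt template your endpoint expects may differ.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str

def messages_to_prompt(messages: List[Message], system_prompt: Optional[str] = None) -> str:
    """Render chat messages in the Llama 2 [INST] chat format."""
    # Prefer an explicit system message; fall back to the provided system prompt.
    system = next((m.content for m in messages if m.role == "system"), system_prompt)
    prompt = "<s>[INST] "
    if system:
        prompt += f"<<SYS>>\n{system}\n<</SYS>>\n\n"
    for m in messages:
        if m.role == "user":
            prompt += f"{m.content} [/INST]"
        elif m.role == "assistant":
            # Close the previous turn and open a new instruction block.
            prompt += f" {m.content} </s><s>[INST] "
    return prompt

print(messages_to_prompt([
    Message(role="system", content="You are a pirate with a colorful personality"),
    Message(role="user", content="What is your name"),
]))

A callable like this would be passed as SageMakerLLM(..., messages_to_prompt=messages_to_prompt) so that chat and stream_chat format requests correctly for the deployed model.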