使用 Nebius 的多模态模型¶

本笔记本演示了如何将 Nebius AI Studio 的多模态模型与 LlamaIndex 一起使用。Nebius AI Studio 实现了所有可用于商业用途的最新多模态模型。

首先，让我们安装 LlamaIndex 和 Nebius AI Studio 的依赖项。由于 AI Studio 使用与 OpenAI 兼容的 OpenAI，因此还需要在 Llama-index 中安装 OpenAI Multimodal 包。

In [ ]

已复制！

%pip install llama-index-multi-modal-llms-nebius llama-index matplotlib
%pip install llama-index-multi-modal-llms-nebius llama-index matplotlib

从下面的系统变量上传您的 Nebius AI Studio 密钥，或者直接插入。您可以在 Nebius AI Studio 免费注册，然后在API Keys 区域获取密钥。"

In [ ]

已复制！

import os

NEBIUS_API_KEY = os.getenv("NEBIUS_API_KEY")  # NEBIUS_API_KEY = ""
import os NEBIUS_API_KEY = os.getenv("NEBIUS_API_KEY") # NEBIUS_API_KEY = ""

使用 Qwen 理解来自 URL 的图片¶

初始化 `NebiusMultiModal` 并加载来自 URL 的图片¶

In [ ]

已复制！

from llama_index.multi_modal_llms.nebius import NebiusMultiModal

from llama_index.core.multi_modal_llms.generic_utils import load_image_urls

image_urls = [
    "https://townsquare.media/site/442/files/2018/06/wall-e-eve.jpg",
]

image_documents = load_image_urls(image_urls)

mm_llm = NebiusMultiModal(
    model="Qwen/Qwen2-VL-72B-Instruct",
    api_key=NEBIUS_API_KEY,
    max_new_tokens=300,
)
from llama_index.multi_modal_llms.nebius import NebiusMultiModal from llama_index.core.multi_modal_llms.generic_utils import load_image_urls image_urls = [ "https://townsquare.media/site/442/files/2018/06/wall-e-eve.jpg", ] image_documents = load_image_urls(image_urls) mm_llm = NebiusMultiModal( model="Qwen/Qwen2-VL-72B-Instruct", api_key=NEBIUS_API_KEY, max_new_tokens=300, )

In [ ]

已复制！





from PIL import Image
import requests
from io import BytesIO
import matplotlib.pyplot as plt

img_response = requests.get(image_urls[0])
print(image_urls[0])
img = Image.open(BytesIO(img_response.content))
plt.imshow(img)
from PIL import Image import requests from io import BytesIO import matplotlib.pyplot as plt img_response = requests.get(image_urls[0]) print(image_urls[0]) img = Image.open(BytesIO(img_response.content)) plt.imshow(img)

https://townsquare.media/site/442/files/2018/06/wall-e-eve.jpg

Out [ ]

<matplotlib.image.AxesImage at 0x7f39b80492a0>

No description has been provided for this image

使用一组图片完成提示¶

In [ ]

已复制！

complete_response = mm_llm.complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
complete_response = mm_llm.complete( prompt="Describe the images as an alternative text", image_documents=image_documents, )

In [ ]

已复制！

print(complete_response)
print(complete_response)

The image depicts two animated characters from a popular animated movie. On the left, there is a small, rusty, cube-shaped robot with large, expressive eyes and a pair of mechanical arms. This robot is standing on a metal platform. On the right, there is a larger, sleek, white robot with a dome-shaped head and a green logo on its chest. This robot is adorned with a string of colorful Christmas lights. The background appears to be a desolate, post-apocalyptic landscape with a reddish hue.

使用一组图片流式完成提示¶

In [ ]

已复制！

stream_complete_response = mm_llm.stream_complete(
    prompt="give me more context for this image",
    image_documents=image_documents,
)
stream_complete_response = mm_llm.stream_complete( prompt="give me more context for this image", image_documents=image_documents, )

In [ ]

已复制！

for r in stream_complete_response:
    print(r.delta, end="")
for r in stream_complete_response: print(r.delta, end="")

This image features two animated characters from the movie "WALL-E." The character on the left is WALL-E, a small, cube-shaped robot with large, expressive eyes and a yellow body. WALL-E is a waste-compacting robot designed to clean up Earth, which has become uninhabitable due to pollution and waste. The character on the right is EVE, a sleek, white robot with a dome-shaped head and a green plant symbol on her chest. EVE is an advanced probe sent to Earth to search for signs of life. The scene depicts WALL-E and EVE together, with EVE adorned with colorful string lights, suggesting a moment of connection or celebration between the two characters. The background shows a desolate, post-apocalyptic Earth, emphasizing the themes of environmental degradation and the importance of renewal and hope.

通过聊天消息列表进行对话¶

In [ ]

已复制！





from llama_index.multi_modal_llms.openai.utils import (
    generate_openai_multi_modal_chat_message,
)

chat_msg_1 = generate_openai_multi_modal_chat_message(
    prompt="Describe the image as an alternative text",
    role="user",
    image_documents=image_documents,
)

chat_msg_2 = generate_openai_multi_modal_chat_message(
    prompt='The image features two animated characters from the movie "WALL-E."',
    role="assistant",
)

chat_msg_3 = generate_openai_multi_modal_chat_message(
    prompt="can I know more?",
    role="user",
)

chat_messages = [chat_msg_1, chat_msg_2, chat_msg_3]
chat_response = mm_llm.chat(
    # prompt="Describe the images as an alternative text",
    messages=chat_messages,
)
from llama_index.multi_modal_llms.openai.utils import ( generate_openai_multi_modal_chat_message, ) chat_msg_1 = generate_openai_multi_modal_chat_message( prompt="Describe the image as an alternative text", role="user", image_documents=image_documents, ) chat_msg_2 = generate_openai_multi_modal_chat_message( prompt='The image features two animated characters from the movie "WALL-E."', role="assistant", ) chat_msg_3 = generate_openai_multi_modal_chat_message( prompt="can I know more?", role="user", ) chat_messages = [chat_msg_1, chat_msg_2, chat_msg_3] chat_response = mm_llm.chat( # prompt="Describe the images as an alternative text", messages=chat_messages, )

In [ ]

已复制！

for msg in chat_messages:
    print(msg.role, msg.content)
for msg in chat_messages: print(msg.role, msg.content)

MessageRole.USER Describe the image as an alternative text
MessageRole.ASSISTANT The image features two animated characters from the movie "WALL-E."
MessageRole.USER can I know more?

In [ ]

已复制！

print(chat_response)
print(chat_response)

assistant: Certainly! The image depicts a scene from the animated movie "WALL-E," produced by Pixar Animation Studios. The two main characters in the image are WALL-E and EVE. WALL-E is a small, cube-shaped robot with large, expressive eyes and a yellow body, while EVE is a sleek, white robot with a dome-shaped head and a green plant symbol on her chest. The scene shows WALL-E and EVE standing on a metal platform, with WALL-E holding a string of colorful lights. The background appears to be a desolate, post-apocalyptic environment, which is a central theme in the movie.

通过聊天消息列表进行流式聊天¶

In [ ]

已复制！

stream_chat_response = mm_llm.stream_chat(
    messages=chat_messages,
)
stream_chat_response = mm_llm.stream_chat( messages=chat_messages, )

In [ ]

已复制！

for r in stream_chat_response:
    print(r.delta, end="")
for r in stream_chat_response: print(r.delta, end="")

Certainly! The image depicts a scene from the animated movie "WALL-E," produced by Pixar Animation Studios. The two main characters in the image are WALL-E and EVE. WALL-E is a small, cube-shaped robot with large, expressive eyes and a yellow body, while EVE is a sleek, white robot with a dome-shaped head and a green plant symbol on her chest. The scene shows WALL-E and EVE standing on a metal platform, with WALL-E holding a string of colorful lights. The background is a desolate, post-apocalyptic landscape, typical of the setting in the movie.

异步完成¶

In [ ]

已复制！

response_acomplete = await mm_llm.acomplete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
response_acomplete = await mm_llm.acomplete( prompt="Describe the images as an alternative text", image_documents=image_documents, )

In [ ]

已复制！

print(response_acomplete)
print(response_acomplete)

The image depicts two animated characters from a popular animated movie. On the left, there is a small, rusty, cube-shaped robot with large, expressive eyes and a yellow body. This robot is standing on a metal platform. On the right, there is a larger, sleek, white robot with a dome-shaped head and a green logo on its chest. This robot is adorned with a string of colorful Christmas lights. The background features a desolate, post-apocalyptic landscape with tall, rusted structures and a warm, reddish hue.

异步流式完成¶

In [ ]

已复制！

response_astream_complete = await mm_llm.astream_complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
response_astream_complete = await mm_llm.astream_complete( prompt="Describe the images as an alternative text", image_documents=image_documents, )

In [ ]

已复制！

async for delta in response_astream_complete:
    print(delta.delta, end="")
async for delta in response_astream_complete: print(delta.delta, end="")

The image depicts two animated robots standing on a wooden bench. The robot on the left is a small, boxy, yellow machine with large, expressive eyes and a weathered appearance, suggesting it has been around for a long time. The robot on the right is larger, with a smooth, white exterior and a large, dark visor. This robot is adorned with a string of colorful Christmas lights wrapped around its body. The background appears to be a desolate, post-apocalyptic landscape with rusted metal structures and a reddish hue, indicating a harsh environment.

异步聊天¶

In [ ]

已复制！

achat_response = await mm_llm.achat(
    messages=chat_messages,
)
achat_response = await mm_llm.achat( messages=chat_messages, )

In [ ]

已复制！

print(achat_response)
print(achat_response)

assistant: Certainly! The image depicts a scene from the animated movie "WALL-E," produced by Pixar Animation Studios. The two main characters in the image are WALL-E and EVE. WALL-E is a small, rectangular robot with large, expressive eyes and a yellow body, while EVE is a sleek, white robot with a dome-shaped head and a green plant symbol on her chest. In this scene, WALL-E is standing on a metal structure, looking up at EVE, who is adorned with a string of colorful lights. The background suggests a post-apocalyptic setting with a desolate, rusty environment. The scene captures a moment of curiosity and connection between the two robots.

异步流式聊天¶

In [ ]

已复制！

astream_chat_response = await mm_llm.astream_chat(
    messages=chat_messages,
)
astream_chat_response = await mm_llm.astream_chat( messages=chat_messages, )

In [ ]

已复制！

async for delta in astream_chat_response:
    print(delta.delta, end="")
async for delta in astream_chat_response: print(delta.delta, end="")

Certainly! The image depicts a scene from the animated movie "WALL-E," produced by Pixar Animation Studios. The two characters in the image are WALL-E and EVE. WALL-E is a small, cube-shaped robot with large, expressive eyes and a yellow body, while EVE is a sleek, white robot with a dome-shaped head and a green plant symbol on her chest. The scene shows WALL-E and EVE standing on a metal platform, with WALL-E holding a string of colorful Christmas lights. The background appears to be a desolate, post-apocalyptic environment, consistent with the setting of the movie.

使用 Qwen 从本地文件理解图像¶

In [ ]

已复制！





from llama_index.core import SimpleDirectoryReader
from llama_index.multi_modal_llms.nebius import NebiusMultiModal

# put your local directory here
path_to_images = "/mnt/share/nebius/images"
image_documents = SimpleDirectoryReader(path_to_images).load_data()

mm_llm = NebiusMultiModal(
    model="Qwen/Qwen2-VL-72B-Instruct",
    api_key=NEBIUS_API_KEY,
    max_new_tokens=300,
)

response = mm_llm.complete(
    prompt="Describe the images as an alternative text",
    image_documents=image_documents,
)
from llama_index.core import SimpleDirectoryReader from llama_index.multi_modal_llms.nebius import NebiusMultiModal # put your local directory here path_to_images = "/mnt/share/nebius/images" image_documents = SimpleDirectoryReader(path_to_images).load_data() mm_llm = NebiusMultiModal( model="Qwen/Qwen2-VL-72B-Instruct", api_key=NEBIUS_API_KEY, max_new_tokens=300, ) response = mm_llm.complete( prompt="Describe the images as an alternative text", image_documents=image_documents, )

In [ ]

已复制！

from PIL import Image
import matplotlib.pyplot as plt

for image_name in os.listdir(path_to_images):
    img = Image.open(os.path.join(path_to_images, image_name))
    plt.imshow(img)
    plt.show()
from PIL import Image import matplotlib.pyplot as plt for image_name in os.listdir(path_to_images): img = Image.open(os.path.join(path_to_images, image_name)) plt.imshow(img) plt.show()

In [ ]

已复制！

print(response)
print(response)

**Image 1:**
A soccer player is holding a large silver trophy with red ribbons attached to it. The player is wearing a red jersey with the word "Carlsberg" on it and has a gold medal around his neck. He is raising his right fist in a celebratory gesture.

**Image 2:**
Two characters are standing in a grand, stone-walled room. The character on the left is wearing a black robe with a red and yellow tie, while the character on the right is dressed in a brown robe with a red tie and a patterned shirt underneath. The character on the right has his arm around the shoulder of the character on the left.

使用 Nebius 的多模态模型¶

使用 Qwen 理解来自 URL 的图片¶

初始化 NebiusMultiModal 并加载来自 URL 的图片¶

使用一组图片完成提示¶

使用一组图片流式完成提示¶

通过聊天消息列表进行对话¶

通过聊天消息列表进行流式聊天¶

异步完成¶

异步流式完成¶

异步聊天¶

异步流式聊天¶

使用 Qwen 从本地文件理解图像¶

初始化 `NebiusMultiModal` 并加载来自 URL 的图片¶