使用 Azure OpenAI GPT-4o mini 进行图像推理的多模态 LLM¶
在此 notebook 中,我们将展示如何使用 GPT-4o mini 以及 Azure OpenAI LLM 类/抽象进行图像理解/推理。有关更完整的示例,请访问 此 notebook。
In [ ]
已复制!
%pip install llama-index-llms-azure-openai
%pip install llama-index-llms-azure-openai
使用 GPT-4o mini 理解来自 URL / base64 的图像¶
In [ ]
已复制!
import os
os.environ["AZURE_OPENAI_API_KEY"] = "xxx"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://YOUR_URL.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2024-02-15-preview"
import os os.environ["AZURE_OPENAI_API_KEY"] = "xxx" os.environ["AZURE_OPENAI_ENDPOINT"] = "https://YOUR_URL.openai.azure.com/" os.environ["OPENAI_API_VERSION"] = "2024-02-15-preview"
初始化 AzureOpenAI
并从 URL 加载图像¶
与普通 OpenAI
不同,除了 model
参数外,您还需要传递 engine
参数。engine
是您在 Azure OpenAI Studio 中部署模型时为其指定的名称。
In [ ]
已复制!
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.llms.azure_openai import AzureOpenAI
In [ ]
已复制!
azure_openai_llm = AzureOpenAI(
engine="my-gpt-4o-mini",
model="gpt-4o-mini",
max_new_tokens=300,
)
azure_openai_llm = AzureOpenAI( engine="my-gpt-4o-mini", model="gpt-4o-mini", max_new_tokens=300, )
或者,您也可以跳过设置环境变量,直接通过构造函数传递参数。
In [ ]
已复制!
azure_openai_llm = AzureOpenAI(
azure_endpoint="https://YOUR_URL.openai.azure.com/",
engine="my-gpt-4o-mini",
api_version="2024-02-15-preview",
model="gpt-4o-mini",
max_new_tokens=300,
api_key="xxx",
supports_content_blocks=True,
)
azure_openai_llm = AzureOpenAI( azure_endpoint="https://YOUR_URL.openai.azure.com/", engine="my-gpt-4o-mini", api_version="2024-02-15-preview", model="gpt-4o-mini", max_new_tokens=300, api_key="xxx", supports_content_blocks=True, )
In [ ]
已复制!
import base64
import requests
from llama_index.core.schema import Document, MediaResource
image_url = "https://www.visualcapitalist.com/wp-content/uploads/2023/10/US_Mortgage_Rate_Surge-Sept-11-1.jpg"
response = requests.get(image_url)
if response.status_code != 200:
raise ValueError("Error: Could not retrieve image from URL.")
img_data = base64.b64encode(response.content)
image_document = Document(image_resource=MediaResource(data=img_data))
import base64 import requests from llama_index.core.schema import Document, MediaResource image_url = "https://www.visualcapitalist.com/wp-content/uploads/2023/10/US_Mortgage_Rate_Surge-Sept-11-1.jpg" response = requests.get(image_url) if response.status_code != 200: raise ValueError("Error: Could not retrieve image from URL.") img_data = base64.b64encode(response.content) image_document = Document(image_resource=MediaResource(data=img_data))
In [ ]
已复制!
from IPython.display import HTML
src = f'<img width=400 src="data:{image_document.image_resource.mimetype};base64,{image_document.image_resource.data.decode("utf-8")}"/>'
HTML(src)
from IPython.display import HTML src = f'
' HTML(src)
Out[ ]
使用图像完成提示¶
In [ ]
已复制!
from llama_index.core.llms import (
ChatMessage,
ImageBlock,
TextBlock,
MessageRole,
)
msg = ChatMessage(
role=MessageRole.USER,
blocks=[
TextBlock(text="Describe the images as an alternative text"),
ImageBlock(image=image_document.image_resource.data),
],
)
response = azure_openai_llm.chat(messages=[msg])
from llama_index.core.llms import ( ChatMessage, ImageBlock, TextBlock, MessageRole, ) msg = ChatMessage( role=MessageRole.USER, blocks=[ TextBlock(text="将图片描述为替代文本"), ImageBlock(image=image_document.image_resource.data), ], ) response = azure_openai_llm.chat(messages=[msg])
In [ ]
已复制!
print(response)
print(response)
assistant: The image presents a graph titled "The U.S. Mortgage Rate Surge," comparing the U.S. 30-year fixed-rate mortgage rates with existing home sales from 2014 to 2023. - The vertical axis on the left represents the mortgage rate, while the right vertical axis indicates the number of existing home sales, measured in millions. - A blue line illustrates the trend of existing home sales, showing fluctuations over the years, peaking around 2020 and declining thereafter. - A red line represents the mortgage rate, which has seen a significant increase, particularly in 2022 and 2023, reaching its highest level in over 20 years. - The background includes a subtle grid, and the data sources are noted at the bottom. The overall design is clean and informative, aimed at highlighting the relationship between rising mortgage rates and declining home sales.