If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-embeddings-openvino
!pip install llama-index
Model Exporter¶
Models can be exported to the OpenVINO IR format with the create_and_save_openvino_model
function, then loaded from a local folder.
from llama_index.embeddings.huggingface_openvino import OpenVINOEmbedding
OpenVINOEmbedding.create_and_save_openvino_model(
"BAAI/bge-small-en-v1.5", "./bge_ov"
)
/home2/ethan/intel/llama_index/llama_test/lib/python3.10/site-packages/openvino/runtime/__init__.py:10: DeprecationWarning: The `openvino.runtime` module is deprecated and will be removed in the 2026.0 release. Please replace `openvino.runtime` with `openvino`. warnings.warn(
Saved OpenVINO model to ./bge_ov. Use it with `embed_model = OpenVINOEmbedding(model_id_or_path='./bge_ov')`.
Model Loading¶
If you have an Intel GPU, you can specify device="gpu"
to run inference on it.
ov_embed_model = OpenVINOEmbedding(model_id_or_path="./bge_ov", device="cpu")
embeddings = ov_embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384 [-0.0030246784444898367, -0.012189766392111778, 0.04163273051381111, -0.037758368998765945, 0.02439723163843155]
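The main use of vectors like the one above is similarity search: documents are ranked against a query by cosine similarity. A minimal sketch of that ranking step, using short synthetic vectors in place of the 384-dim outputs `ov_embed_model.get_text_embedding` would return:

```python
# Sketch: ranking documents against a query by cosine similarity.
# The 4-dim vectors below are synthetic stand-ins for real 384-dim
# embedding-model outputs.
import numpy as np


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_vecs, dtype=float)
    sims = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    return list(np.argsort(-sims))


query = [1.0, 0.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0, 0.0],  # close to the query
    [0.5, 0.5, 0.0, 0.0],  # in between
]
print(rank_by_similarity(query, docs))  # → [1, 2, 0]
```

In practice the query and document vectors would come from `get_text_embedding` calls on the loaded model; a vector store then performs this ranking for you.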
Loading Models with OpenVINO GenAI¶
To avoid a runtime dependency on PyTorch, you can load your local embedding model with the OpenVINOGENAIEmbedding
class.
%pip install llama-index-embeddings-openvino-genai
from llama_index.embeddings.openvino_genai import OpenVINOGENAIEmbedding
ov_embed_model = OpenVINOGENAIEmbedding(model_path="./bge_ov", device="CPU")
/home2/ethan/intel/llama_index/llama_test/lib/python3.10/site-packages/openvino/runtime/__init__.py:10: DeprecationWarning: The `openvino.runtime` module is deprecated and will be removed in the 2026.0 release. Please replace `openvino.runtime` with `openvino`. warnings.warn(
embeddings = ov_embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384 [-0.0030246784444898367, -0.012189766392111778, 0.04163273051381111, -0.037758368998765945, 0.02439723163843155]
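Since both backends load the same `./bge_ov` weights, their embeddings for the same input should agree numerically, as the identical outputs above suggest. One way to sanity-check this is an element-wise comparison; a sketch of such a check, shown here on short synthetic vectors since it only needs NumPy:

```python
# Sketch: checking that two embedding backends agree on the same input.
# In practice, a and b would be get_text_embedding(...) results from the
# OpenVINO and OpenVINO GenAI models; synthetic vectors are used here.
import numpy as np


def embeddings_match(a, b, tol=1e-5):
    """True if two embedding vectors are element-wise close."""
    return bool(np.allclose(np.asarray(a), np.asarray(b), atol=tol))


a = [-0.00302, -0.01219, 0.04163]
b = [-0.00302, -0.01219, 0.04163]
print(embeddings_match(a, b))  # → True
```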
OpenClip Model Exporter¶
The OpenVINOClipEmbedding
class supports exporting and loading open_clip models with the OpenVINO runtime.
%pip install open_clip_torch
from llama_index.embeddings.huggingface_openvino import (
OpenVINOClipEmbedding,
)
OpenVINOClipEmbedding.create_and_save_openvino_model(
"laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
"ViT-B-32-ov",
)
Multimodal Model Loading¶
If you have an Intel GPU, you can specify device="GPU"
to run inference on it.
ov_clip_model = OpenVINOClipEmbedding(
model_id_or_path="./ViT-B-32-ov", device="CPU"
)
Embed Images and Queries with OpenVINO¶
from PIL import Image
import requests
from numpy import dot
from numpy.linalg import norm
image_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcStMP8S3VbNCqOQd7QQQcbvC_FLa1HlftCiJw&s"
im = Image.open(requests.get(image_url, stream=True).raw)
print("Image:")
display(im)
im.save("logo.jpg")
image_embeddings = ov_clip_model.get_image_embedding("logo.jpg")
print("Image dim:", len(image_embeddings))
print("Image embed:", image_embeddings[:5])
text_embeddings = ov_clip_model.get_text_embedding(
"Logo of a pink blue llama on dark background"
)
print("Text dim:", len(text_embeddings))
print("Text embed:", text_embeddings[:5])
cos_sim = dot(image_embeddings, text_embeddings) / (
norm(image_embeddings) * norm(text_embeddings)
)
print("Cosine similarity:", cos_sim)
Image:
Image dim: 512 Image embed: [-0.03019799292087555, -0.09727513045072556, -0.6659489274024963, -0.025658488273620605, 0.05379948765039444] Text dim: 512 Text embed: [-0.15816599130630493, -0.25564345717430115, 0.22376027703285217, -0.34983670711517334, 0.31968361139297485] Cosine similarity: 0.27307014923203976
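The same cosine-similarity computation extends naturally to choosing the best of several candidate captions for an image. A sketch of that selection step, with short synthetic vectors standing in for the 512-dim CLIP embeddings `ov_clip_model` would return:

```python
# Sketch: choosing the caption whose embedding is closest to an image
# embedding. Synthetic 3-dim vectors stand in for real 512-dim CLIP
# embeddings; the caption texts in the comments are illustrative.
from numpy import dot
from numpy.linalg import norm


def best_caption(image_vec, caption_vecs):
    """Index of the caption with the highest cosine similarity."""
    sims = [
        dot(image_vec, c) / (norm(image_vec) * norm(c))
        for c in caption_vecs
    ]
    return max(range(len(sims)), key=sims.__getitem__)


image = [0.2, 0.9, 0.1]
captions = [
    [0.9, 0.1, 0.0],  # e.g. "a red car"
    [0.1, 0.8, 0.2],  # e.g. "a llama logo"
]
print(best_caption(image, captions))  # → 1
```

With real embeddings, each caption vector would come from `get_text_embedding` and the image vector from `get_image_embedding`, exactly as in the cell above.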