NVIDIA NIM¶
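The `llama-index-postprocessor-nvidia-rerank` package contains LlamaIndex integrations for building applications with models on NVIDIA NIM inference microservices. NIM supports models across chat, embedding, and re-ranking, from the community as well as from NVIDIA. These models are optimized by NVIDIA to deliver the best performance on NVIDIA-accelerated infrastructure and are deployed as NIMs: easy-to-use, prebuilt containers that deploy anywhere on NVIDIA-accelerated infrastructure with a single command.
NVIDIA-hosted deployments of NIMs are available to test on the NVIDIA API catalog. After testing, NIMs can be exported from NVIDIA's API catalog under the NVIDIA AI Enterprise license and run on-premises or in the cloud, giving enterprises ownership and full control of their IP and AI applications.
NIMs are packaged as container images on a per-model basis and distributed as NGC container images through the NVIDIA NGC Catalog. At their core, NIMs provide simple, consistent, and familiar APIs for running inference on an AI model.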
NVIDIA's Rerank connector¶
This example shows how to use LlamaIndex to interact with the supported NVIDIA Retrieval QA Ranking Model for retrieval-augmented generation through the `NVIDIARerank` class.
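Combining results from multiple sources¶
Consider a pipeline that draws data from a semantic store, such as a VectorStoreIndex, and from a BM25 store.
Each store is queried independently and returns the results it considers highly relevant. Because the two stores score results on different scales, their rankings cannot be compared directly. Working out the overall relevance of the combined results is where reranking comes in.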
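The merge-then-rerank step can be sketched offline as follows. This is a minimal illustration, not the connector's implementation: `Candidate`, `merge_candidates`, `overlap_score`, and `rerank` are hypothetical names introduced here, and the token-overlap score is only a stand-in for the ranking model so that the example runs without network access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    text: str
    source: str  # which store returned it, e.g. "vector" or "bm25"


def merge_candidates(*result_lists):
    """Union several result lists, keeping the first occurrence of each doc_id."""
    seen = {}
    for results in result_lists:
        for cand in results:
            seen.setdefault(cand.doc_id, cand)
    return list(seen.values())


def overlap_score(query, text):
    """Toy relevance score: fraction of query terms that appear in the text.

    Assumption: this stands in for the reranking model purely for illustration.
    """
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def rerank(query, candidates, top_n=4):
    """Score every candidate against the query on one scale and keep the best top_n."""
    ranked = sorted(
        candidates, key=lambda c: overlap_score(query, c.text), reverse=True
    )
    return ranked[:top_n]


vector_hits = [
    Candidate("a", "housing units completed in the mission district", "vector"),
    Candidate("b", "citywide zoning changes", "vector"),
]
bm25_hits = [
    Candidate("a", "housing units completed in the mission district", "bm25"),
    Candidate("c", "net gain in housing units by district", "bm25"),
]

merged = merge_candidates(vector_hits, bm25_hits)
top = rerank("net gain in housing units", merged, top_n=2)
print([c.doc_id for c in top])
```

The key design point is that every candidate, whichever store it came from, is scored against the query by the same function, so the final ordering no longer depends on the stores' incompatible native scores.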
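Following the Advanced - Hybrid Retriever + Re-Ranking use case, replace the reranker used there with `NVIDIARerank`, configured as shown in the sections below.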
Installation¶
%pip install --upgrade --quiet llama-index-postprocessor-nvidia-rerank llama-index-llms-nvidia llama-index-embeddings-nvidia llama-index-readers-file
import getpass
import os

# del os.environ['NVIDIA_API_KEY']  ## delete key and reset
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith(
        "nvapi-"
    ), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key
Working with the API Catalog¶
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.nvidia import NVIDIAEmbedding
from llama_index.llms.nvidia import NVIDIA
from llama_index.postprocessor.nvidia_rerank import NVIDIARerank

# Keep the 4 highest-ranked nodes after reranking
reranker = NVIDIARerank(top_n=4)
!mkdir -p data
!wget "https://www.dropbox.com/scl/fi/p33j9112y0ysgwg77fdjz/2021_Housing_Inventory.pdf?rlkey=yyok6bb18s5o31snjd2dxkxz3&dl=0" -O "data/housing_data.pdf"
# Split documents into chunks of roughly 500 tokens
Settings.text_splitter = SentenceSplitter(chunk_size=500)
documents = SimpleDirectoryReader("./data").load_data()

# Embed the chunks and build the vector index
Settings.embed_model = NVIDIAEmbedding(model="NV-Embed-QA", truncate="END")
index = VectorStoreIndex.from_documents(documents)

Settings.llm = NVIDIA()

# Retrieve 20 candidates, then let the reranker narrow them down
query_engine = index.as_query_engine(
    similarity_top_k=20, node_postprocessors=[reranker]
)
response = query_engine.query(
    "What was the net gain in housing units in the Mission in 2021?"
)
print(response)
The net gain in housing units in the Mission in 2021 was not specified in the provided context information.
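When the answer reports that the context did not contain the information, as above, one way to investigate is to look at which chunks survive reranking. The helper below is a sketch under stated assumptions: `show_reranked` is a hypothetical name, and it relies only on `index.as_retriever(...)`, `retriever.retrieve(...)`, and the postprocessor's `postprocess_nodes(nodes, query_str=...)` method from LlamaIndex.

```python
def show_reranked(index, reranker, query, similarity_top_k=20):
    """Retrieve candidates, rerank them, and print each node's score and a preview."""
    retriever = index.as_retriever(similarity_top_k=similarity_top_k)
    nodes = retriever.retrieve(query)
    # Apply the reranker directly, outside of a query engine
    reranked = reranker.postprocess_nodes(nodes, query_str=query)
    for rank, item in enumerate(reranked, start=1):
        preview = item.node.get_content()[:80].replace("\n", " ")
        print(f"{rank}. score={item.score} {preview}")
    return reranked
```

With the objects built above, it could be called as `show_reranked(index, reranker, "What was the net gain in housing units in the Mission in 2021?")`. If none of the printed chunks mention the Mission, the gap lies in retrieval or chunking, not in the LLM.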
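Working with NVIDIA NIMs¶
In addition to connecting to hosted NVIDIA NIMs, this connector can be used to connect to local microservice instances. This lets you keep your application local when necessary.
For instructions on how to set up local microservice instances, see https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/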
from llama_index.postprocessor.nvidia_rerank import NVIDIARerank

# connect to a rerank NIM running at localhost:1976
reranker = NVIDIARerank(base_url="http://localhost:1976/v1")
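To switch between the hosted endpoint and a local NIM without editing code, the `base_url` argument can be derived from the environment. This is a small sketch only: `rerank_connection_kwargs` and the `RERANK_NIM_URL` variable are hypothetical names chosen for illustration and are not part of the connector.

```python
import os


def rerank_connection_kwargs(env=None):
    """Return keyword arguments for NVIDIARerank based on the environment.

    RERANK_NIM_URL is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    url = env.get("RERANK_NIM_URL", "").strip()
    if not url:
        # No override set: fall back to the connector's default (hosted) endpoint
        return {}
    # Drop any trailing slash so the URL has a consistent form
    return {"base_url": url.rstrip("/")}


# Usage (requires llama-index-postprocessor-nvidia-rerank):
# reranker = NVIDIARerank(top_n=4, **rerank_connection_kwargs())
```

With `RERANK_NIM_URL=http://localhost:1976/v1` set, the reranker targets the local microservice. With the variable unset, it behaves like `NVIDIARerank(top_n=4)` and uses the hosted deployment, which needs `NVIDIA_API_KEY`.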