If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]:
%pip install llama-index-embeddings-huggingface
%pip install llama-index-embeddings-instructor
In [ ]:
!pip install llama-index
In [ ]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# loads BAAI/bge-small-en
# embed_model = HuggingFaceEmbedding()
# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384
[-0.030880315229296684, -0.11021008342504501, 0.3917851448059082, -0.35962796211242676, 0.22797748446464539]
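The base embedding class also exposes get_query_embedding alongside get_text_embedding, since queries may be formatted with a model-specific instruction before encoding. A minimal sketch (the query string is just an illustration):

# embed a query rather than a document; for BGE models the integration
# may prepend a retrieval instruction to the query text before encoding
query_embedding = embed_model.get_query_embedding("What is the capital of France?")
print(len(query_embedding))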
InstructorEmbedding¶
Instructor Embeddings are a class of embedding models specifically trained to augment their embeddings according to an instruction. By default, queries are given query_instruction="Represent the question for retrieving supporting documents: " and text is given text_instruction="Represent the document for retrieval: ".
They rely on the Instructor and SentenceTransformers (version 2.2.2) pip packages, which you can install with pip install InstructorEmbedding and pip install -U sentence-transformers==2.2.2.
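As notebook cells, those installs are:

%pip install InstructorEmbedding
%pip install -U sentence-transformers==2.2.2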
In [ ]:
from llama_index.embeddings.instructor import InstructorEmbedding
embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/InstructorEmbedding/instructor.py:7: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)
  from tqdm.autonotebook import trange
load INSTRUCTOR_Transformer
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
max_seq_length 512
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
768
[ 0.02155361 -0.06098218  0.01796207  0.05490903  0.01526906]
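Both instructions can be overridden when constructing the model. A minimal sketch, assuming the query_instruction and text_instruction keyword arguments mirror the defaults described above (the instruction strings here are purely illustrative):

# hypothetical domain-specific instructions; keyword names assumed to
# match the defaults documented above
embed_model = InstructorEmbedding(
    model_name="hkunlp/instructor-base",
    query_instruction="Represent the financial question for retrieving supporting documents: ",
    text_instruction="Represent the financial document for retrieval: ",
)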
OptimumEmbedding¶
Optimum is a HuggingFace library for exporting and running HuggingFace models in the ONNX format. You can install the dependencies with pip install transformers optimum[exporters].
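As a notebook cell (quoting the extra so the shell does not expand the square brackets):

%pip install transformers "optimum[exporters]"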
First, we need to create the ONNX model. ONNX models provide improved inference speeds, and can be used across platforms (i.e. in TransformersJS).
In [ ]:
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding
OptimumEmbedding.create_and_save_optimum_model(
"BAAI/bge-small-en-v1.5", "./bge_onnx"
)
/home/loganm/miniconda3/envs/llama-index/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.0.1+cu117
Overriding 1 configuration item(s)
    - use_cache -> False
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Saved optimum model to ./bge_onnx. Use it with `embed_model = OptimumEmbedding(folder_name='./bge_onnx')`.
In [ ]:
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
In [ ]:
embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])
384 [-0.10364960134029388, -0.20998482406139374, -0.01883639395236969, -0.5241696834564209, 0.0335749015212059]
Benchmarking¶
Let's try comparing the two using a classic large document: the IPCC climate report, chapter 3.
In [ ]:
!curl https://www.ipcc.ch/report/ar6/wg2/downloads/report/IPCC_AR6_WGII_Chapter03.pdf --output IPCC_AR6_WGII_Chapter03.pdf
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 20.7M  100 20.7M    0     0  16.5M      0  0:00:01  0:00:01 --:--:-- 16.5M
In [ ]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core import Settings
documents = SimpleDirectoryReader(
input_files=["IPCC_AR6_WGII_Chapter03.pdf"]
).load_data()
Base HuggingFace Embeddings¶
In [ ]:
import os
import openai
# needed to synthesize responses later
os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]
In [ ]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# loads BAAI/bge-small-en-v1.5
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
test_embeds = embed_model.get_text_embedding("Hello World!")
Settings.embed_model = embed_model
In [ ]:
%%timeit -r 1 -n 1
index = VectorStoreIndex.from_documents(documents, show_progress=True)
Parsing documents into nodes: 0%| | 0/172 [00:00<?, ?it/s]
Generating embeddings: 0%| | 0/428 [00:00<?, ?it/s]
1min 27s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
Optimum Embeddings¶
We can use the ONNX embeddings we created earlier.
In [ ]:
from llama_index.embeddings.huggingface_optimum import OptimumEmbedding
embed_model = OptimumEmbedding(folder_name="./bge_onnx")
test_embeds = embed_model.get_text_embedding("Hello World!")
Settings.embed_model = embed_model
In [ ]:
%%timeit -r 1 -n 1
index = VectorStoreIndex.from_documents(documents, show_progress=True)
Parsing documents into nodes: 0%| | 0/172 [00:00<?, ?it/s]
Generating embeddings: 0%| | 0/428 [00:00<?, ?it/s]
1min 9s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
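Note that %%timeit runs its cell body in a temporary scope, so index is not kept afterwards. To actually query, rebuild the index once and use it as usual; a minimal sketch (the question string is just an illustration):

# rebuild the index outside of %%timeit so it persists
index = VectorStoreIndex.from_documents(documents)

# retrieval uses the ONNX embeddings set on Settings.embed_model;
# response synthesis uses the OpenAI key set earlier
query_engine = index.as_query_engine()
print(query_engine.query("What does the report say about ocean warming?"))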