FlagEmbedding Reranker
Reranking can speed up an LLM query without sacrificing accuracy (and may in fact improve it). It does so by pruning irrelevant nodes from the context.
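The pruning idea can be illustrated with a toy sketch: score every retrieved candidate against the query, sort, and keep only the best `top_n`. This is not how `FlagEmbeddingReranker` computes relevance (it uses a trained reranker model); the word-overlap score below is a deliberately crude stand-in, and the names `ScoredNode`, `overlap_score`, and `rerank_top_n` are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredNode:
    text: str
    score: float


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text.

    Assumption: a stand-in for a real reranker model, for illustration only.
    """
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank_top_n(query: str, texts: list[str], top_n: int) -> list[ScoredNode]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [ScoredNode(t, overlap_score(query, t)) for t in texts]
    scored.sort(key=lambda node: node.score, reverse=True)
    return scored[:top_n]


candidates = [
    "I applied to three grad schools: MIT, Yale, and Harvard.",
    "We got 225 applications for the Summer Founders Program.",
    "Interleaf had added a scripting language.",
]
kept = rerank_top_n("which grad schools", candidates, top_n=1)
print([node.text for node in kept])
```

Only the surviving nodes would be passed on to the LLM, which is why a smaller `top_n` tends to shorten both the prompt and the response time.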
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install llama-index-postprocessor-flag-embedding-reranker
In [ ]
!pip install llama-index
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
In [ ]
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
Download Data
In [ ]
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]
import os
OPENAI_API_KEY = "sk-"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
In [ ]
# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
In [ ]
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)
In [ ]
# build index
index = VectorStoreIndex.from_documents(documents=documents)
In [ ]
from llama_index.postprocessor.flag_embedding_reranker import (
    FlagEmbeddingReranker,
)
rerank = FlagEmbeddingReranker(model="BAAI/bge-reranker-large", top_n=5)
First, we try with reranking. We time the query to see how long it takes to produce an answer from the retrieved context.
In [ ]
from time import time
In [ ]
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[rerank]
)
now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 5.37s
In [ ]
print(response)
The author applied to three grad schools: MIT, Yale, and Harvard. The reason for applying to these schools was because they were renowned for AI at the time and the author wanted to pursue a career in artificial intelligence.
In [ ]
print(response.get_formatted_sources(length=200))
> Source (Doc id: f7e7f522-40ae-416a-917e-a70e59979105): I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious...

> Source (Doc id: df6c6b73-b488-4506-9ab1-ae5e8d499d44): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...

> Source (Doc id: 8ee64ca0-3a8d-49d2-a41d-cbf1e10216fd): [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who'd already graduated, or were about to that spring. Already this S...

> Source (Doc id: e95b6077-628a-4422-baad-765638cb6978): It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now when I was tired of walking, all I had to do was raise my hand, and (unless it ...

> Source (Doc id: 6c54f961-c5ff-466e-861a-66f5c1c25e36): I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in t...
Next, we try without reranking.
In [ ]
query_engine = index.as_query_engine(similarity_top_k=10)
now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 10.35s
In [ ]
print(response)
The author applied to three grad schools: MIT, Yale, and Harvard. They chose these schools based on their strong reputations in the field of AI at the time. Additionally, Harvard was appealing because it was where Bill Woods, the inventor of the parser used in the author's SHRDLU clone, was located.
In [ ]
print(response.get_formatted_sources(length=200))
> Source (Doc id: f7e7f522-40ae-416a-917e-a70e59979105): I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious...

> Source (Doc id: 6c54f961-c5ff-466e-861a-66f5c1c25e36): I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in t...

> Source (Doc id: d258db84-0975-4de0-a19b-752f529d9e5a): What I Worked On February 2021 Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed...

> Source (Doc id: 04582ebe-239a-432a-9304-611676593c66): It's not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite its current lack of prestige, it's a sign both that there's something real t...

> Source (Doc id: 8ee64ca0-3a8d-49d2-a41d-cbf1e10216fd): [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who'd already graduated, or were about to that spring. Already this S...

> Source (Doc id: d46b4c41-05f8-4492-b978-0ce1863a0f00): Now that I could write essays again, I wrote a bunch about topics I'd had stacked up. I kept writing essays through 2020, but I also started to think about other things I could work on. How should ...

> Source (Doc id: df6c6b73-b488-4506-9ab1-ae5e8d499d44): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...

> Source (Doc id: d91c08cf-6f7d-4ac5-8cf0-d8bcba4e77ff): It was missing a lot of things you'd want in a programming language. So these had to be added, and when they were, they weren't defined using McCarthy's original axiomatic approach. That wouldn't h...

> Source (Doc id: e95b6077-628a-4422-baad-765638cb6978): It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now when I was tired of walking, all I had to do was raise my hand, and (unless it ...

> Source (Doc id: 027ba923-2307-4e28-8e6b-53be8e4db8ec): But Interleaf still had a few years to live yet. [5] Interleaf had done something pretty bold. Inspired by Emacs, they'd added a scripting language, and even made the scripting language a dialect ...
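As we can see, the query engine with reranking produced a more concise output in roughly half the time (5.37s versus 10.35s). While both responses are essentially correct, the query engine without reranking drew on a lot of irrelevant information (10 source nodes instead of 5), a phenomenon we can attribute to "pollution of the context window".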
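Each engine was timed only once here, and a single measurement can be skewed by network latency or model warm-up. A minimal sketch of a more repeatable measurement follows, assuming a hypothetical helper named `time_call` (it is not part of LlamaIndex); it reports the median over several runs and uses a stand-in workload so that it runs without an index or an API key.

```python
from statistics import median
from time import perf_counter


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn several times; return (last_result, median_elapsed_seconds).

    Hypothetical helper for illustration only.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    result = None
    for _ in range(repeats):
        start = perf_counter()
        result = fn(*args, **kwargs)
        timings.append(perf_counter() - start)
    return result, median(timings)


# Stand-in workload so the sketch runs on its own.
result, elapsed = time_call(sum, range(1000), repeats=5)
print(f"result={result}, median elapsed: {elapsed:.6f}s")
```

With the engines from this notebook, the call would look like `time_call(query_engine.query, "Which grad schools did the author apply for and why?")`. Note that every repeat issues a real LLM request, so keep `repeats` small.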