SentenceTransformerRerank
Reranking can speed up an LLM query without sacrificing accuracy (and, in fact, may well improve it). It does so by pruning away irrelevant nodes from the context.
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
In [ ]
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
In [ ]
!pip install llama-index
In [ ]
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
Download Data
In [ ]
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]
# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
In [ ]
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = HuggingFaceEmbedding(
model_name="BAAI/bge-small-en-v1.5"
)
In [ ]
# build index
index = VectorStoreIndex.from_documents(documents=documents)
In [ ]
from llama_index.core.postprocessor import SentenceTransformerRerank
rerank = SentenceTransformerRerank(
model="cross-encoder/ms-marco-MiniLM-L-2-v2", top_n=3
)
First, we try with reranking. We time the query to see how long it takes to process the output from the retrieved context.
In [ ]
from time import time
In [ ]
query_engine = index.as_query_engine(
similarity_top_k=10, node_postprocessors=[rerank]
)
now = time()
response = query_engine.query(
"Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 4.03s
In [ ]
print(response)
The author applied to three grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which the author had visited because a friend went there and it was also home to Bill Woods, who had invented the type of parser the author used in his SHRDLU clone. The author chose these schools because he wanted to learn about AI and Lisp, and these schools were known for their expertise in these areas.
In [ ]
print(response.get_formatted_sources(length=200))
> Source (Doc id: 08074ca9-1806-4e49-84de-102a97f1f220): been explored. But all I wanted was to get out of grad school, and my rapidly written dissertation sufficed, just barely. Meanwhile I was applying to art schools. I applied to two: RISD in the US,... > Source (Doc id: 737f4526-2752-45e8-a59a-e1e4528cc025): about money, because I could sense that Interleaf was on the way down. Freelance Lisp hacking work was very rare, and I didn't want to have to program in another language, which in those days would... > Source (Doc id: b8883569-44f9-454c-9f62-15e926d04b98): showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed o...
Next, we try without reranking.
In [ ]
query_engine = index.as_query_engine(similarity_top_k=10)
now = time()
response = query_engine.query(
"Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 28.13s
In [ ]
print(response)
The author applied to three grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which the author had visited because a friend went there and was also home to Bill Woods, who had invented the type of parser the author used in his SHRDLU clone. The author chose these schools because he was interested in Artificial Intelligence and wanted to pursue it further, and they were the most renowned for it at the time. He was also inspired by a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. Additionally, the author had dropped out of RISD, where he had been learning to paint, and was looking for a new challenge. He was drawn to the idea of pursuing AI, as it was a field that was rapidly growing and he wanted to be part of the cutting edge of technology. He was also inspired by the idea of creating something unique and innovative, as he had done with his SHRDLU clone, and wanted to continue to explore the possibilities of AI.
In [ ]
print(response.get_formatted_sources(length=200))
> Source (Doc id: 08074ca9-1806-4e49-84de-102a97f1f220): been explored. But all I wanted was to get out of grad school, and my rapidly written dissertation sufficed, just barely. Meanwhile I was applying to art schools. I applied to two: RISD in the US,... > Source (Doc id: 737f4526-2752-45e8-a59a-e1e4528cc025): about money, because I could sense that Interleaf was on the way down. Freelance Lisp hacking work was very rare, and I didn't want to have to program in another language, which in those days would... > Source (Doc id: b8883569-44f9-454c-9f62-15e926d04b98): showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed o... > Source (Doc id: 599f469b-9a92-4952-8753-a063c31a953b): I didn't know but would turn out to like a lot: a woman called Jessica Livingston. A couple days later I asked her out. Jessica was in charge of marketing at a Boston investment bank. This bank th... > Source (Doc id: c865f333-b731-4a8b-a99f-eec54eaa1e6b): Like McCarthy's original Lisp, it's a spec rather than an implementation, although like McCarthy's Lisp it's a spec expressed as code. Now that I could write essays again, I wrote a bunch about to... > Source (Doc id: 69c6b190-2d4e-4128-b9c4-4fd31af2df65): 1960 paper. But if so there's no reason to suppose that this is the limit of the language that might be known to them. Presumably aliens need numbers and errors and I/O too. So it seems likely the... > Source (Doc id: c9c95028-a49e-440e-a953-7aabe6b9996d): What I Worked On February 2021 Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed... > Source (Doc id: 7f0c11db-d6f0-41f9-95bc-1feab914f58f): that big, bureaucratic customers are a dangerous source of money, and that there's not much overlap between conventional office hours and the optimal time for hacking, or conventional offices and t... > Source (Doc id: c143a6c2-5f5d-49c5-bc1e-b9caa0ce4931): must tell readers things they don't already know, and some people dislike being told such things. [11] People put plenty of stuff on the internet in the 90s of course, but putting something online... > Source (Doc id: 6e281eec-6964-414b-be61-bcc509d95903): which I'd created years before using Viaweb but had never used for anything. In one day it got 30,000 page views. What on earth had happened? The referring urls showed that someone had posted it on...
As we can see, the query engine with reranking produced a much more concise output in much less time (4s vs. 28s). While both responses were essentially correct, the query engine without reranking included a lot of irrelevant information, a phenomenon we can attribute to "pollution of the context window".
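If you want to see that pruning directly, one option is to run the reranker outside of a query engine: retrieve the candidate nodes yourself and pass them through the postprocessor, then compare what survives. The following is a minimal sketch that reuses the `index` and `rerank` objects built above; the variable names (`query_str`, `retriever`, etc.) are just for illustration, and the exact scores will depend on your data and models.
In [ ]
from llama_index.core import QueryBundle

query_str = "Which grad schools did the author apply for and why?"

# retrieve the same 10 candidate nodes the query engine would see
retriever = index.as_retriever(similarity_top_k=10)
retrieved_nodes = retriever.retrieve(query_str)

# let the cross-encoder keep only the top_n=3 most relevant nodes
reranked_nodes = rerank.postprocess_nodes(
    retrieved_nodes, query_bundle=QueryBundle(query_str)
)

print(f"retrieved: {len(retrieved_nodes)} nodes, kept after rerank: {len(reranked_nodes)}")
for node_with_score in reranked_nodes:
    # the reranker replaces .score with the cross-encoder relevance score
    snippet = node_with_score.node.get_content()[:80].replace("\n", " ")
    print(round(node_with_score.score, 3), snippet)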