Base class: BaseNodePostprocessor
Models have difficulty accessing important details located in the middle of an extended context. A study (https://arxiv.org/abs/2307.03172) observed that the best performance typically arises when the crucial data is positioned at the beginning or end of the input context. Moreover, as the input context grows, performance degrades noticeably, even in models designed for long contexts.
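LongContextReorder mitigates this by reordering retrieved nodes so that the highest-scoring ones sit at the edges of the prompt and the lowest-scoring ones fall in the middle. A minimal usage sketch, assuming the default embedding/LLM settings are configured and using made-up document texts and query:

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.postprocessor import LongContextReorder

# Build a small index over illustrative documents.
index = VectorStoreIndex.from_documents(
    [Document(text="..."), Document(text="...")]
)

# Attach the postprocessor; retrieved nodes are reordered before
# being packed into the LLM prompt.
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[LongContextReorder()],
)
response = query_engine.query("...")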
Source code: llama-index-core/llama_index/core/postprocessor/node.py
from typing import List, Optional

from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle


class LongContextReorder(BaseNodePostprocessor):
    """
    Models struggle to access significant details found
    in the center of extended contexts. A study
    (https://arxiv.org/abs/2307.03172) observed that the best
    performance typically arises when crucial data is positioned
    at the start or conclusion of the input context. Additionally,
    as the input context lengthens, performance drops notably, even
    in models designed for long contexts.
    """

    @classmethod
    def class_name(cls) -> str:
        return "LongContextReorder"

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
        """Postprocess nodes."""
        reordered_nodes: List[NodeWithScore] = []
        # Sort ascending by score; missing scores are treated as 0.
        ordered_nodes: List[NodeWithScore] = sorted(
            nodes, key=lambda x: x.score if x.score is not None else 0
        )
        # Alternate between prepending and appending so the
        # highest-scoring nodes land at the beginning and end of the
        # list, leaving the lowest-scoring ones in the middle.
        for i, node in enumerate(ordered_nodes):
            if i % 2 == 0:
                reordered_nodes.insert(0, node)
            else:
                reordered_nodes.append(node)
        return reordered_nodes
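To see the effect of the alternating insert/append loop, here is a small standalone sketch that calls the postprocessor directly; the node texts and scores are made up for illustration:

from llama_index.core.postprocessor import LongContextReorder
from llama_index.core.schema import NodeWithScore, TextNode

# Five dummy nodes with descending relevance scores.
nodes = [
    NodeWithScore(node=TextNode(text=f"chunk {i}"), score=float(score))
    for i, score in enumerate([5, 4, 3, 2, 1])
]

reordered = LongContextReorder().postprocess_nodes(nodes)
print([n.score for n in reordered])
# Prints [5.0, 3.0, 1.0, 2.0, 4.0]: the highest-scoring nodes sit at
# the edges of the list, the lowest-scoring ones in the middle.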