Long-Context Reorder

Node PostProcessor module.

LongContextReorder

Bases: BaseNodePostprocessor

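Models struggle to access important details located in the middle of an extended context. A study (https://arxiv.org/abs/2307.03172) observed that performance is typically best when the key information sits at the beginning or end of the input context. Moreover, performance drops notably as the input context grows, even for models designed for long contexts. This postprocessor therefore reorders the nodes by score so that the highest-scoring ones sit at the two ends of the list and the lowest-scoring ones fall in the middle.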

Source code in llama-index-core/llama_index/core/postprocessor/node.py

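# Imports this excerpt relies on.
from typing import List, Optional

from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle


class LongContextReorder(BaseNodePostprocessor):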
    """
    Models struggle to access significant details found
    in the center of extended contexts. A study
    (https://arxiv.org/abs/2307.03172) observed that the best
    performance typically arises when crucial data is positioned
    at the start or conclusion of the input context. Additionally,
    as the input context lengthens, performance drops notably, even
    in models designed for long contexts.
    """

    @classmethod
    def class_name(cls) -> str:
        return "LongContextReorder"

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
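        """Reorder nodes so the highest-scoring ones sit at both ends."""
        reordered_nodes: List[NodeWithScore] = []
        # Sort ascending by score; nodes without a score are treated as 0.
        ordered_nodes: List[NodeWithScore] = sorted(
            nodes, key=lambda x: x.score if x.score is not None else 0
        )
        # Alternate between prepending and appending: the lowest-scoring nodes
        # end up in the middle and the highest-scoring ones at the two ends.
        for i, node in enumerate(ordered_nodes):
            if i % 2 == 0:
                reordered_nodes.insert(0, node)
            else:
                reordered_nodes.append(node)
        return reordered_nodes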
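
To make the resulting order concrete, the following is a minimal standalone sketch of the same alternating insert/append logic. It does not use the library: `ScoredItem` and `reorder_for_long_context` are hypothetical names introduced only for this illustration, with plain `(label, score)` tuples standing in for `NodeWithScore`.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for NodeWithScore: a (label, score) pair.
ScoredItem = Tuple[str, Optional[float]]


def reorder_for_long_context(items: List[ScoredItem]) -> List[ScoredItem]:
    """Place the highest-scoring items at both ends, the lowest in the middle."""
    # Sort ascending by score; a missing score is treated as 0.
    ascending = sorted(
        items, key=lambda item: item[1] if item[1] is not None else 0
    )
    reordered: List[ScoredItem] = []
    for i, item in enumerate(ascending):
        if i % 2 == 0:
            # Even positions in the ascending order go to the front.
            reordered.insert(0, item)
        else:
            # Odd positions go to the back.
            reordered.append(item)
    return reordered


items = [("a", 0.1), ("b", 0.2), ("c", 0.3), ("d", 0.4), ("e", 0.5)]
result = reorder_for_long_context(items)
print([score for _, score in result])  # [0.5, 0.3, 0.1, 0.2, 0.4]
```

With five items, the two best scores (0.5 and 0.4) end up first and last, while the weakest (0.1) lands in the centre, which matches the motivation given in the class description.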