Building a (Very Simple) Vector Store from Scratch¶
In this tutorial, we show you how to build a simple in-memory vector store that can store documents along with metadata. It will also expose a query interface that can support a variety of queries:
- semantic search (via embedding similarity)
- metadata filtering
NOTE: Obviously this is not meant to replace any actual vector store (e.g. Pinecone, Weaviate, Chroma, Qdrant, Milvus, or any of the others in our broad set of vector store integrations). It is meant to teach some key retrieval concepts, like top-k embedding search + metadata filtering.
We won't cover advanced query/retrieval concepts such as approximate nearest neighbors or sparse/hybrid search, nor any of the systems concepts needed to build an actual database.
Setup¶
We load in some documents and parse them into Node objects — chunks that are ready to be inserted into a vector store.
Load Documents¶
%pip install llama-index-readers-file pymupdf
%pip install llama-index-embeddings-openai
!mkdir data
!wget --user-agent "Mozilla" "https://arxiv.org/pdf/2307.09288.pdf" -O "data/llama2.pdf"
from pathlib import Path
from llama_index.readers.file import PyMuPDFReader
loader = PyMuPDFReader()
documents = loader.load(file_path="./data/llama2.pdf")
from llama_index.core.node_parser import SentenceSplitter
node_parser = SentenceSplitter(chunk_size=256)
nodes = node_parser.get_nodes_from_documents(documents)
from llama_index.embeddings.openai import OpenAIEmbedding
embed_model = OpenAIEmbedding()
for node in nodes:
node_embedding = embed_model.get_text_embedding(
node.get_content(metadata_mode="all")
)
node.embedding = node_embedding
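As a quick sanity check (not part of the original notebook), you can inspect one node to confirm that the chunking, the metadata-augmented text, and the embedding were all populated:
# Sanity check (not in the original): number of chunks, a preview of one chunk, and its embedding size
print(len(nodes))
print(nodes[0].get_content(metadata_mode="all")[:200])
print(len(nodes[0].embedding))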
Now we'll build our in-memory vector store. We'll store the nodes in a simple Python dictionary. We'll start by implementing embedding search, and then add metadata filtering.
1. Defining the Interface¶
We'll first define the interface for building a vector store. It contains the following items:
get
add
delete
query
persist (which we will not implement for now)
from llama_index.core.vector_stores.types import BasePydanticVectorStore
from llama_index.core.vector_stores import (
VectorStoreQuery,
VectorStoreQueryResult,
)
from typing import List, Any, Optional, Dict
from llama_index.core.schema import TextNode, BaseNode
import os
class BaseVectorStore(BasePydanticVectorStore):
"""Simple custom Vector Store.
Stores documents in a simple in-memory dict.
"""
stores_text: bool = True
def get(self, text_id: str) -> List[float]:
"""Get embedding."""
pass
def add(
self,
nodes: List[BaseNode],
) -> List[str]:
"""Add nodes to index."""
pass
def delete(self, ref_doc_id: str, **delete_kwargs: Any) -> None:
"""
Delete nodes using with ref_doc_id.
Args:
ref_doc_id (str): The doc_id of the document to delete.
"""
pass
def query(
self,
query: VectorStoreQuery,
**kwargs: Any,
) -> VectorStoreQueryResult:
"""Get nodes for response."""
pass
def persist(self, persist_path, fs=None) -> None:
"""Persist the SimpleVectorStore to a directory.
NOTE: we are not implementing this for now.
"""
pass
At a high level, we subclass our base VectorStore abstraction. There's no inherent reason to do this if you're just building a vector store from scratch; we do it because it makes it easy to plug into our downstream abstractions later.
Let's look at some of the classes defined here.
- BaseNode is simply the parent class of our core Node modules. Each Node represents a text chunk + associated metadata.
- We also use some lower-level constructs, for instance our VectorStoreQuery and VectorStoreQueryResult. These are just lightweight dataclass containers to represent queries and results. We look at the dataclass fields below.
from dataclasses import fields
{f.name: f.type for f in fields(VectorStoreQuery)}
{'query_embedding': typing.Optional[typing.List[float]], 'similarity_top_k': int, 'doc_ids': typing.Optional[typing.List[str]], 'node_ids': typing.Optional[typing.List[str]], 'query_str': typing.Optional[str], 'output_fields': typing.Optional[typing.List[str]], 'embedding_field': typing.Optional[str], 'mode': <enum 'VectorStoreQueryMode'>, 'alpha': typing.Optional[float], 'filters': typing.Optional[llama_index.vector_stores.types.MetadataFilters], 'mmr_threshold': typing.Optional[float], 'sparse_top_k': typing.Optional[int]}
{f.name: f.type for f in fields(VectorStoreQueryResult)}
{'nodes': typing.Optional[typing.Sequence[llama_index.schema.BaseNode]], 'similarities': typing.Optional[typing.List[float]], 'ids': typing.Optional[typing.List[str]]}
2. Defining add, get, and delete¶
We add some basic capabilities for adding data to, getting data from, and deleting data from the vector store.
The implementation is very simple (everything is stored in a Python dictionary).
from llama_index.core.bridge.pydantic import Field
class VectorStore2(BaseVectorStore):
"""VectorStore2 (add/get/delete implemented)."""
stores_text: bool = True
node_dict: Dict[str, BaseNode] = Field(default_factory=dict)
def get(self, text_id: str) -> List[float]:
"""Get embedding."""
return self.node_dict[text_id]
def add(
self,
nodes: List[BaseNode],
) -> List[str]:
"""Add nodes to index."""
for node in nodes:
self.node_dict[node.node_id] = node
def delete(self, node_id: str, **delete_kwargs: Any) -> None:
"""
Delete nodes using with node_id.
Args:
node_id: str
"""
del self.node_dict[node_id]
test_node = TextNode(id_="id1", text="hello world") test_node2 = TextNode(id_="id2", text="foo bar") test_nodes = [test_node, test_node2] vector_store = VectorStore2() vector_store.add(test_nodes)
test_node = TextNode(id_="id1", text="hello world")
test_node2 = TextNode(id_="id2", text="foo bar")
test_nodes = [test_node, test_node2]
vector_store = VectorStore2()
vector_store.add(test_nodes)
node = vector_store.get("id1")
print(str(node))
Node ID: id1
Text: hello world
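For completeness (not in the original notebook), delete can be exercised the same way; after deleting a node id, it is gone from the store's node_dict:
# Delete one of the test nodes and confirm it is no longer stored
vector_store.delete("id2")
print("id2" in vector_store.node_dict)  # False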
3.a. Defining semantic/dense search¶
We implement a basic version of top-k semantic search: compute the cosine similarity between the query embedding and every stored node embedding, and return the top-k nodes by similarity.
NOTE: The top-k value is contained in the VectorStoreQuery container.
NOTE: Similar to the above, we define another subclass simply to avoid re-implementing the functions above (not because this is good code practice).
from typing import Tuple
import numpy as np
def get_top_k_embeddings(
query_embedding: List[float],
doc_embeddings: List[List[float]],
doc_ids: List[str],
similarity_top_k: int = 5,
) -> Tuple[List[float], List]:
"""Get top nodes by similarity to the query."""
# dimensions: D
qembed_np = np.array(query_embedding)
# dimensions: N x D
dembed_np = np.array(doc_embeddings)
# dimensions: N
dproduct_arr = np.dot(dembed_np, qembed_np)
# dimensions: N
norm_arr = np.linalg.norm(qembed_np) * np.linalg.norm(
dembed_np, axis=1, keepdims=False
)
# dimensions: N
cos_sim_arr = dproduct_arr / norm_arr
# now we have the N cosine similarities for each document
# sort by top k cosine similarity, and return ids
tups = [(cos_sim_arr[i], doc_ids[i]) for i in range(len(doc_ids))]
sorted_tups = sorted(tups, key=lambda t: t[0], reverse=True)
sorted_tups = sorted_tups[:similarity_top_k]
result_similarities = [s for s, _ in sorted_tups]
result_ids = [n for _, n in sorted_tups]
return result_similarities, result_ids
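To make the ranking concrete, here is a tiny check (not from the original notebook) that runs get_top_k_embeddings on made-up 2-D embeddings:
# Toy example: "a" matches the query exactly, "c" points in a similar direction, "b" is orthogonal
sims, ids = get_top_k_embeddings(
    query_embedding=[1.0, 0.0],
    doc_embeddings=[[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]],
    doc_ids=["a", "b", "c"],
    similarity_top_k=2,
)
print(ids, sims)  # ['a', 'c'] with cosine similarities of roughly [1.0, 0.707]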
from typing import cast
class VectorStore3A(VectorStore2):
"""Implements semantic/dense search."""
def query(
self,
query: VectorStoreQuery,
**kwargs: Any,
) -> VectorStoreQueryResult:
"""Get nodes for response."""
query_embedding = cast(List[float], query.query_embedding)
doc_embeddings = [n.embedding for n in self.node_dict.values()]
doc_ids = [n.node_id for n in self.node_dict.values()]
similarities, node_ids = get_top_k_embeddings(
query_embedding,
doc_embeddings,
doc_ids,
similarity_top_k=query.similarity_top_k,
)
result_nodes = [self.node_dict[node_id] for node_id in node_ids]
return VectorStoreQueryResult(
nodes=result_nodes, similarities=similarities, ids=node_ids
)
3.b. Supporting Metadata Filtering¶
The next extension is adding metadata filter support. This means we will first filter the candidate set down to the documents that pass the metadata filters, and then perform the semantic query.
Before running the semantic search, we use filter_nodes as a first pass to filter the nodes.
from llama_index.core.vector_stores import MetadataFilters
from llama_index.core.schema import BaseNode
from typing import cast
def filter_nodes(nodes: List[BaseNode], filters: MetadataFilters):
filtered_nodes = []
for node in nodes:
matches = True
for f in filters.filters:
if f.key not in node.metadata:
matches = False
continue
if f.value != node.metadata[f.key]:
matches = False
continue
if matches:
filtered_nodes.append(node)
return filtered_nodes
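As a quick illustration (not part of the original notebook; node_a and node_b are made-up nodes), an exact-match filter built with MetadataFilters.from_dict keeps only the nodes whose metadata matches:
# Two toy nodes with a "source" metadata field, mirroring the llama2.pdf chunks
node_a = TextNode(id_="a", text="chunk from page 23", metadata={"source": "23"})
node_b = TextNode(id_="b", text="chunk from page 24", metadata={"source": "24"})
filters = MetadataFilters.from_dict({"source": "24"})
print([n.node_id for n in filter_nodes([node_a, node_b], filters)])  # ['b']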
def dense_search(query: VectorStoreQuery, nodes: List[BaseNode]):
"""Dense search."""
query_embedding = cast(List[float], query.query_embedding)
doc_embeddings = [n.embedding for n in nodes]
doc_ids = [n.node_id for n in nodes]
return get_top_k_embeddings(
query_embedding,
doc_embeddings,
doc_ids,
similarity_top_k=query.similarity_top_k,
)
class VectorStore3B(VectorStore2):
"""Implements Metadata Filtering."""
def query(
self,
query: VectorStoreQuery,
**kwargs: Any,
) -> VectorStoreQueryResult:
"""Get nodes for response."""
# 1. First filter by metadata
nodes = self.node_dict.values()
if query.filters is not None:
nodes = filter_nodes(nodes, query.filters)
if len(nodes) == 0:
result_nodes = []
similarities = []
node_ids = []
else:
# 2. Then perform semantic search
similarities, node_ids = dense_search(query, nodes)
result_nodes = [self.node_dict[node_id] for node_id in node_ids]
return VectorStoreQueryResult(
nodes=result_nodes, similarities=similarities, ids=node_ids
)
query_str = "Can you tell me about the key concepts for safety finetuning" query_embedding = embed_model.get_query_embedding(query_str)
使用密集搜索查询向量存储。¶
vector_store = VectorStore3B()
# load data into the vector stores
vector_store.add(nodes)
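A quick way to confirm the load (not in the original notebook) is to check the size of the store's underlying dict:
# Everything lives in node_dict, so its length should equal the number of inserted nodes
print(len(vector_store.node_dict), len(nodes))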
Query the vector store with dense search.¶
query_str = "Can you tell me about the key concepts for safety finetuning"
query_embedding = embed_model.get_query_embedding(query_str)
query_obj = VectorStoreQuery(
query_embedding=query_embedding, similarity_top_k=2
)
query_result = vector_store.query(query_obj)
for similarity, node in zip(query_result.similarities, query_result.nodes):
print(
"\n----------------\n"
f"[Node ID {node.node_id}] Similarity: {similarity}\n\n"
f"{node.get_content(metadata_mode='all')}"
"\n----------------\n\n"
)
----------------
[Node ID 3f74fdf4-0e2e-473e-9b07-10c51eb62794] Similarity: 0.835677131511819

total_pages: 77
file_path: ./data/llama2.pdf
source: 23

Specifically, we use the following techniques in safety fine-tuning: 1. Supervised Safety Fine-Tuning: We initialize by gathering adversarial prompts and safe demonstrations that are then included in the general supervised fine-tuning process (Section 3.1). This teaches the model to align with our safety guidelines even before RLHF, and thus lays the foundation for high-quality human preference data annotation. 2. Safety RLHF: Subsequently, we integrate safety in the general RLHF pipeline described in Section 3.2.2. This includes training a safety-specific reward model and gathering more challenging adversarial prompts for rejection sampling style fine-tuning and PPO optimization. 3. Safety Context Distillation: Finally, we refine our RLHF pipeline with context distillation (Askell et al., 2021b).
----------------

----------------
[Node ID 5ad5efb3-8442-4e8a-b35a-cc3a10551dc9] Similarity: 0.827877930608312

total_pages: 77
file_path: ./data/llama2.pdf
source: 23

Benchmarks give a summary view of model capabilities and behaviors that allow us to understand general patterns in the model, but they do not provide a fully comprehensive view of the impact the model may have on people or real-world outcomes; that would require study of end-to-end product deployments. Further testing and mitigation should be done to understand bias and other social issues for the specific context in which a system may be deployed. For this, it may be necessary to test beyond the groups available in the BOLD dataset (race, religion, and gender). As LLMs are integrated and deployed, we look forward to continuing research that will amplify their potential for positive impact on these important social issues. 4.2 Safety Fine-Tuning In this section, we describe our approach to safety fine-tuning, including safety categories, annotation guidelines, and the techniques we use to mitigate safety risks. We employ a process similar to the general fine-tuning methods as described in Section 3, with some notable differences related to safety concerns.
----------------
Query the vector store with dense search + Metadata Filters¶
# filters = MetadataFilters(
# filters=[
# ExactMatchFilter(key="page", value=3)
# ]
# )
filters = MetadataFilters.from_dict({"source": "24"})
query_obj = VectorStoreQuery(
query_embedding=query_embedding, similarity_top_k=2, filters=filters
)
query_result = vector_store.query(query_obj)
for similarity, node in zip(query_result.similarities, query_result.nodes):
print(
"\n----------------\n"
f"[Node ID {node.node_id}] Similarity: {similarity}\n\n"
f"{node.get_content(metadata_mode='all')}"
"\n----------------\n\n"
)
----------------
[Node ID efe54bc0-4f9f-49ad-9dd5-900395a092fa] Similarity: 0.8190195580569283

total_pages: 77
file_path: ./data/llama2.pdf
source: 24

4.2.2 Safety Supervised Fine-Tuning In accordance with the established guidelines from Section 4.2.1, we gather prompts and demonstrations of safe model responses from trained annotators, and use the data for supervised fine-tuning in the same manner as described in Section 3.1. An example can be found in Table 5. The annotators are instructed to initially come up with prompts that they think could potentially induce the model to exhibit unsafe behavior, i.e., perform red teaming, as defined by the guidelines. Subsequently, annotators are tasked with crafting a safe and helpful response that the model should produce. 4.2.3 Safety RLHF We observe early in the development of Llama 2-Chat that it is able to generalize from the safe demonstrations in supervised fine-tuning. The model quickly learns to write detailed safe responses, address safety concerns, explain why the topic might be sensitive, and provide additional helpful information.
----------------

----------------
[Node ID 619c884b-cdbc-44b2-aec0-2692b44740ee] Similarity: 0.8010811332867503

total_pages: 77
file_path: ./data/llama2.pdf
source: 24

In particular, when the model outputs safe responses, they are often more detailed than what the average annotator writes. Therefore, after gathering only a few thousand supervised demonstrations, we switched entirely to RLHF to teach the model how to write more nuanced responses. Comprehensive tuning with RLHF has the added benefit that it may make the model more robust to jailbreak attempts (Bai et al., 2022a). We conduct RLHF by first collecting human preference data for safety similar to Section 3.2.2: annotators write a prompt that they believe can elicit unsafe behavior, and then compare multiple model responses to the prompts, selecting the response that is safest according to a set of guidelines. We then use the human preference data to train a safety reward model (see Section 3.2.2), and also reuse the adversarial prompts to sample from the model during the RLHF stage. Better Long-Tail Safety Robustness without Hurting Helpfulness Safety is inherently a long-tail problem, where the challenge comes from a small number of very specific cases.
----------------
Build a RAG System with the Vector Store¶
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine()
query_str = "Can you tell me about the key concepts for safety finetuning"
response = query_engine.query(query_str)
print(str(response))
The key concepts for safety fine-tuning include supervised safety fine-tuning, safety RLHF (Reinforcement Learning from Human Feedback), and safety context distillation. Supervised safety fine-tuning involves gathering adversarial prompts and safe demonstrations to align the model with safety guidelines before RLHF. Safety RLHF integrates safety into the RLHF pipeline by training a safety-specific reward model and gathering more challenging adversarial prompts for fine-tuning and optimization. Finally, safety context distillation is used to refine the RLHF pipeline. These techniques aim to mitigate safety risks and ensure that the model aligns with safety guidelines.
Conclusion¶
That's it! We built a simple in-memory vector store that supports very simple inserts, gets, and deletes, along with dense search and metadata filtering. It can then be plugged into the rest of LlamaIndex's abstractions.
It doesn't support sparse search yet, and it is obviously not meant for any real application. But this should expose some of what's going on under the hood!