Neo4j Vector Store¶
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]
%pip install llama-index-vector-stores-neo4jvector
In [ ]
!pip install llama-index
In [ ]
import os
import openai

os.environ["OPENAI_API_KEY"] = "OPENAI_API_KEY"
openai.api_key = os.environ["OPENAI_API_KEY"]
Initiate Neo4j vector wrapper¶
In [ ]
from llama_index.vector_stores.neo4jvector import Neo4jVectorStore

username = "neo4j"
password = "pleaseletmein"
url = "bolt://localhost:7687"
embed_dim = 1536

neo4j_vector = Neo4jVectorStore(username, password, url, embed_dim)
Load documents, build the VectorStoreIndex¶
In [ ]
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from IPython.display import Markdown, display
Download Data
In [ ]
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
--2023-12-14 18:44:00--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’

data/paul_graham/pa 100%[===================>]  73,28K  --.-KB/s    in 0,03s

2023-12-14 18:44:00 (2,16 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]
In [ ]
# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
In [ ]
from llama_index.core import StorageContext
storage_context = StorageContext.from_defaults(vector_store=neo4j_vector)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
In [ ]
query_engine = index.as_query_engine()
response = query_engine.query("What happened at interleaf?")
display(Markdown(f"<b>{response}</b>"))
At Interleaf, they added a scripting language inspired by Emacs and made it a dialect of Lisp. They were looking for a Lisp hacker to write things in this scripting language. The author of the text worked at Interleaf and mentions that their Lisp was the thinnest icing on a giant C cake. The author also mentions that they didn't know C and didn't want to learn it, so they never understood most of the software at Interleaf. Additionally, the author admits to being a bad employee and spending a significant amount of time on a separate project called 'On Lisp'.
Hybrid search¶
Hybrid search combines keyword search with vector search. To use hybrid search, you need to set `hybrid_search` to `True`.
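To make the fusion idea concrete, here is a toy sketch in plain Python of one way keyword and vector result lists can be merged by normalized score. This is a hypothetical helper for illustration only, not Neo4j's actual scoring:

```python
# Toy sketch of hybrid scoring -- NOT Neo4j's actual implementation.
# vector_hits / keyword_hits map a document id to a raw relevance
# score from each retriever (higher is better).
def hybrid_merge(vector_hits, keyword_hits, alpha=0.5):
    def normalize(hits):
        # Scale each retriever's scores into [0, 1] so they are comparable.
        top = max(hits.values(), default=0.0)
        return {d: s / top for d, s in hits.items()} if top else {}

    v, k = normalize(vector_hits), normalize(keyword_hits)
    merged = {
        d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0)
        for d in set(v) | set(k)
    }
    # Highest combined score first.
    return sorted(merged.items(), key=lambda item: -item[1])
```

With `alpha=0.5` both retrievers contribute equally, so a document that ranks well in either list can surface in the merged results.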
In [ ]
neo4j_vector_hybrid = Neo4jVectorStore(
username, password, url, embed_dim, hybrid_search=True
)
In [ ]
storage_context = StorageContext.from_defaults(
vector_store=neo4j_vector_hybrid
)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
query_engine = index.as_query_engine()
response = query_engine.query("What happened at interleaf?")
display(Markdown(f"<b>{response}</b>"))
At Interleaf, they added a scripting language inspired by Emacs and made it a dialect of Lisp. They were looking for a Lisp hacker to write things in this scripting language. The author of the essay worked at Interleaf but didn't understand most of the software because he didn't know C and didn't want to learn it. He also mentions that their Lisp was the thinnest icing on a giant C cake. The author admits to being a bad employee and spent a significant amount of time working on his contract to publish On Lisp.
Load existing vector index¶
To connect to an existing vector index, you need to define the `index_name` and `text_node_property` parameters:

- index_name: name of the existing vector index (default is `vector`)
- text_node_property: name of the property containing the text value (default is `text`)
In [ ]
index_name = "existing_index"
text_node_property = "text"
existing_vector = Neo4jVectorStore(
username,
password,
url,
embed_dim,
index_name=index_name,
text_node_property=text_node_property,
)
loaded_index = VectorStoreIndex.from_vector_store(existing_vector)
Customize response¶
You can customize the information retrieved from the knowledge graph with the `retrieval_query` parameter.

The retrieval query must return the following four columns:

- text:str - the text of the returned document
- score:float - similarity score
- id:str - node id
- metadata: Dict - dictionary with additional metadata (must contain the `_node_type` and `_node_content` keys)
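For illustration, a single row yielded by such a query could look like the following plain-Python dictionary. The values are hypothetical; only the column names and the two required metadata keys matter:

```python
# Hypothetical example row -- the values are made up; only the
# structure a retrieval query must produce is being illustrated.
row = {
    "text": "Interleaf hired Tomaz",  # text of the returned document
    "score": 0.92,                    # similarity score
    "id": "node-42",                  # node id
    "metadata": {                     # additional metadata
        "author": "Tomaz",            # any custom keys are allowed...
        "_node_type": "TextNode",     # ...but these two keys are required
        "_node_content": "{}",
    },
}

# Sanity-check the required shape.
assert {"text", "score", "id", "metadata"} <= row.keys()
assert {"_node_type", "_node_content"} <= row["metadata"].keys()
```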
In [ ]
retrieval_query = (
"RETURN 'Interleaf hired Tomaz' AS text, score, node.id AS id, "
"{author: 'Tomaz', _node_type:node._node_type, _node_content:node._node_content} AS metadata"
)
neo4j_vector_retrieval = Neo4jVectorStore(
username, password, url, embed_dim, retrieval_query=retrieval_query
)
In [ ]
loaded_index = VectorStoreIndex.from_vector_store(
neo4j_vector_retrieval
).as_query_engine()
response = loaded_index.query("What happened at interleaf?")
display(Markdown(f"<b>{response}</b>"))
Interleaf hired Tomaz.