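Neo4j Property Graph Index

Neo4j is a production-grade graph database that can store a property graph and perform vector search, filtering, and more.

The easiest way to get started is with a cloud-hosted instance on Neo4j Aura.

This notebook instead covers how to run the database locally with Docker.

If you already have an existing graph, skip to the end of this notebook.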

%pip install llama-index llama-index-graph-stores-neo4j

Docker Setup

To launch Neo4j locally, first make sure Docker is installed. Then start the database with the following docker command:

docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_apoc_import_file_use__neo4j__config=true \
    -e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
    neo4j:latest

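From here, you can open the database at http://localhost:7474/. On this page, you will be asked to sign in. Use the default username and password, which are both neo4j.

After your first login, you will be asked to change the password. Use the password you choose here as the `password` argument in the code that follows.

After that, you are ready to create your first property graph!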
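
Before moving on, it can help to confirm that the container is actually listening on the two ports published by the docker command. The following is a minimal sketch using only the Python standard library; the helper name `port_open` is hypothetical, and a successful TCP connection only shows that something is listening on the port, not that Neo4j has finished starting.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 7474 serves the browser UI over HTTP; 7687 is the Bolt port used by the graph store.
    for name, port in (("HTTP browser UI", 7474), ("Bolt", 7687)):
        status = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} port {port}: {status}")
```

If either port is reported as not reachable, check `docker ps` and the container logs before continuing.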

Env Setup

We need a small amount of environment setup before getting started.

import os

os.environ["OPENAI_API_KEY"] = "sk-proj-..."

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

import nest_asyncio

nest_asyncio.apply()

from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()


Index Construction

from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

# Note: used to be `Neo4jPGStore`
graph_store = Neo4jPropertyGraphStore(
    username="neo4j",
    password="llamaindex",
    url="bolt://localhost:7687",
)
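
The password above is whatever you set at first login, so hard-coding it makes the notebook harder to share. As a minimal sketch, the connection settings could instead be read from environment variables. The variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` and the helper `load_neo4j_settings` are assumptions made for this illustration, not something the library requires.

```python
import os
from dataclasses import asdict, dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Neo4jSettings:
    # Field names mirror the keyword arguments passed to the graph store above.
    url: str
    username: str
    password: str


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Neo4jSettings:
    """Build connection settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError(
            "Set NEO4J_PASSWORD to the password you chose at first login."
        )
    return Neo4jSettings(
        url=env.get("NEO4J_URI", "bolt://localhost:7687"),
        username=env.get("NEO4J_USERNAME", "neo4j"),
        password=password,
    )


# Usage, assuming the import from the cell above:
# graph_store = Neo4jPropertyGraphStore(**asdict(load_neo4j_settings()))
```

Failing early on a missing password gives a clearer error than an authentication failure from the database later on.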

from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    kg_extractors=[
        SchemaLLMPathExtractor(
            llm=OpenAI(model="gpt-3.5-turbo", temperature=0.0)
        )
    ],
    property_graph_store=graph_store,
    show_progress=True,
)

Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 21.63it/s]
Extracting paths from text with schema: 100%|██████████| 22/22 [01:06<00:00,  3.02s/it]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.06it/s]
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.89it/s]

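Now that the graph is created, we can explore it in the UI by visiting http://localhost:7474/.

The easiest way to see the entire graph is to run a cypher command such as "match n=() return n" in the query bar at the top.

To delete an entire graph, a useful command is "match n=() detach delete n".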

Querying and Retrieval

retriever = index.as_retriever(
    include_text=False,  # include source text in returned nodes, default True
)

nodes = retriever.retrieve("What happened at Interleaf and Viaweb?")

for node in nodes:
    print(node.text)

Interleaf -> Got crushed by -> Moore's law
Interleaf -> Made -> Scripting language
Interleaf -> Had -> Smart people
Interleaf -> Inspired by -> Emacs
Interleaf -> Had -> Few years to live
Interleaf -> Made -> Software
Interleaf -> Had done -> Something bold
Interleaf -> Added -> Scripting language
Interleaf -> Built -> Impressive technology
Interleaf -> Was -> Company
Viaweb -> Was -> Profitable
Viaweb -> Was -> Growing rapidly
Viaweb -> Suggested -> Hospital
Idea -> Was clear from -> Experience
Idea -> Would have to be embodied as -> Company
Painting department -> Seemed to be -> Rigorous
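
With `include_text=False`, each retrieved node's text is a single triplet in the form printed above. As a rough sketch, those strings can be split back into (subject, relation, object) tuples for further processing. The helpers `parse_triplet` and `group_by_subject` are hypothetical, and the ` -> ` separator is inferred from the output shown here rather than from a documented format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_triplet(line: str) -> Tuple[str, str, str]:
    """Split 'subject -> relation -> object' into its three parts."""
    # maxsplit=2 keeps any later ' -> ' inside the object part.
    parts = [part.strip() for part in line.split(" -> ", 2)]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Not a triplet line: {line!r}")
    return parts[0], parts[1], parts[2]


def group_by_subject(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Collect (relation, object) pairs under each subject, preserving input order."""
    grouped: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for line in lines:
        subject, relation, obj = parse_triplet(line)
        grouped[subject].append((relation, obj))
    return dict(grouped)


# Usage with the retriever output above:
# facts = group_by_subject(node.text for node in nodes)
# print(facts["Viaweb"])
```

This only applies when source text is excluded; with `include_text=True` the node text also contains the source passage, so the parser would raise `ValueError`.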

query_engine = index.as_query_engine(include_text=True)

response = query_engine.query("What happened at Interleaf and Viaweb?")

print(str(response))

Interleaf had smart people and built impressive technology but got crushed by Moore's Law. Viaweb was profitable and growing rapidly.

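Loading from an Existing Graph

If you have an existing graph (created with LlamaIndex or otherwise), we can connect to it and use it!

NOTE: If your graph was created outside of LlamaIndex, the most useful retrievers will be text to cypher or cypher templates. Other retrievers rely on properties that LlamaIndex inserts.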
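
To make the idea of a cypher template concrete: it is a fixed, parameterized Cypher query in which only the parameter values change between calls. The sketch below illustrates that idea with plain Python and does not use LlamaIndex's retriever API. The `name` property, the constant `CYPHER_TEMPLATE`, and the helper `build_params` are assumptions for illustration and must be adapted to your own graph's schema.

```python
from typing import Any, Dict, List

# Hypothetical schema: every node of interest has a `name` property.
# `$names` and `$limit` are Cypher parameters supplied at query time.
CYPHER_TEMPLATE = """
MATCH (s)-[r]->(o)
WHERE s.name IN $names
RETURN s.name AS subject, type(r) AS relation, o.name AS object
LIMIT $limit
""".strip()


def build_params(names: List[str], limit: int = 25) -> Dict[str, Any]:
    """Validate and package the parameter values for CYPHER_TEMPLATE."""
    cleaned = [name.strip() for name in names if name and name.strip()]
    if not cleaned:
        raise ValueError("At least one non-empty entity name is required.")
    if limit < 1:
        raise ValueError("limit must be a positive integer.")
    return {"names": cleaned, "limit": limit}


# Usage: pass CYPHER_TEMPLATE and build_params(["Interleaf", "Viaweb"])
# to whichever Neo4j client you use to run parameterized queries.
```

Passing values as parameters rather than formatting them into the query string keeps user-supplied or LLM-supplied names from altering the query itself.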

from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

graph_store = Neo4jPropertyGraphStore(
    username="neo4j",
    password="794613852",
    url="bolt://localhost:7687",
)

index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
)

From here, we can still insert more documents!

from llama_index.core import Document

document = Document(text="LlamaIndex is great!")

index.insert(document)

nodes = index.as_retriever(include_text=False).retrieve("LlamaIndex")

print(nodes[0].text)

Llamaindex -> Is -> Great

For full details on construction, retrieval, and querying of a property graph, see the full docs page.