Memgraph Property Graph Index¶
Memgraph is an open-source graph database built for real-time streaming and fast analytics on stored data.
Before running Memgraph, make sure Docker is running in the background. The quickest way to try out Memgraph Platform (the Memgraph database + MAGE library + Memgraph Lab) for the first time is to run the following command:
For Linux/macOS:
curl https://install.memgraph.com | sh
For Windows:
iwr https://windows.memgraph.com | iex
From there, you can open Memgraph Lab, Memgraph's visualization tool, at http://localhost:3000/, or use the desktop version of the application.
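Before installing the Python packages, it can be useful to check that Memgraph is actually reachable on the Bolt port. The short sketch below uses the neo4j Python driver (Memgraph speaks the Bolt protocol); the URI and the empty credentials are assumptions that match a default local installation.

from neo4j import GraphDatabase  # pip install neo4j

# Assumed defaults for a local Memgraph instance; adjust if yours differ.
URI = "bolt://localhost:7687"
AUTH = ("", "")  # a default Memgraph installation has no authentication

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.verify_connectivity()
    print("Connected to Memgraph at", URI)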
In [ ]
%pip install llama-index llama-index-graph-stores-memgraph
Environment Setup¶
In [ ]
import os
os.environ[
"OPENAI_API_KEY"
] = "sk-proj-..." # Replace with your OpenAI API key
Create the data directory and download the Paul Graham essay we will use as the input data for this example.
In [ ]
import urllib.request
os.makedirs("data/paul_graham/", exist_ok=True)
url = "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"
output_path = "data/paul_graham/paul_graham_essay.txt"
urllib.request.urlretrieve(url, output_path)
In [ ]
import nest_asyncio
nest_asyncio.apply()
Read the file, replace the single quotes, save the modified content, and load the document data using SimpleDirectoryReader.
In [ ]
from llama_index.core import SimpleDirectoryReader
with open(output_path, "r", encoding="utf-8") as file:
content = file.read()
with open(output_path, "w", encoding="utf-8") as file:
file.write(content)
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
Set Up the Memgraph Connection¶
Set up your graph store class by providing the database credentials.
In [ ]
from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore
username = "" # Enter your Memgraph username (default "")
password = "" # Enter your Memgraph password (default "")
url = "" # Specify the connection URL, e.g., 'bolt://localhost:7687'
graph_store = MemgraphPropertyGraphStore(
    username=username,
    password=password,
    url=url,
)
Index Construction¶
In [ ]
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    kg_extractors=[
        SchemaLLMPathExtractor(
            llm=OpenAI(model="gpt-3.5-turbo", temperature=0.0)
        )
    ],
    property_graph_store=graph_store,
    show_progress=True,
)
Once the graph is created, we can explore it in the UI by visiting http://localhost:3000/.
The easiest way to visualize the entire graph is to run a Cypher command similar to this one:
MATCH p=()-[]-() RETURN p;
This command matches all possible paths in the graph and returns the entire graph.
To visualize the graph schema, go to the Graph schema tab and generate a new schema based on the newly created graph.
To delete the entire graph, use:
MATCH (n) DETACH DELETE n;
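If you prefer to run such Cypher statements from Python instead of Memgraph Lab, the graph store exposes a structured_query method (provided by llama-index's property graph store interface); a minimal sketch, assuming the graph_store object defined in the connection step above:

# Run arbitrary Cypher against Memgraph through the llama-index graph store.
# Assumes the `graph_store` object created in the connection step above.
result = graph_store.structured_query("MATCH (n) RETURN count(n) AS node_count")
print(result)

# The delete command above could be run the same way:
# graph_store.structured_query("MATCH (n) DETACH DELETE n;")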
Querying and Retrieval¶
In [ ]
retriever = index.as_retriever(include_text=False)

# Example query: "What happened at Interleaf and Viaweb?"
nodes = retriever.retrieve("What happened at Interleaf and Viaweb?")

# Output results
print("Query Results:")
for node in nodes:
    print(node.text)

# Alternatively, using a query engine
query_engine = index.as_query_engine(include_text=True)

# Perform a query and print the detailed response
response = query_engine.query("What happened at Interleaf and Viaweb?")
print("\nDetailed Query Response:")
print(str(response))
Loading from an Existing Graph¶
In [ ]
llm = OpenAI(model="gpt-4", temperature=0.0)
kg_extractors = [SchemaLLMPathExtractor(llm=llm)]
index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    kg_extractors=kg_extractors,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    show_progress=True,
)
llm = OpenAI(model="gpt-4", temperature=0.0) kg_extractors = [SchemaLLMPathExtractor(llm=llm)] index = PropertyGraphIndex.from_existing( property_graph_store=graph_store, kg_extractors=kg_extractors, embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"), show_progress=True, )