Knowledge Graph Query Engine¶
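Building a knowledge graph usually involves specialized, complex work. By combining an LLM with LlamaIndex's KnowledgeGraphIndex and a GraphStore, however, we can build a reasonably effective knowledge graph from any data source supported by Llama Hub.
Querying a knowledge graph also tends to require knowledge specific to the storage system, such as the Cypher query language. With an LLM and the LlamaIndex KnowledgeGraphQueryEngine, the same queries can be written in natural language.
In this demo, we walk through the following steps:
- Extract and set up a knowledge graph with LlamaIndex
- Query the knowledge graph with Cypher
- Query the knowledge graph with natural language
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.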
%pip install llama-index-readers-wikipedia
%pip install llama-index-llms-azure-openai
%pip install llama-index-graph-stores-nebula
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-azure-openai
!pip install llama-index
First, some basic preparation for LlamaIndex.
OpenAI¶
# For OpenAI
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
import logging
import sys
logging.basicConfig(
stream=sys.stdout, level=logging.INFO
) # logging.DEBUG for more verbose output
# define LLM
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
Settings.llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.chunk_size = 512
Azure¶
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
# For Azure OpenAI
api_key = "<api-key>"
azure_endpoint = "https://<your-resource-name>.openai.azure.com/"
api_version = "2023-07-01-preview"
llm = AzureOpenAI(
model="gpt-35-turbo-16k",
deployment_name="my-custom-llm",
api_key=api_key,
azure_endpoint=azure_endpoint,
api_version=api_version,
)
# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
model="text-embedding-ada-002",
deployment_name="my-custom-embedding",
api_key=api_key,
azure_endpoint=azure_endpoint,
api_version=api_version,
)
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
Prepare NebulaGraph¶
Before creating the knowledge graph in the next step, make sure NebulaGraph is installed and running and that the data schema is defined.
# Create a NebulaGraph (version 3.5.0 or newer) cluster with:
# Option 0 for machines with Docker: `curl -fsSL nebula-up.siwei.io/install.sh | bash`
# Option 1 for Desktop: NebulaGraph Docker Extension https://hub.docker.com/extensions/weygu/nebulagraph-dd-ext
# If not, create it with the following commands from NebulaGraph's console:
# CREATE SPACE llamaindex(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
# :sleep 10;
# USE llamaindex;
# CREATE TAG entity(name string);
# CREATE EDGE relationship(relationship string);
# :sleep 10;
# CREATE TAG INDEX entity_index ON entity(name(256));
%pip install ipython-ngql nebula3-python
os.environ["NEBULA_USER"] = "root"
os.environ["NEBULA_PASSWORD"] = "nebula" # default is "nebula"
os.environ[
"NEBULA_ADDRESS"
] = "127.0.0.1:9669" # assumed we have NebulaGraph installed locally
space_name = "llamaindex"
edge_types, rel_prop_names = ["relationship"], [
    "relationship"
]  # defaults; can be omitted when creating from an empty KG
tags = ["entity"]  # default; can be omitted when creating from an empty KG
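The schema statements listed in the comments above can also be assembled programmatically, which helps when the space name changes. The following is a minimal sketch: `build_schema_statements` is a hypothetical helper (not part of LlamaIndex or nebula3-python) that only returns the nGQL strings. The `:sleep 10;` console pauses are left out, so allow time between statements when running them against a cluster.

```python
from typing import List


def build_schema_statements(
    space_name: str, vid_length: int = 256, index_length: int = 256
) -> List[str]:
    """Return the nGQL schema statements from the comments above.

    Hypothetical helper for illustration: it only builds strings and
    does not connect to NebulaGraph.
    """
    return [
        f"CREATE SPACE {space_name}(vid_type=FIXED_STRING({vid_length}), "
        "partition_num=1, replica_factor=1);",
        f"USE {space_name};",
        "CREATE TAG entity(name string);",
        "CREATE EDGE relationship(relationship string);",
        f"CREATE TAG INDEX entity_index ON entity(name({index_length}));",
    ]


statements = build_schema_statements("llamaindex")
for statement in statements:
    print(statement)
```

Each printed statement can then be pasted into the NebulaGraph console or run through the `%ngql` magic.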
Prepare the StorageContext with graph_store set to a NebulaGraphStore.
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
graph_store = NebulaGraphStore(
space_name=space_name,
edge_types=edge_types,
rel_prop_names=rel_prop_names,
tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)
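(Optional) Build the knowledge graph with LlamaIndex¶
With LlamaIndex and the LLM defined above, we can build a knowledge graph from the given documents.
If a knowledge graph already exists in the NebulaGraphStore, this step can be skipped.
Step 1: Load data from Wikipedia for "Guardians of the Galaxy Vol. 3"¶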
from llama_index.readers.wikipedia import WikipediaReader
loader = WikipediaReader()
documents = loader.load_data(
pages=["Guardians of the Galaxy Vol. 3"], auto_suggest=False
)
from llama_index.core import KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex.from_documents(
documents,
storage_context=storage_context,
max_triplets_per_chunk=10,
space_name=space_name,
edge_types=edge_types,
rel_prop_names=rel_prop_names,
tags=tags,
include_embeddings=True,
)
We now have a knowledge graph about the film "Guardians of the Galaxy Vol. 3" in the llamaindex space of the NebulaGraph cluster. Let's play with it a bit.
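Conceptually, each extracted triplet has the form (subject, predicate, object) and is stored as one `relationship` edge between two `entity` vertices. The sketch below illustrates that mapping in plain Python. The `parse_triplets` and `to_adjacency` helpers and the sample text are assumptions made for illustration; they are not LlamaIndex's actual extraction code.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]

# Matches one "(subject, predicate, object)" group.
# Fields that themselves contain commas or parentheses are not supported.
TRIPLET_PATTERN = re.compile(r"\(([^,()]+),([^,()]+),([^,()]+)\)")


def parse_triplets(text: str) -> List[Triplet]:
    """Extract (subject, predicate, object) triplets from LLM-style text."""
    return [
        (subj.strip(), pred.strip(), obj.strip())
        for subj, pred, obj in TRIPLET_PATTERN.findall(text)
    ]


def to_adjacency(triplets: List[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triplets by subject: subject -> [(predicate, object), ...]."""
    adjacency: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subj, pred, obj in triplets:
        adjacency[subj].append((pred, obj))
    return dict(adjacency)


# Sample text written for this sketch, not real model output.
sample_output = """
(Peter Quill, is leader of, Guardians of the Galaxy)
(Peter Quill, was raised by, a group of alien thieves and smugglers)
"""
triplets = parse_triplets(sample_output)
graph = to_adjacency(triplets)
print(graph["Peter Quill"])
```

Here each dictionary key plays the role of a source vertex, and each (predicate, object) pair plays the role of an outgoing edge and its target vertex.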
# install related packages, password is nebula by default
%pip install ipython-ngql networkx pyvis
%load_ext ngql
%ngql --address 127.0.0.1 --port 9669 --user root --password <password>
Connection Pool Created
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
[ERROR]:
'IPythonNGQL' object has no attribute '_decode_value'
| | Name |
|---|---|
| 0 | llamaindex |
# Query some random Relationships with Cypher
%ngql USE llamaindex;
%ngql MATCH ()-[e]->() RETURN e LIMIT 10
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669) INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
| | e |
|---|---|
| 0 | ("A second trailer for the film")-[:relationsh... |
| 1 | ("Adam McKay")-[:relationship@-442854342936029... |
| 2 | ("Adam McKay")-[:relationship@8513344855738553... |
| 3 | ("Asim Chaudhry")-[:relationship@-803614038978... |
| 4 | ("Bakalova")-[:relationship@-25325064520311626... |
| 5 | ("Bautista")-[:relationship@-90386029986457371... |
| 6 | ("Bautista")-[:relationship@-90386029986457371... |
| 7 | ("Beth Mickle")-[:relationship@716197657641767... |
| 8 | ("Bradley Cooper")-[:relationship@138630731832... |
| 9 | ("Bradley Cooper")-[:relationship@838402633192... |
# draw the result
%ng_draw
nebulagraph_draw.html
Query the knowledge graph¶
Finally, let's demonstrate how to query the knowledge graph with natural language!
Here we use the KnowledgeGraphQueryEngine, with the NebulaGraphStore as storage_context.graph_store.
from IPython.display import Markdown, display
from llama_index.core import Settings
from llama_index.core.query_engine import KnowledgeGraphQueryEngine

query_engine = KnowledgeGraphQueryEngine(
    storage_context=storage_context,
    llm=Settings.llm,  # set in either the OpenAI or the Azure section above
    verbose=True,
)
response = query_engine.query(
    "Tell me about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))
Graph Store Query:
```
MATCH (p:`entity`)-[:relationship]->(m:`entity`) WHERE p.`entity`.`name` == 'Peter Quill' RETURN p.`entity`.`name`;
```
Graph Store Response:
{'p.entity.name': ['Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill', 'Peter Quill']}
Final Response: Peter Quill is a character in the Marvel Universe. He is the son of Meredith Quill and Ego the Living Planet.
Peter Quill is a character in the Marvel Universe. He is the son of Meredith Quill and Ego the Living Planet.
graph_query = query_engine.generate_query(
"Tell me about Peter Quill?",
)
graph_query = graph_query.replace("WHERE", "\n WHERE").replace(
"RETURN", "\nRETURN"
)
display(
Markdown(
f"""
```cypher
{graph_query}
```
"""
)
)
MATCH (p:`entity`)-[:relationship]->(m:`entity`)
 WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`;
We can see that it helps generate the graph query:
MATCH (p:`entity`)-[:relationship]->(e:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN e.`entity`.`name`;
and synthesizes the answer from its result:
{'e2.entity.name': ['grandfather', 'alternate version of Gamora', 'Guardians of the Galaxy']}
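Put together, the engine works in two stages: the LLM turns the question into a graph query, and the query result is then handed back to the LLM to synthesize an answer. The following sketch mimics that flow with stand-in callables so it runs without an LLM or a NebulaGraph cluster. `answer_with_graph` and the `fake_*` functions are hypothetical names for illustration, not LlamaIndex APIs.

```python
from typing import Callable, Dict, List

GraphResult = Dict[str, List[str]]


def answer_with_graph(
    question: str,
    generate_query: Callable[[str], str],
    run_query: Callable[[str], GraphResult],
    synthesize: Callable[[str, str, GraphResult], str],
) -> str:
    """Question -> graph query -> graph result -> natural-language answer."""
    graph_query = generate_query(question)  # stage 1: question to query
    result = run_query(graph_query)  # execute against the graph store
    return synthesize(question, graph_query, result)  # stage 2: result to answer


# Stand-ins with canned behavior, used only to make the sketch runnable.
def fake_generate_query(question: str) -> str:
    return (
        "MATCH (p:`entity`)-[:relationship]->(e:`entity`) "
        "WHERE p.`entity`.`name` == 'Peter Quill' "
        "RETURN e.`entity`.`name`;"
    )


def fake_run_query(graph_query: str) -> GraphResult:
    return {"e.entity.name": ["Guardians of the Galaxy"]}


def fake_synthesize(question: str, graph_query: str, result: GraphResult) -> str:
    names = ", ".join(result["e.entity.name"])
    return f"Peter Quill is related to: {names}"


answer = answer_with_graph(
    "Tell me about Peter Quill?",
    fake_generate_query,
    fake_run_query,
    fake_synthesize,
)
print(answer)
```

Keeping the three stages as separate callables makes it easy to inspect or replace the generated query before it reaches the graph store.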
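Of course, we can still query the graph directly. The query engine can even serve as a handy assistant for learning the graph query language :)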
%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p.`entity`.`name`, e.relationship, m.`entity`.`name`;
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
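| | p.entity.name | e.relationship | m.entity.name |
|---|---|---|---|
| 0 | Peter Quill | would return to the MCU | May 2021 |
| 1 | Peter Quill | was abducted from Earth | as a child |
| 2 | Peter Quill | is leader of | Guardians of the Galaxy |
| 3 | Peter Quill | was raised by | a group of alien thieves and smugglers |
| 4 | Peter Quill | is half-human | half-Celestial |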
Now change the query so that the result can be rendered:
%%ngql
MATCH (p:`entity`)-[e:relationship]->(m:`entity`)
WHERE p.`entity`.`name` == 'Peter Quill'
RETURN p, e, m;
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
| | p | e | m |
|---|---|---|---|
| 0 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-84437522554765... | ("May 2021" :entity{name: "May 2021"}) |
| 1 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-11770408155938... | ("as a child" :entity{name: "as a child"}) |
| 2 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@-79394488349732... | ("Guardians of the Galaxy" :entity{name: "Guar... |
| 3 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@325695233021653... | ("a group of alien thieves and smugglers" :ent... |
| 4 | ("Peter Quill" :entity{name: "Peter Quill"}) | ("Peter Quill")-[:relationship@555553046209276... | ("half-Celestial" :entity{name: "half-Celestia... |
%ng_draw
nebulagraph_draw.html
The rendered graph makes the result of this knowledge-fetching query about as clear as it can be.
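The two `%%ngql` lookups above differ only in their RETURN clause, so the query text can be produced from a small template when the same lookup is needed for other entities. The sketch below assumes the default `entity` tag and `relationship` edge used in this notebook. `build_entity_query` is a hypothetical helper, and its backslash escaping of quotes is an assumption that should be checked against NebulaGraph's string-literal rules before it is used with untrusted input.

```python
def build_entity_query(name: str, return_clause: str = "p, e, m") -> str:
    """Build the MATCH query used above for an arbitrary entity name."""
    # Escape backslashes first, then single quotes, so the name stays
    # inside the quoted literal (assumed escaping rules, see lead-in).
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "MATCH (p:`entity`)-[e:relationship]->(m:`entity`)\n"
        f"  WHERE p.`entity`.`name` == '{escaped}'\n"
        f"RETURN {return_clause};"
    )


# Query whose result can be rendered with %ng_draw.
draw_query = build_entity_query("Peter Quill")
# Query that returns the three text columns instead.
table_query = build_entity_query(
    "Peter Quill",
    return_clause="p.`entity`.`name`, e.relationship, m.`entity`.`name`",
)
print(draw_query)
print(table_query)
```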