Typesense Vector Store¶
Download Data¶
In [ ]
%pip install llama-index-embeddings-openai
%pip install llama-index-vector-stores-typesense
In [ ]
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
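The cells below embed and query with OpenAI models, so an OpenAI API key must be available in the environment before building the index. A minimal sketch (the key value is a placeholder for your own):

In [ ]

import os

# Assumption: supply your own key; "sk-..." is only a placeholder.
os.environ["OPENAI_API_KEY"] = "sk-..."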
Load documents, build the VectorStoreIndex¶
In [ ]
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
StorageContext,
)
from IPython.display import Markdown, display
In [ ]
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
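Before indexing, it can help to sanity-check what was loaded. A quick sketch using the standard Document attributes:

In [ ]

# One essay file should yield one Document; peek at the start of its text.
print(len(documents))
print(documents[0].text[:200])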
In [ ]
from llama_index.vector_stores.typesense import TypesenseVectorStore
from typesense import Client
typesense_client = Client(
{
"api_key": "xyz",
"nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
"connection_timeout_seconds": 2,
}
)
typesense_vector_store = TypesenseVectorStore(typesense_client)
storage_context = StorageContext.from_defaults(
vector_store=typesense_vector_store
)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
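from_documents embeds each chunk and upserts it into a Typesense collection, so a later session can reconnect to the populated store instead of re-ingesting. A sketch using the standard VectorStoreIndex.from_vector_store constructor; it assumes the collection created above still exists and that an embedding model (OpenAI by default) is configured for query-time embeddings:

In [ ]

# Rebuild an index handle on top of the existing Typesense collection,
# without re-reading or re-embedding the documents.
index = VectorStoreIndex.from_vector_store(typesense_vector_store)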
Query the Index¶
In [ ]
from llama_index.core import QueryBundle
from llama_index.embeddings.openai import OpenAIEmbedding
# By default, typesense vector store uses vector search. You need to provide the embedding yourself.
query_str = "What did the author do growing up?"
embed_model = OpenAIEmbedding()
# You can also get the settings from the Settings object
from llama_index.core import Settings
# embed_model = Settings.embed_model
# get_agg_embedding_from_queries expects a list of query strings
query_embedding = embed_model.get_agg_embedding_from_queries([query_str])
query_bundle = QueryBundle(query_str, embedding=query_embedding)
response = index.as_query_engine().query(query_bundle)
display(Markdown(f"<b>{response}</b>"))
The author grew up skipping a step in the evolution of computers, learning Italian, wandering around Florence, drawing people, working with technology companies, seeking a signature style at RISD, living in a rent-controlled apartment, launching software, editing code (including Lisp expressions), writing essays, publishing them online, and receiving feedback from angry readers. He also experienced the exponential growth of commodity processors in the 1990s, which rolled up high-end, special-purpose hardware and software companies. He also learned how to make a little Italian go a long way by stringing together abstract concepts with a few simple verbs. He also experienced the art world's tight coupling of money and coolness, where anything expensive comes to be seen as cool, and anything seen as cool soon becomes equally expensive. He also experienced the challenge of launching software, as he had to recruit an initial set of users and make sure they had decent-looking stores before launching publicly. He also experienced the first instance of what would become a familiar experience, when he read the comments and found they were full of angry people. He also experienced the difference between publishing online and publishing publicly. Finally, he wrote essays about topics he had stacked up, and wrote a more detailed version for others to read.
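To see which chunks produced that answer, you can call the retriever directly with the same QueryBundle. as_retriever and retrieve are standard index methods; the similarity_top_k value here is just an illustrative choice:

In [ ]

# Inspect the raw nodes and scores behind the query engine's answer.
retriever = index.as_retriever(similarity_top_k=2)
for node_with_score in retriever.retrieve(query_bundle):
    print(node_with_score.score, node_with_score.node.get_content()[:200])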
In [ ]
from llama_index.core.vector_stores.types import VectorStoreQueryMode
# You can also use text search
query_bundle = QueryBundle(query_str=query_str)
response = index.as_query_engine(
vector_store_query_mode=VectorStoreQueryMode.TEXT_SEARCH
).query(query_bundle)
display(Markdown(f"<b>{response}</b>"))
The author grew up during the Internet Bubble, when he was running a startup. They had to hire more people than they needed in order to seem more professional, and were at the mercy of their investors until Yahoo bought them. They learned a lot about retail and startups, and had to do a lot of things they weren't necessarily good at in order to make their business successful.
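It is also possible to bypass the query engine and run the same keyword search directly against the store via the generic VectorStoreQuery interface. A sketch under the assumption that the installed TypesenseVectorStore honors TEXT_SEARCH mode with a query_str, as the cell above suggests:

In [ ]

from llama_index.core.vector_stores.types import VectorStoreQuery

# Run Typesense keyword search directly against the vector store.
vs_query = VectorStoreQuery(
    query_str=query_str,
    mode=VectorStoreQueryMode.TEXT_SEARCH,
    similarity_top_k=2,
)
result = typesense_vector_store.query(vs_query)
for node in result.nodes:
    print(node.get_content()[:200])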