Typesense Vector Store¶
Download Data¶
In [ ]
%pip install llama-index-embeddings-openai
%pip install llama-index-vector-stores-typesense
In [ ]
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
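The cells below embed and query with OpenAI models, so an OpenAI API key must be available in the environment before building the index. A minimal sketch (the key value is a placeholder for your own):

In [ ]

import os

# Assumption: supply your own key; "sk-..." is only a placeholder.
os.environ["OPENAI_API_KEY"] = "sk-..."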
Load documents, build the VectorStoreIndex¶
In [ ]
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
StorageContext,
)
from IPython.display import Markdown, display
In [ ]
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
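Before indexing, it can help to sanity-check what was loaded. A quick sketch using the standard Document attributes:

In [ ]

# One essay file should yield one Document; peek at the start of its text.
print(len(documents))
print(documents[0].text[:200])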
In [ ]
from llama_index.vector_stores.typesense import TypesenseVectorStore
from typesense import Client
typesense_client = Client(
{
"api_key": "xyz",
"nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
"connection_timeout_seconds": 2,
}
)
typesense_vector_store = TypesenseVectorStore(typesense_client)
storage_context = StorageContext.from_defaults(
vector_store=typesense_vector_store
)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
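from_documents embeds each chunk and upserts it into a Typesense collection, so a later session can reconnect to the populated store instead of re-ingesting. A sketch using the standard VectorStoreIndex.from_vector_store constructor; it assumes the collection created above still exists and that an embedding model (OpenAI by default) is configured for query-time embeddings:

In [ ]

# Rebuild an index handle on top of the existing Typesense collection,
# without re-reading or re-embedding the documents.
index = VectorStoreIndex.from_vector_store(typesense_vector_store)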
Query the Index¶
In [ ]
from llama_index.core import QueryBundle
from llama_index.embeddings.openai import OpenAIEmbedding
# By default, typesense vector store uses vector search. You need to provide the embedding yourself.
query_str = "What did the author do growing up?"
embed_model = OpenAIEmbedding()
# You can also get the settings from the Settings object
from llama_index.core import Settings
# embed_model = Settings.embed_model
# get_agg_embedding_from_queries expects a list of query strings
query_embedding = embed_model.get_agg_embedding_from_queries([query_str])
query_bundle = QueryBundle(query_str, embedding=query_embedding)
response = index.as_query_engine().query(query_bundle)
display(Markdown(f"<b>{response}</b>"))
The author grew up skipping a step in the evolution of computers, learning Italian, wandering around Florence, drawing people, working with technology companies, seeking a signature style at RISD, living in a rent-controlled apartment, launching software, editing code (including Lisp expressions), writing essays, publishing them online, and receiving feedback from angry readers. He also experienced the exponential growth of commodity processors in the 1990s, which rolled up high-end, special-purpose hardware and software companies. He also learned how to make a little Italian go a long way by stringing together abstract concepts with a few simple verbs. He also experienced the art world's tight coupling of money and coolness, where anything expensive comes to be seen as cool, and anything seen as cool soon becomes equally expensive. He also experienced the challenge of launching software, as he had to recruit an initial set of users and make sure they had decent-looking stores before launching publicly. He also experienced the first instance of what would become a familiar experience, when he read the comments and found they were full of angry people. He also experienced the difference between publishing online and publishing publicly. Finally, he wrote essays about topics he had stacked up, and wrote a more detailed version for others to read.
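To see which chunks produced that answer, you can call the retriever directly with the same QueryBundle. as_retriever and retrieve are standard index methods; the similarity_top_k value here is just an illustrative choice:

In [ ]

# Inspect the raw nodes and scores behind the query engine's answer.
retriever = index.as_retriever(similarity_top_k=2)
for node_with_score in retriever.retrieve(query_bundle):
    print(node_with_score.score, node_with_score.node.get_content()[:200])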
In [ ]
from llama_index.core.vector_stores.types import VectorStoreQueryMode
# You can also use text search
query_bundle = QueryBundle(query_str=query_str)
response = index.as_query_engine(
vector_store_query_mode=VectorStoreQueryMode.TEXT_SEARCH
).query(query_bundle)
display(Markdown(f"<b>{response}</b>"))
The author grew up during the Internet Bubble, when he was running a startup. They had to hire more people than they needed in order to seem more professional, and were at the mercy of their investors until Yahoo bought them. They learned a lot about retail and startups, and had to do a lot of things they weren't necessarily good at in order to make their business successful.
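It is also possible to bypass the query engine and run the same keyword search directly against the store via the generic VectorStoreQuery interface. A sketch under the assumption that the installed TypesenseVectorStore honors TEXT_SEARCH mode with a query_str, as the cell above suggests:

In [ ]

from llama_index.core.vector_stores.types import VectorStoreQuery

# Run Typesense keyword search directly against the vector store.
vs_query = VectorStoreQuery(
    query_str=query_str,
    mode=VectorStoreQueryMode.TEXT_SEARCH,
    similarity_top_k=2,
)
result = typesense_vector_store.query(vs_query)
for node in result.nodes:
    print(node.get_content()[:200])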