Azure Cognitive Search

AzCognitiveSearchReader

Bases: BaseReader

General reader for any Azure Cognitive Search index.

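Parameters

Name	Type	Description	Default
`service_name`	`str`	Name of the Azure Cognitive Search service.	required
`search_key`	`str`	Azure Search access key, provided directly.	required
`index`	`str`	Name of the index to query.	required

Note: the constructor signature in the source spells the second parameter `searck_key`, so pass the key positionally or use that spelling as a keyword argument.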
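
A minimal sketch of constructing the reader, assuming the package exposes `AzCognitiveSearchReader` from `llama_index.readers.azcognitive_search` (inferred from the source path on this page). The helper names `search_endpoint` and `build_reader` and the three environment variable names are hypothetical and chosen only for illustration.

```python
import os


def search_endpoint(service_name: str) -> str:
    """Return the endpoint URL the reader derives from the service name."""
    return f"https://{service_name}.search.windows.net"


def build_reader():
    """Construct a reader from environment variables (hypothetical names)."""
    # Imported lazily so this module loads even without the package installed.
    # Assumption: this import path matches the installed integration package.
    from llama_index.readers.azcognitive_search import AzCognitiveSearchReader

    # Arguments are passed positionally because the constructor source
    # spells the key parameter `searck_key`.
    return AzCognitiveSearchReader(
        os.environ["AZURE_SEARCH_SERVICE_NAME"],
        os.environ["AZURE_SEARCH_KEY"],
        os.environ["AZURE_SEARCH_INDEX"],
    )
```

Reading the key from the environment keeps it out of source control; any other secret store would serve equally well.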

Source code in llama-index-integrations/readers/llama-index-readers-azcognitive-search/llama_index/readers/azcognitive_search/base.py

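from typing import List, Optional

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from llama_index.core.readers.base import BaseReader
from llama_index.core.schema import Document


class AzCognitiveSearchReader(BaseReader):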
    """
    General reader for any Azure Cognitive Search index reader.

    Args:
        service_name (str): the name of azure cognitive search service.
        search_key (str): provide azure search access key directly.
        index (str): index name

    """

    def __init__(self, service_name: str, searck_key: str, index: str) -> None:
        """Initialize Azure cognitive search service using the search key."""
        import logging

        logger = logging.getLogger("azure.core.pipeline.policies.http_logging_policy")
        logger.setLevel(logging.WARNING)

        azure_credential = AzureKeyCredential(searck_key)

        self.search_client = SearchClient(
            endpoint=f"https://{service_name}.search.windows.net",
            index_name=index,
            credential=azure_credential,
        )

    def load_data(
        self, query: str, content_field: str, filter: Optional[str] = None
    ) -> List[Document]:
        """
        Read data from azure cognitive search index.

        Args:
            query (str): search term in Azure Search index
            content_field (str): field name of the document content.
            filter (str): Filter expression. For example : 'sourcepage eq
                'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf''

        Returns:
            List[Document]: A list of documents.

        """
        search_result = self.search_client.search(query, filter=filter)

        return [
            Document(
                text=result[content_field],
                extra_info={"id": result["id"], "score": result["@search.score"]},
            )
            for result in search_result
        ]
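
The list comprehension in `load_data` assumes every search hit exposes the requested content field, a field named `id`, and `@search.score`. The sketch below replays that mapping on plain dictionaries. `SimpleDocument`, `hits_to_documents`, and the sample hits are hypothetical stand-ins for illustration, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for the library's Document class."""

    text: str
    extra_info: Dict[str, Any] = field(default_factory=dict)


def hits_to_documents(
    hits: Iterable[Dict[str, Any]], content_field: str
) -> List[SimpleDocument]:
    """Apply the same per-hit mapping that load_data uses."""
    return [
        SimpleDocument(
            text=hit[content_field],
            extra_info={"id": hit["id"], "score": hit["@search.score"]},
        )
        for hit in hits
    ]


# Made-up hits shaped like search results.
sample_hits = [
    {"id": "1", "content": "Vacation policy", "@search.score": 1.5},
    {"id": "2", "content": "Expense policy", "@search.score": 0.7},
]
docs = hits_to_documents(sample_hits, content_field="content")
```

In this sketch a hit without an `id` key raises `KeyError`, which suggests the reader fits indexes whose key field is literally named `id`.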

load_data

load_data(query: str, content_field: str, filter: Optional[str] = None) -> List[Document]

Read data from an Azure Cognitive Search index.

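Parameters

Name	Type	Description	Default
`query`	`str`	Search term to run against the Azure Search index.	required
`content_field`	`str`	Name of the index field that holds the document content.	required
`filter`	`Optional[str]`	OData filter expression, for example `sourcepage eq 'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf'`.	`None`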
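
Filter strings like the one in the example are easy to get wrong when values contain quotes. The sketch below builds the same expression from parts. It relies on the OData rule that a single quote inside a string literal is escaped by doubling it. The helper names `odata_eq` and `odata_and` are hypothetical.

```python
def odata_eq(field_name: str, value: str) -> str:
    """Build `field eq 'value'`, doubling any single quotes in the value."""
    escaped = value.replace("'", "''")
    return f"{field_name} eq '{escaped}'"


def odata_and(*clauses: str) -> str:
    """Join clauses with the OData `and` operator."""
    return " and ".join(clauses)


filter_expr = odata_and(
    odata_eq("sourcepage", "employee_handbook-3.pdf"),
    odata_eq("sourcefile", "employee_handbook.pdf"),
)
```

The resulting `filter_expr` can be passed as the `filter` argument of `load_data`.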

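Returns

Type	Description
`List[Document]`	A list of documents, one per search hit. Each document's `extra_info` holds the hit's `id` and its `@search.score` (under the key `score`).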
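
Because each returned document carries its search score, callers can post-filter the results. The sketch below is one way to do that, assuming the returned objects expose the `extra_info` values through a `metadata` attribute (an assumption about the `Document` class that is not shown on this page). `load_above_score` and `_FakeReader` are hypothetical; the fake reader only lets the example run without an Azure service.

```python
from types import SimpleNamespace
from typing import Any, List, Optional


def load_above_score(
    reader: Any,
    query: str,
    content_field: str,
    min_score: float,
    filter: Optional[str] = None,
) -> List[Any]:
    """Call load_data, keep hits scoring at least min_score, best first."""
    docs = reader.load_data(query, content_field=content_field, filter=filter)
    kept = [doc for doc in docs if doc.metadata["score"] >= min_score]
    return sorted(kept, key=lambda doc: doc.metadata["score"], reverse=True)


class _FakeReader:
    """Stand-in with the same load_data signature, returning canned data."""

    def load_data(self, query, content_field, filter=None):
        return [
            SimpleNamespace(text="low", metadata={"id": "1", "score": 0.2}),
            SimpleNamespace(text="high", metadata={"id": "2", "score": 2.0}),
            SimpleNamespace(text="mid", metadata={"id": "3", "score": 1.0}),
        ]


result = load_above_score(_FakeReader(), "policy", "content", min_score=1.0)
```

With a real reader, replace `_FakeReader()` with an `AzCognitiveSearchReader` instance.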

Source code in llama-index-integrations/readers/llama-index-readers-azcognitive-search/llama_index/readers/azcognitive_search/base.py

def load_data(
    self, query: str, content_field: str, filter: Optional[str] = None
) -> List[Document]:
    """
    Read data from azure cognitive search index.

    Args:
        query (str): search term in Azure Search index
        content_field (str): field name of the document content.
        filter (str): Filter expression. For example : 'sourcepage eq
            'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf''

    Returns:
        List[Document]: A list of documents.

    """
    search_result = self.search_client.search(query, filter=filter)

    return [
        Document(
            text=result[content_field],
            extra_info={"id": result["id"], "score": result["@search.score"]},
        )
        for result in search_result
    ]