
Azure Cognitive Search

AzCognitiveSearchReader

Bases: BaseReader

General reader for any Azure Cognitive Search index.

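Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| service_name | str | Name of the Azure Cognitive Search service. | required |
| search_key | str | Azure Search access key, provided directly. | required |
| index | str | Name of the index to read from. | required |

Note: the `__init__` signature in the source listing spells the second parameter `searck_key`, so passing it positionally avoids a keyword mismatch.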
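A minimal usage sketch of the constructor parameters, assuming the `llama-index-readers-azcognitive-search` package is installed and that the reader is importable from `llama_index.readers.azcognitive_search`. The environment variable names, the `content` field name, and the helper `load_handbook_pages` are hypothetical and chosen only for illustration.

```python
import os


def load_handbook_pages(query: str) -> list:
    """Query an index and return the loaded documents.

    Returns an empty list when the (hypothetical) environment variables
    below are not set, so the sketch can run without credentials.
    """
    service_name = os.environ.get("AZURE_SEARCH_SERVICE")  # hypothetical name
    search_key = os.environ.get("AZURE_SEARCH_KEY")  # hypothetical name
    index = os.environ.get("AZURE_SEARCH_INDEX")  # hypothetical name
    if not (service_name and search_key and index):
        return []

    # Imported lazily so this module loads even without the integration.
    from llama_index.readers.azcognitive_search import AzCognitiveSearchReader

    # Positional arguments: the source listing spells the second
    # parameter `searck_key`, so a `search_key=` keyword would not match.
    reader = AzCognitiveSearchReader(service_name, search_key, index)

    return reader.load_data(
        query=query,
        content_field="content",  # assumed name of the text field in the index
        filter="sourcefile eq 'employee_handbook.pdf'",
    )


if __name__ == "__main__":
    for doc in load_handbook_pages("vacation policy"):
        print(doc.text[:80])
```

Reading the key from the environment rather than hard-coding it is a design choice of this sketch, not something the reader requires.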
Source code in llama-index-integrations/readers/llama-index-readers-azcognitive-search/llama_index/readers/azcognitive_search/base.py
class AzCognitiveSearchReader(BaseReader):
    """
    General reader for any Azure Cognitive Search index reader.

    Args:
        service_name (str): the name of azure cognitive search service.
        search_key (str): provide azure search access key directly.
        index (str): index name

    """

    def __init__(self, service_name: str, searck_key: str, index: str) -> None:
        """Initialize Azure cognitive search service using the search key."""
        import logging

        logger = logging.getLogger("azure.core.pipeline.policies.http_logging_policy")
        logger.setLevel(logging.WARNING)

        azure_credential = AzureKeyCredential(searck_key)

        self.search_client = SearchClient(
            endpoint=f"https://{service_name}.search.windows.net",
            index_name=index,
            credential=azure_credential,
        )

    def load_data(
        self, query: str, content_field: str, filter: Optional[str] = None
    ) -> List[Document]:
        """
        Read data from azure cognitive search index.

        Args:
            query (str): search term in Azure Search index
            content_field (str): field name of the document content.
            filter (str): Filter expression. For example : 'sourcepage eq
                'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf''

        Returns:
            List[Document]: A list of documents.

        """
        search_result = self.search_client.search(query, filter=filter)

        return [
            Document(
                text=result[content_field],
                extra_info={"id": result["id"], "score": result["@search.score"]},
            )
            for result in search_result
        ]
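The result-to-document mapping in `load_data` can be exercised without a live service. The sketch below uses stand-ins: `SimpleDocument` and `FakeSearchClient` are hypothetical classes written for this example and are not part of LlamaIndex or the Azure SDK; only the mapping itself mirrors the listing above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional


@dataclass
class SimpleDocument:
    """Stand-in for the Document class, used only in this sketch."""

    text: str
    extra_info: Dict[str, Any] = field(default_factory=dict)


class FakeSearchClient:
    """Hypothetical stub that returns canned hits instead of calling Azure."""

    def __init__(self, hits: List[Dict[str, Any]]) -> None:
        self._hits = hits

    def search(
        self, query: str, filter: Optional[str] = None
    ) -> Iterable[Dict[str, Any]]:
        # A real client would send the query and filter to the service.
        return iter(self._hits)


def to_documents(
    client: FakeSearchClient, query: str, content_field: str
) -> List[SimpleDocument]:
    """Apply the same mapping as the list comprehension in load_data."""
    return [
        SimpleDocument(
            text=hit[content_field],
            extra_info={"id": hit["id"], "score": hit["@search.score"]},
        )
        for hit in client.search(query)
    ]


hits = [
    {"id": "1", "content": "Vacation policy", "@search.score": 1.5},
    {"id": "2", "content": "Expense policy", "@search.score": 0.7},
]
docs = to_documents(FakeSearchClient(hits), "policy", content_field="content")
print([d.text for d in docs])
```

Because the fields are read with plain indexing, a hit in this sketch that lacks `content_field`, `id`, or `@search.score` raises `KeyError`. The same would presumably apply to an index whose key field is not named `id`.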

load_data

load_data(query: str, content_field: str, filter: Optional[str] = None) -> List[Document]

Read data from an Azure Cognitive Search index.

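Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | str | Search term to run against the Azure Search index. | required |
| content_field | str | Name of the field that holds the document content. | required |
| filter | str | Filter expression, for example: `sourcepage eq 'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf'` | None |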
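Filter expressions like the one in the example are plain strings, so they can be assembled programmatically. The helper below is a hypothetical sketch (`build_eq_filter` is not part of the reader) that joins equality clauses with `and` and doubles embedded single quotes, which is how OData string literals escape them. It covers only string equality, not the full filter syntax.

```python
from typing import Mapping


def build_eq_filter(conditions: Mapping[str, str]) -> str:
    """Join `field eq 'value'` clauses with `and` (hypothetical helper)."""
    clauses = []
    for field_name, value in conditions.items():
        # OData string literals escape a single quote by doubling it.
        escaped = value.replace("'", "''")
        clauses.append(f"{field_name} eq '{escaped}'")
    return " and ".join(clauses)


expr = build_eq_filter(
    {
        "sourcepage": "employee_handbook-3.pdf",
        "sourcefile": "employee_handbook.pdf",
    }
)
print(expr)
# sourcepage eq 'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf'
```

The resulting string can be passed as the `filter` argument of `load_data`.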

Returns

| Type | Description |
| --- | --- |
| List[Document] | A list of documents. |
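Each returned document carries the hit's `id` and `@search.score` (stored under the key `score`), which makes simple post-filtering possible. The sketch below assumes those values are readable from a `metadata` mapping on each document. `SimpleDocument` is a stand-in written for this example, and `top_documents` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SimpleDocument:
    """Stand-in for a loaded document, used only in this sketch."""

    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def top_documents(
    docs: List[SimpleDocument], min_score: float, limit: int
) -> List[SimpleDocument]:
    """Keep documents scoring at least min_score, best first, up to limit."""

    def score(doc: SimpleDocument) -> float:
        # Documents without a score are treated as scoring 0.0.
        return doc.metadata.get("score", 0.0)

    kept = [doc for doc in docs if score(doc) >= min_score]
    kept.sort(key=score, reverse=True)
    return kept[:limit]


loaded = [
    SimpleDocument("Expense policy", {"id": "2", "score": 0.7}),
    SimpleDocument("Vacation policy", {"id": "1", "score": 1.5}),
    SimpleDocument("Parking map", {"id": "3", "score": 0.1}),
]
best = top_documents(loaded, min_score=0.5, limit=2)
print([doc.metadata["id"] for doc in best])
```

Search scores are relative to a given query, so any fixed threshold such as `0.5` is an illustrative value rather than a recommended one.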

Source code in llama-index-integrations/readers/llama-index-readers-azcognitive-search/llama_index/readers/azcognitive_search/base.py
def load_data(
    self, query: str, content_field: str, filter: Optional[str] = None
) -> List[Document]:
    """
    Read data from azure cognitive search index.

    Args:
        query (str): search term in Azure Search index
        content_field (str): field name of the document content.
        filter (str): Filter expression. For example : 'sourcepage eq
            'employee_handbook-3.pdf' and sourcefile eq 'employee_handbook.pdf''

    Returns:
        List[Document]: A list of documents.

    """
    search_result = self.search_client.search(query, filter=filter)

    return [
        Document(
            text=result[content_field],
            extra_info={"id": result["id"], "score": result["@search.score"]},
        )
        for result in search_result
    ]