
Hive

Bases: BaseReader

Read documents from Hive.

These documents can then be used in a downstream Llama Index data structure.

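Parameters

host (str, required): The host HiveServer2 runs on.

port (Optional[int]): The port HiveServer2 runs on. Defaults to 10000.

auth (Optional[str]): The value of hive.server2.authentication used by HiveServer2. Defaults to NONE.

database (Optional[str]): The database name.

password (Optional[str]): Use only with auth='LDAP' or auth='CUSTOM'.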
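
The parameter rules above (port 10000 and auth NONE as defaults, a password only with LDAP or CUSTOM) can be checked before a connection is attempted. The following is a minimal sketch, not part of the library: `build_hive_kwargs`, `DEFAULT_PORT`, `DEFAULT_AUTH`, and the host name are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Optional

# Defaults taken from the parameter descriptions; the constant names are hypothetical.
DEFAULT_PORT = 10000
DEFAULT_AUTH = "NONE"


def build_hive_kwargs(
    host: str,
    port: Optional[int] = None,
    database: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
    auth: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and normalise HiveReader arguments."""
    resolved_auth = auth or DEFAULT_AUTH
    # A password is only meaningful for LDAP or CUSTOM authentication.
    if password is not None and resolved_auth not in ("LDAP", "CUSTOM"):
        raise ValueError("password is only used with auth='LDAP' or auth='CUSTOM'")
    return {
        "host": host,
        "port": port if port is not None else DEFAULT_PORT,
        "database": database,
        "username": username,
        "password": password,
        "auth": resolved_auth,
    }


# Example host name is a placeholder, not a real server.
kwargs = build_hive_kwargs("hive.example.internal", database="default")
print(kwargs["port"], kwargs["auth"])  # prints: 10000 NONE
```

The keys match the keyword arguments in the `HiveReader.__init__` signature, so the resulting dictionary could be passed as `HiveReader(**kwargs)` once `pyhive` is installed and a HiveServer2 instance is reachable.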

Source code in llama-index-integrations/readers/llama-index-readers-hive/llama_index/readers/hive/base.py

from typing import List, Optional

from llama_index.core.readers.base import BaseReader
from llama_index.core.schema import Document


class HiveReader(BaseReader):
    """
    Read documents from Hive.

    These documents can then be used in a downstream Llama Index data structure.

    Args:
        host : What host HiveServer2 runs on
        port : The port Hive Server runs on. Defaults to 10000.
        auth : The value of hive.server2.authentication used by HiveServer2.
               Defaults to ``NONE``
        database: the database name
        password: Use with auth='LDAP' or auth='CUSTOM' only

    """

    def __init__(
        self,
        host: str,
        port: Optional[int] = None,
        database: Optional[str] = None,
        username: Optional[str] = None,
        password: Optional[str] = None,
        auth: Optional[str] = None,
    ):
        """Initialize with parameters."""
        # pyhive is an optional dependency, so import it lazily.
        try:
            from pyhive import hive
        except ImportError:
            raise ImportError(
                "`pyhive` package not found, please run `pip install pyhive`"
            )

        self.con = hive.Connection(
            host=host,
            port=port,
            username=username,
            database=database,
            auth=auth,
            password=password,
        )

    def load_data(self, query: str) -> List[Document]:
        """
        Read data from Hive.

        Args:
            query (str): The query used to query data from Hive
        Returns:
            List[Document]: A list of documents.

        """
        try:
            # Keep a reference to the cursor itself; DB-API 2.0 does not
            # define the return value of execute().
            cursor = self.con.cursor()
            cursor.execute(query)
            rows = cursor.fetchall()
        except Exception as e:
            raise Exception(
                "Query execution failed; please check your connection params and query"
            ) from e

        # Build one Document per row; the row's columns are joined into a string.
        documents = []
        for row in rows:
            documents.append(Document(text=", ".join(str(value) for value in row)))
        return documents

load_data #

load_data(query: str) -> List[Document]

Read data from Hive.

Parameters

query (str, required): The query used to query data from Hive.

Returns

List[Document]: A list of documents.
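
The flow described for `load_data` (open a cursor, execute the query, fetch all rows, wrap each row in a document) can be tried without a Hive server. The sketch below assumes the Hive connection follows the Python DB-API 2.0 cursor interface and uses an in-memory `sqlite3` database as a stand-in; the `Document` dataclass, the `rows_to_documents` helper, and the `events` table are hypothetical and only model what the example needs.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Document:
    """Stand-in for the LlamaIndex Document; only the text field is modelled."""

    text: str


def rows_to_documents(connection: Any, query: str) -> List[Document]:
    """Run a query through a DB-API 2.0 connection and wrap each row."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        rows = cursor.fetchall()
    finally:
        cursor.close()
    # One document per row, with the row's columns joined into a single string.
    return [Document(text=", ".join(str(value) for value in row)) for row in rows]


# sqlite3 stands in for the Hive connection here (an assumption for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER, name TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?)", [(1, "login"), (2, "logout")])

docs = rows_to_documents(con, "SELECT id, name FROM events ORDER BY id")
print([d.text for d in docs])  # prints: ['1, login', '2, logout']
```

Joining the columns into one string is a design choice of this sketch; other serialisations (for example, `column: value` pairs) would work equally well as document text.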

def load_data(self, query: str) -> List[Document]:
    """
    Read data from Hive.

    Args:
        query (str): The query used to query data from Hive
    Returns:
        List[Document]: A list of documents.

    """
    try:
        # Keep a reference to the cursor itself; DB-API 2.0 does not
        # define the return value of execute().
        cursor = self.con.cursor()
        cursor.execute(query)
        rows = cursor.fetchall()
    except Exception as e:
        raise Exception(
            "Query execution failed; please check your connection params and query"
        ) from e

    # Build one Document per row; the row's columns are joined into a string.
    documents = []
    for row in rows:
        documents.append(Document(text=", ".join(str(value) for value in row)))
    return documents