MultiModalLLMMetadata #

Bases: BaseModel

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| context_window | int \| None | Total number of tokens the model can be input when generating a response. | 3900 |
| num_output | int \| None | Number of tokens the model can output when generating a response. | 256 |
| num_input_files | int \| None | Number of input files the model can take when generating a response. | 10 |
| is_function_calling_model | bool \| None | Set True if the model supports function calling messages, similar to OpenAI's function calling API. For example, converting "Email Anya to see if she wants to get coffee next Friday" to a function call like `send_email(to: string, body: string)`. | False |
| model_name | str | The model's name used for logging, testing, and sanity checking. For some models this can be automatically discerned. For other models, like locally loaded models, it must be manually specified. | 'unknown' |
| is_chat_model | bool | Set True if the model exposes a chat interface (i.e. can be passed a sequence of messages, rather than text), like OpenAI's /v1/chat/completions endpoint. | False |

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py

class MultiModalLLMMetadata(BaseModel):
    model_config = ConfigDict(protected_namespaces=("pydantic_model_",))
    context_window: Optional[int] = Field(
        default=DEFAULT_CONTEXT_WINDOW,
        description=(
            "Total number of tokens the model can be input when generating a response."
        ),
    )
    num_output: Optional[int] = Field(
        default=DEFAULT_NUM_OUTPUTS,
        description="Number of tokens the model can output when generating a response.",
    )
    num_input_files: Optional[int] = Field(
        default=DEFAULT_NUM_INPUT_FILES,
        description="Number of input files the model can take when generating a response.",
    )
    is_function_calling_model: Optional[bool] = Field(
        default=False,
        # SEE: https://openai.com/blog/function-calling-and-other-api-updates
        description=(
            "Set True if the model supports function calling messages, similar to"
            " OpenAI's function calling API. For example, converting 'Email Anya to"
            " see if she wants to get coffee next Friday' to a function call like"
            " `send_email(to: string, body: string)`."
        ),
    )
    model_name: str = Field(
        default="unknown",
        description=(
            "The model's name used for logging, testing, and sanity checking. For some"
            " models this can be automatically discerned. For other models, like"
            " locally loaded models, this must be manually specified."
        ),
    )

    is_chat_model: bool = Field(
        default=False,
        description=(
            "Set True if the model exposes a chat interface (i.e. can be passed a"
            " sequence of messages, rather than text), like OpenAI's"
            " /v1/chat/completions endpoint."
        ),
    )
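
For illustration only, the metadata can be constructed directly when describing a custom or locally loaded model. All values below are arbitrary placeholders rather than recommended settings, and the import path assumes the source file listed above.

```python
# Illustrative placeholder values for a hypothetical locally loaded model.
from llama_index.core.multi_modal_llms.base import MultiModalLLMMetadata

metadata = MultiModalLLMMetadata(
    context_window=4096,        # total input token budget
    num_output=512,             # max tokens generated per response
    num_input_files=4,          # max images accepted per request
    is_function_calling_model=False,
    model_name="my-local-vlm",  # hypothetical name; local models must set this
    is_chat_model=True,
)
print(metadata.context_window, metadata.model_name)
```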

MultiModalLLM #

Bases: ChainableMixin, BaseComponent, DispatcherSpanMixin

Multi-Modal LLM interface.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| callback_manager | CallbackManager | Callback manager that handles callbacks for events within LlamaIndex. | <dynamic> |

The callback manager provides a way to call handlers on event starts/ends. It also traces the current stack of events using a few key attributes:

- trace_stack - the current stack of events that have not ended yet. When an event ends, it is removed from the stack. Since this is a contextvar, it is unique to each thread/task.
- trace_map - a mapping of event ids to their child events. On the start of an event, the bottom of the trace stack is used as the current parent event for the trace map.
- trace_id - a simple name for the current trace, usually denoting the entrypoint (query, index construction, insert, etc.).

Args: handlers (List[BaseCallbackHandler]): list of handlers to use.

Usage:

    with callback_manager.event(CBEventType.QUERY) as event:
        event.on_start(payload={key, val})
        ...
        event.on_end(payload={key, val})

Source code in llama-index-core/llama_index/core/multi_modal_llms/base.py
class MultiModalLLM(ChainableMixin, BaseComponent, DispatcherSpanMixin):
    """Multi-Modal LLM interface."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    callback_manager: CallbackManager = Field(
        default_factory=CallbackManager, exclude=True
    )

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # Help static checkers understand this class hierarchy
        super().__init__(*args, **kwargs)

    @property
    @abstractmethod
    def metadata(self) -> MultiModalLLMMetadata:
        """Multi-Modal LLM metadata."""

    @abstractmethod
    def complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseGen:
        """Streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    def chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    def stream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseGen:
        """Stream chat endpoint for Multi-Modal LLM."""

    # ===== Async Endpoints =====

    @abstractmethod
    async def acomplete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        """Async completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseAsyncGen:
        """Async streaming completion endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def achat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponse:
        """Async chat endpoint for Multi-Modal LLM."""

    @abstractmethod
    async def astream_chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,
    ) -> ChatResponseAsyncGen:
        """Async streaming chat endpoint for Multi-Modal LLM."""

    def _as_query_component(self, **kwargs: Any) -> QueryComponent:
        """Return query component."""
        if self.metadata.is_chat_model:
            # TODO: we don't have a separate chat component
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)
        else:
            return MultiModalCompleteComponent(multi_modal_llm=self, **kwargs)

    def __init_subclass__(cls, **kwargs: Any) -> None:
        """
        The callback decorators installs events, so they must be applied before
        the span decorators, otherwise the spans wouldn't contain the events.
        """
        for attr in (
            "complete",
            "acomplete",
            "stream_complete",
            "astream_complete",
            "chat",
            "achat",
            "stream_chat",
            "astream_chat",
        ):
            if callable(method := cls.__dict__.get(attr)):
                if attr.endswith("chat"):
                    setattr(cls, attr, llm_chat_callback()(method))
                else:
                    setattr(cls, attr, llm_completion_callback()(method))
        super().__init_subclass__(**kwargs)
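
Every endpoint below is abstract, so a concrete implementation must override all of them before the class can be instantiated. The following is a minimal sketch, not part of the library: `EchoMultiModalLLM` is a hypothetical toy model that echoes the prompt, and the import paths assume the usual `llama_index.core` layout matching the source file above.

```python
from typing import Any, List, Sequence

from llama_index.core.base.llms.types import (
    ChatMessage,
    ChatResponse,
    ChatResponseAsyncGen,
    ChatResponseGen,
    CompletionResponse,
    CompletionResponseAsyncGen,
    CompletionResponseGen,
)
from llama_index.core.multi_modal_llms.base import MultiModalLLM, MultiModalLLMMetadata
from llama_index.core.schema import ImageNode


class EchoMultiModalLLM(MultiModalLLM):
    """Toy model: echoes the prompt and reports how many images it received."""

    @property
    def metadata(self) -> MultiModalLLMMetadata:
        return MultiModalLLMMetadata(model_name="echo", is_chat_model=False)

    def complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        return CompletionResponse(text=f"{prompt} ({len(image_documents)} images)")

    def stream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseGen:
        def gen() -> CompletionResponseGen:
            # A real backend would yield incrementally growing responses.
            yield self.complete(prompt, image_documents, **kwargs)

        return gen()

    async def acomplete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponse:
        return self.complete(prompt, image_documents, **kwargs)

    # The remaining endpoints are stubbed out for brevity; a real integration
    # would implement them against its backend API.
    def chat(self, messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse:
        raise NotImplementedError

    def stream_chat(
        self, messages: Sequence[ChatMessage], **kwargs: Any
    ) -> ChatResponseGen:
        raise NotImplementedError

    async def astream_complete(
        self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
    ) -> CompletionResponseAsyncGen:
        raise NotImplementedError

    async def achat(
        self, messages: Sequence[ChatMessage], **kwargs: Any
    ) -> ChatResponse:
        raise NotImplementedError

    async def astream_chat(
        self, messages: Sequence[ChatMessage], **kwargs: Any
    ) -> ChatResponseAsyncGen:
        raise NotImplementedError
```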

metadata abstractmethod property #

Multi-Modal LLM metadata.

metadata: MultiModalLLMMetadata

complete abstractmethod #

Completion endpoint for Multi-Modal LLM.

complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

@abstractmethod
def complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Completion endpoint for Multi-Modal LLM."""

stream_complete abstractmethod #

Streaming completion endpoint for Multi-Modal LLM.

stream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseGen

@abstractmethod
def stream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseGen:
    """Streaming completion endpoint for Multi-Modal LLM."""

chat abstractmethod #

Chat endpoint for Multi-Modal LLM.

chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

@abstractmethod
def chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Chat endpoint for Multi-Modal LLM."""

stream_chat abstractmethod #

Streaming chat endpoint for Multi-Modal LLM.

stream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseGen

@abstractmethod
def stream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseGen:
    """Stream chat endpoint for Multi-Modal LLM."""

acomplete abstractmethod async #

Async completion endpoint for Multi-Modal LLM.

acomplete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponse

@abstractmethod
async def acomplete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponse:
    """Async completion endpoint for Multi-Modal LLM."""

astream_complete abstractmethod async #

Async streaming completion endpoint for Multi-Modal LLM.

astream_complete(prompt: str, image_documents: List[ImageNode], **kwargs: Any) -> CompletionResponseAsyncGen

@abstractmethod
async def astream_complete(
    self, prompt: str, image_documents: List[ImageNode], **kwargs: Any
) -> CompletionResponseAsyncGen:
    """Async streaming completion endpoint for Multi-Modal LLM."""

achat abstractmethod async #

Async chat endpoint for Multi-Modal LLM.

achat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponse

@abstractmethod
async def achat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponse:
    """Async chat endpoint for Multi-Modal LLM."""

astream_chat abstractmethod async #

Async streaming chat endpoint for Multi-Modal LLM.

astream_chat(messages: Sequence[ChatMessage], **kwargs: Any) -> ChatResponseAsyncGen

@abstractmethod
async def astream_chat(
    self,
    messages: Sequence[ChatMessage],
    **kwargs: Any,
) -> ChatResponseAsyncGen:
    """Async streaming chat endpoint for Multi-Modal LLM."""

BaseMultiModalComponent #

Bases: QueryComponent

Base LLM component.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| multi_modal_llm | MultiModalLLM | LLM | required |
| streaming | bool | Streaming mode | False |

class BaseMultiModalComponent(QueryComponent):
    """Base LLM component."""

    model_config = ConfigDict(arbitrary_types_allowed=True)
    multi_modal_llm: MultiModalLLM = Field(..., description="LLM")
    streaming: bool = Field(default=False, description="Streaming mode")

    def set_callback_manager(self, callback_manager: Any) -> None:
        """Set callback manager."""

set_callback_manager #

Set callback manager.

set_callback_manager(callback_manager: Any) -> None

def set_callback_manager(self, callback_manager: Any) -> None:
    """Set callback manager."""

MultiModalCompleteComponent #

Bases: BaseMultiModalComponent

Multi-modal completion component.

class MultiModalCompleteComponent(BaseMultiModalComponent):
    """Multi-modal completion component."""

    def _validate_component_inputs(self, input: Dict[str, Any]) -> Dict[str, Any]:
        """Validate component inputs during run_component."""
        if "prompt" not in input:
            raise ValueError("Prompt must be in input dict.")

        # do special check to see if prompt is a list of chat messages
        if isinstance(input["prompt"], get_args(List[ChatMessage])):
            raise NotImplementedError(
                "Chat messages not yet supported as input to multi-modal model."
            )
        else:
            input["prompt"] = validate_and_convert_stringable(input["prompt"])

        # make sure image documents are valid
        if "image_documents" in input:
            if not isinstance(input["image_documents"], list):
                raise ValueError("image_documents must be a list.")
            for doc in input["image_documents"]:
                if not isinstance(doc, (ImageDocument, ImageNode)):
                    raise ValueError(
                        "image_documents must be a list of ImageNode objects."
                    )

        return input

    def _run_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = self.multi_modal_llm.stream_complete(prompt, image_documents)
        else:
            response = self.multi_modal_llm.complete(prompt, image_documents)
        return {"output": response}

    async def _arun_component(self, **kwargs: Any) -> Any:
        """Run component."""
        # TODO: support only complete for now
        # non-trivial to figure how to support chat/complete/etc.
        prompt = kwargs["prompt"]
        image_documents = kwargs.get("image_documents", [])

        response: Any
        if self.streaming:
            response = await self.multi_modal_llm.astream_complete(
                prompt, image_documents
            )
        else:
            response = await self.multi_modal_llm.acomplete(prompt, image_documents)
        return {"output": response}

    @property
    def input_keys(self) -> InputKeys:
        """Input keys."""
        # TODO: support only complete for now
        return InputKeys.from_keys({"prompt", "image_documents"})

    @property
    def output_keys(self) -> OutputKeys:
        """Output keys."""
        return OutputKeys.from_keys({"output"})

input_keys property #

Input keys.

input_keys: InputKeys

output_keys property #

Output keys.

output_keys: OutputKeys
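
As a usage sketch (assuming the query-pipeline API), a MultiModalCompleteComponent is typically obtained from a MultiModalLLM via as_query_component() rather than constructed directly, and can be run on its own; `mm_llm` and the image path are placeholders.

```python
from llama_index.core.schema import ImageNode

component = mm_llm.as_query_component()  # returns a MultiModalCompleteComponent
result = component.run_component(
    prompt="Describe this image.",
    image_documents=[ImageNode(image_path="photo.jpg")],
)
print(result["output"])  # CompletionResponse (or a generator when streaming=True)
```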
