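Function Calling Program for Structured Extraction¶

This guide shows you how to use our FunctionCallingProgram for structured data extraction. Given a function-calling LLM and an output Pydantic class, it generates a structured Pydantic object. We use three different function-calling LLMs: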

OpenAI
Anthropic Claude
Mistral

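For the target object, you can either specify output_cls directly, or specify a PydanticOutputParser or any other BaseOutputParser that produces a Pydantic object.

In the examples below, we show you different ways of extracting into an Album object (which can contain a list of Song objects).

NOTE: FunctionCallingProgram only works with LLMs that natively support function calling. It works by inserting the schema of the Pydantic object as the "tool parameters" of a tool. For all other LLMs, use our LLMTextCompletionProgram, which prompts the model directly through text to get a structured output.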
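To make the "tool parameters" idea concrete, the sketch below prints the JSON schema that Pydantic derives from a model and wraps it in a tool definition. It assumes Pydantic v2 (`model_json_schema`). The `tool` dictionary follows the general shape used by OpenAI-style tool calling and is only an illustration, not the exact payload that FunctionCallingProgram sends.

```python
import json
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]


# Pydantic v2 derives a JSON schema from the class definition.
schema = Album.model_json_schema()

# Illustrative tool definition: the schema becomes the tool's parameters.
# The exact payload differs per provider; this shape is an assumption.
tool = {
    "type": "function",
    "function": {
        "name": Album.__name__,
        "description": Album.__doc__,
        "parameters": schema,
    },
}

print(json.dumps(tool, indent=2))
```

The nested Song model shows up under the schema's `$defs` key, which is how a list of structured sub-objects can be described to the model.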

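Define `Album` Class¶

This is a simple example of parsing an output into an Album schema, which can contain multiple songs.

Just pass Album as the output_cls argument when initializing the FunctionCallingProgram.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.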

!pip install llama-index
!pip install llama-index-llms-openai llama-index-llms-anthropic llama-index-llms-mistralai

from pydantic import BaseModel
from typing import List

from llama_index.core.program import FunctionCallingProgram

Define output schema

class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int

class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]
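Because the schema is a Pydantic model, anything the LLM returns is validated against it. The following is a minimal sketch, assuming Pydantic v2, of what that buys you: compatible values are coerced and malformed ones raise a ValidationError. The sample payloads are made up for illustration.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# A numeric string is coerced to int in Pydantic's default (lax) mode.
song = Song.model_validate({"title": "Main Title", "length_seconds": "180"})
print(song.length_seconds)

# A value that cannot be read as an integer is rejected.
try:
    Album.model_validate(
        {
            "name": "Demo",
            "artist": "Nobody",
            "songs": [{"title": "Broken", "length_seconds": "three minutes"}],
        }
    )
    error_count = 0
except ValidationError as exc:
    error_count = exc.error_count()
    print(exc)
```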

Define Function Calling Program¶

We define a function calling program with each of three function-calling LLMs:

OpenAI
Anthropic
Mistral

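Function Calling Program with OpenAI¶

Here we use gpt-3.5-turbo.

We demonstrate both "single" function calling and parallel function calling for structured data extraction. Parallel function calling lets us extract multiple objects in one run.

Function Calling (Single Object)¶
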
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.openai import OpenAI

prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""
llm = OpenAI(model="gpt-3.5-turbo")

program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly; otherwise the default LLM is used
    verbose=True,
)

Run the program to get structured output.

output = program(movie_name="The Shining")

=== Calling Function ===
Calling function: Album with args: {"name": "The Shining Soundtrack", "artist": "Various Artists", "songs": [{"title": "Main Title", "length_seconds": 180}, {"title": "Rocky Mountains", "length_seconds": 240}, {"title": "Lullaby", "length_seconds": 200}, {"title": "The Overlook Hotel", "length_seconds": 220}, {"title": "Grady's Story", "length_seconds": 180}, {"title": "The Maze", "length_seconds": 210}]}
=== Function Output ===
name='The Shining Soundtrack' artist='Various Artists' songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)]
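The verbose log shows the two halves of a function call: the JSON arguments the LLM produced and the object built from them. The sketch below reproduces the second half by hand with a shortened copy of the arguments above, assuming Pydantic v2. It illustrates the idea and is not a copy of the library's internals.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Shortened copy of the arguments from the log above.
raw_args = (
    '{"name": "The Shining Soundtrack", "artist": "Various Artists", '
    '"songs": [{"title": "Main Title", "length_seconds": 180}, '
    '{"title": "Rocky Mountains", "length_seconds": 240}]}'
)

# Parse the JSON string and validate it against the Album schema in one step.
album = Album.model_validate_json(raw_args)
print(album.songs[0])
```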

The output is a valid Pydantic object that we can then use to call functions/APIs.

output

Out [ ]

Album(name='The Shining Soundtrack', artist='Various Artists', songs=[Song(title='Main Title', length_seconds=180), Song(title='Rocky Mountains', length_seconds=240), Song(title='Lullaby', length_seconds=200), Song(title='The Overlook Hotel', length_seconds=220), Song(title="Grady's Story", length_seconds=180), Song(title='The Maze', length_seconds=210)])
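As a small illustration of using the result downstream, the sketch below rebuilds the Album shown above by hand (so that it runs without an API key), computes the total running time, and converts the object to a plain dictionary that could be sent to an API. It assumes Pydantic v2 (`model_dump`).

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Values copied from the output above.
album = Album(
    name="The Shining Soundtrack",
    artist="Various Artists",
    songs=[
        Song(title="Main Title", length_seconds=180),
        Song(title="Rocky Mountains", length_seconds=240),
        Song(title="Lullaby", length_seconds=200),
        Song(title="The Overlook Hotel", length_seconds=220),
        Song(title="Grady's Story", length_seconds=180),
        Song(title="The Maze", length_seconds=210),
    ],
)

# Typed attribute access makes aggregate computations straightforward.
total_seconds = sum(song.length_seconds for song in album.songs)
minutes, seconds = divmod(total_seconds, 60)
print(f"{album.name}: {minutes}m {seconds:02d}s")

# A plain dict is convenient for JSON APIs or database inserts.
payload = album.model_dump()
```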

Function Calling (Parallel Function Calling, Multiple Objects)¶

prompt_template_str = """\
Generate example albums, with an artist and a list of songs, using each movie below as inspiration. \

Here are the movies:
{movie_names}
"""
llm = OpenAI(model="gpt-3.5-turbo")

program = FunctionCallingProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,  # pass the LLM explicitly; otherwise the default LLM is used
    verbose=True,
    allow_parallel_tool_calls=True,  # one Album tool call per movie
)
output = program(movie_names="The Shining, The Blair Witch Project, Saw")

=== Calling Function ===
Calling function: Album with args: {"name": "The Shining", "artist": "Various Artists", "songs": [{"title": "Main Theme", "length_seconds": 180}, {"title": "The Overlook Hotel", "length_seconds": 240}, {"title": "Redrum", "length_seconds": 200}]}
=== Function Output ===
name='The Shining' artist='Various Artists' songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]
=== Calling Function ===
Calling function: Album with args: {"name": "The Blair Witch Project", "artist": "Soundtrack Ensemble", "songs": [{"title": "Into the Woods", "length_seconds": 210}, {"title": "The Rustling Leaves", "length_seconds": 180}, {"title": "The Witch's Curse", "length_seconds": 240}]}
=== Function Output ===
name='The Blair Witch Project' artist='Soundtrack Ensemble' songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title="The Witch's Curse", length_seconds=240)]
=== Calling Function ===
Calling function: Album with args: {"name": "Saw", "artist": "Horror Soundscapes", "songs": [{"title": "The Reverse Bear Trap", "length_seconds": 220}, {"title": "Jigsaw's Game", "length_seconds": 260}, {"title": "Bathroom Escape", "length_seconds": 180}]}
=== Function Output ===
name='Saw' artist='Horror Soundscapes' songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title="Jigsaw's Game", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)]

output

Out [ ]

[Album(name='The Shining', artist='Various Artists', songs=[Song(title='Main Theme', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='Redrum', length_seconds=200)]),
 Album(name='The Blair Witch Project', artist='Soundtrack Ensemble', songs=[Song(title='Into the Woods', length_seconds=210), Song(title='The Rustling Leaves', length_seconds=180), Song(title="The Witch's Curse", length_seconds=240)]),
 Album(name='Saw', artist='Horror Soundscapes', songs=[Song(title='The Reverse Bear Trap', length_seconds=220), Song(title="Jigsaw's Game", length_seconds=260), Song(title='Bathroom Escape', length_seconds=180)])]
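In the runs above, a single tool call returned one Album and parallel tool calls returned a list. If downstream code should not care which case it received, a small normalizing helper is one option. `as_list` is a hypothetical helper written for this sketch and is not part of LlamaIndex.

```python
from typing import List, Union

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


def as_list(result: Union[BaseModel, List[BaseModel]]) -> List[BaseModel]:
    """Wrap a single object in a list; return lists unchanged."""
    return result if isinstance(result, list) else [result]


# Hand-built stand-ins for the two kinds of program output.
single = Album(
    name="Saw",
    artist="Horror Soundscapes",
    songs=[Song(title="Bathroom Escape", length_seconds=180)],
)
many = [single, Album(name="The Shining", artist="Various Artists", songs=[])]

for album in as_list(single) + as_list(many):
    print(album.name, len(album.songs))
```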

Function Calling Program with Anthropic¶

Here we use Claude Sonnet (all three Claude 3 models support function calling).

from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.anthropic import Anthropic

prompt_template_str = "Generate a song about {topic}."
llm = Anthropic(model="claude-3-sonnet-20240229")

program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

output = program(topic="harry potter")

=== Calling Function ===
Calling function: Song with args: {"title": "The Boy Who Lived", "length_seconds": 180}
=== Function Output ===
title='The Boy Who Lived' length_seconds=180

output

Out [ ]

Song(title='The Boy Who Lived', length_seconds=180)

Function Calling Program with Mistral¶

Here we use mistral-large.

from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.mistralai import MistralAI

# prompt_template_str = """\
# Generate an example album, with an artist and a list of songs. \
# Use the broadway show {broadway_show} as inspiration. \
# Make sure to use the tool.
# """
prompt_template_str = "Generate a song about {topic}."
llm = MistralAI(model="mistral-large-latest")
program = FunctionCallingProgram.from_defaults(
    output_cls=Song,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

output = program(topic="the broadway show Wicked")

=== Calling Function ===
Calling function: Song with args: {"title": "Defying Gravity", "length_seconds": 240}
=== Function Output ===
title='Defying Gravity' length_seconds=240

output

Out [ ]

Song(title='Defying Gravity', length_seconds=240)

Instead of output_cls, you can also specify an output parser. Here we pass a PydanticOutputParser to LLMTextCompletionProgram, which prompts the model through text.

from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.core.program import LLMTextCompletionProgram

# Redefine the album prompt: the previous template only has a {topic} variable.
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""

program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(output_cls=Album),
    prompt_template_str=prompt_template_str,
    verbose=True,
)

output = program(movie_name="Lord of the Rings")
output

Out [ ]

Album(name='The Fellowship of the Ring', artist='Middle-earth Ensemble', songs=[Song(title='The Shire', length_seconds=240), Song(title='Concerning Hobbits', length_seconds=180), Song(title='The Ring Goes South', length_seconds=300), Song(title='A Knife in the Dark', length_seconds=270), Song(title='Flight to the Ford', length_seconds=210), Song(title='Many Meetings', length_seconds=240), Song(title='The Council of Elrond', length_seconds=330), Song(title='The Great Eye', length_seconds=180), Song(title='The Breaking of the Fellowship', length_seconds=360)])
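For models without native function calling, the text-based route described at the top of this guide boils down to two steps: tell the model which JSON shape to produce, then parse its reply. The following is a minimal sketch of that idea, assuming Pydantic v2. `build_prompt`, `parse_reply`, and the hard-coded `fake_reply` are hypothetical stand-ins. LLMTextCompletionProgram and PydanticOutputParser handle this internally, and their actual prompt wording differs.

```python
import json
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


def build_prompt(task: str) -> str:
    """Append the target JSON schema to the task description."""
    schema = json.dumps(Album.model_json_schema(), indent=2)
    return f"{task}\n\nRespond only with JSON matching this schema:\n{schema}"


def parse_reply(reply: str) -> Album:
    """Extract the outermost JSON object from the reply and validate it."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return Album.model_validate_json(reply[start : end + 1])


# Stand-in for a model reply; a real call to an LLM would go here.
fake_reply = (
    'Sure! {"name": "The Fellowship of the Ring", '
    '"artist": "Middle-earth Ensemble", '
    '"songs": [{"title": "The Shire", "length_seconds": 240}]}'
)

prompt = build_prompt("Generate an example album inspired by Lord of the Rings.")
album = parse_reply(fake_reply)
print(album)
```

Slicing from the first `{` to the last `}` tolerates chatty replies that wrap the JSON in extra text, at the cost of failing when the reply contains stray braces outside the object.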
