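Guidance Pydantic Program
Generate structured data with guidance via LlamaIndex.
With guidance, you can guarantee that the output structure is correct by **forcing** the LLM to output the desired tokens.
This is especially helpful when you are using lower-capacity models (e.g. the current open-source models), which would otherwise struggle to generate valid output that fits the desired output schema.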
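To make the idea of forcing structure concrete, here is a toy sketch. It is not how guidance is implemented (guidance works at the level of the prompt template and generation, which this toy does not reproduce); it only mimics the effect: the program writes every structural token (braces, keys, quotes) itself and asks the model for field values alone. `fill_template` and `fake_model` are hypothetical names introduced for illustration.

```python
import json
from typing import Callable, Dict


def fill_template(fields: Dict[str, type], generate: Callable[[str], str]) -> str:
    """Build a JSON object whose skeleton is fixed by the program.

    Only the field values come from `generate` (a stand-in for a model call).
    """
    result = {}
    for name, field_type in fields.items():
        raw = generate(name)
        if field_type is int:
            # Keep digits only, so the value always parses as an integer.
            digits = "".join(ch for ch in raw if ch.isdigit())
            result[name] = int(digits or "0")
        else:
            result[name] = raw.strip()
    # json.dumps writes the braces, keys, and quotes, so the structure cannot break.
    return json.dumps(result)


def fake_model(field_name: str) -> str:
    # Hypothetical model: returns messy free text for each field.
    canned = {
        "title": "  All Work and No Play ",
        "length_seconds": "about 180 seconds",
    }
    return canned[field_name]


text = fill_template({"title": str, "length_seconds": int}, fake_model)
parsed = json.loads(text)  # valid JSON regardless of what the model returned
```

The point of the design is that the model never gets a chance to emit a stray brace or a missing quote, because it never writes those characters.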
If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
In [ ]
%pip install llama-index-program-guidance
In [ ]
!pip install llama-index
In [ ]
from pydantic import BaseModel
from typing import List
from guidance.llms import OpenAI
from llama_index.program.guidance import GuidancePydanticProgram
Define the output schema
In [ ]
class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]
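As a quick check of what this schema accepts, the following minimal sketch validates plain dictionaries against the same models. The sample data is made up for illustration, and the behavior shown (numeric strings coerced to `int`, non-numeric strings rejected) assumes Pydantic's default, non-strict validation.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Nested dicts are converted into Song instances; "180" is coerced to the int 180.
album = Album(
    name="Demo",
    artist="Someone",
    songs=[{"title": "Track 1", "length_seconds": "180"}],
)

# A value that cannot be read as an integer fails validation.
try:
    Album(
        name="Demo",
        artist="Someone",
        songs=[{"title": "Bad", "length_seconds": "three minutes"}],
    )
    rejected = False
except ValidationError:
    rejected = True
```

This coercion is why a generated value such as `"180"` can still end up as an integer field on the final object.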
Define the guidance pydantic program
In [ ]
program = GuidancePydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Generate an example album, with an artist and a list of songs. Using"
        " the movie {{movie_name}} as inspiration"
    ),
    guidance_llm=OpenAI("text-davinci-003"),
    verbose=True,
)
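Run the program to get structured output.
Text highlighted in blue is the variables we specified; text highlighted in green is generated by the LLM.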
In [ ]
output = program(movie_name="The Shining")
Generate an example album, with an artist and a list of songs. Using the movie The Shining as inspiration
```json
{
  "name": "The Shining",
  "artist": "Jack Torrance",
  "songs": [{
    "title": "All Work and No Play",
    "length_seconds": "180",
  }, {
    "title": "The Overlook Hotel",
    "length_seconds": "240",
  }, {
    "title": "The Shining",
    "length_seconds": "210",
  }],
}
```
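Note that the raw text above is not strict JSON: it contains trailing commas, and the integer values are quoted. The sketch below shows one way such text could be turned into a Python dictionary using only the standard library. It is an illustration under stated assumptions, not the parsing code LlamaIndex uses; `extract_json_block` is a hypothetical helper, and the trailing-comma regex is naive (it would also alter a string value that happens to contain `, }`).

```python
import json
import re

FENCE = "`" * 3  # built this way so the example contains no literal fence

# Shortened copy of the raw output, with a single song.
raw_output = (
    "Generate an example album, with an artist and a list of songs.\n"
    + FENCE + "json\n"
    + '{ "name": "The Shining", "artist": "Jack Torrance", "songs": [{ '
    + '"title": "All Work and No Play", "length_seconds": "180", }], }\n'
    + FENCE
)


def extract_json_block(text: str) -> dict:
    """Return the contents of the last json-fenced block in `text` as a dict."""
    # Take everything between the last opening fence and the next closing fence.
    body = text.split(FENCE + "json")[-1].split(FENCE)[0]
    # Drop commas that directly precede a closing brace or bracket.
    body = re.sub(r",\s*([}\]])", r"\1", body)
    return json.loads(body)


data = extract_json_block(raw_output)
```

At this stage `length_seconds` is still a string; converting it to an integer is left to schema validation.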
The output is a valid Pydantic object that we can then use to call functions/APIs.
In [ ]
output
Out[ ]
Album(name='The Shining', artist='Jack Torrance', songs=[Song(title='All Work and No Play', length_seconds=180), Song(title='The Overlook Hotel', length_seconds=240), Song(title='The Shining', length_seconds=210)])
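Because the result is an ordinary typed object, downstream code can use attribute access instead of parsing strings. A minimal sketch follows, with the album rebuilt by hand so that it runs without calling an LLM; `total_runtime_seconds` is a hypothetical helper, not part of LlamaIndex.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    title: str
    length_seconds: int


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


def total_runtime_seconds(album: Album) -> int:
    """Sum the length of every song on the album."""
    return sum(song.length_seconds for song in album.songs)


# Same values as the output shown above, constructed manually.
output = Album(
    name="The Shining",
    artist="Jack Torrance",
    songs=[
        Song(title="All Work and No Play", length_seconds=180),
        Song(title="The Overlook Hotel", length_seconds=240),
        Song(title="The Shining", length_seconds=210),
    ],
)

runtime = total_runtime_seconds(output)  # 180 + 240 + 210 = 630
```

Since `length_seconds` is already an `int`, the helper needs no conversion or error handling of its own.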