Issue Description:
Currently, `pydantic-ai` implements structured output solely via the tool-calling APIs of model providers. While this works in most cases, certain schemas supported by `pydantic` are handled inconsistently across providers.
For instance, the following schema from the documentation does not work with Gemini models:
```python
from datetime import date
from typing import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


agent = Agent(
    'gemini-2.0-flash-exp',
    result_type=UserProfile,
)
agent.run_sync('Generate synthetic data')
```
This results in the following error:
```
UnexpectedModelBehavior: Unexpected response from gemini 400, body:
{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[dob].format: only 'enum' is supported for STRING type\n",
    "status": "INVALID_ARGUMENT"
  }
}
```
In this example, the inconsistency stems from the model provider's limitations. However, based on my experience working with tools like `instructor`, modern LLMs are increasingly proficient at following JSON-format prompts in their text responses. In fact, they often produce better-formed JSON in standard completion mode than in tool-calling mode. The Berkeley Function-Calling Leaderboard may provide further evidence of this trend.
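For context, here is a minimal sketch of how `instructor`'s MD_JSON mode is typically used (the exact API may vary between `instructor` versions, and the model name here is just an example):

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# MD_JSON mode instructs the model to emit JSON in a markdown code block
# and parses it from the raw text, bypassing the tool-calling API.
client = instructor.from_openai(OpenAI(), mode=instructor.Mode.MD_JSON)

user = client.chat.completions.create(
    model='gpt-4o-mini',  # example model; any chat model works
    response_model=User,
    messages=[{'role': 'user', 'content': 'Extract: John is 30 years old.'}],
)
print(user)  # name='John' age=30
```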
Feature Request
Would it be possible for `pydantic-ai` to implement an alternative mode akin to `instructor`'s `MD_JSON` mode? Such a mode would use prompt engineering to guide the LLM's output and parse the resulting JSON from the raw text response rather than relying on tool-calling APIs (a rough sketch of the idea follows the list below).
Such a feature would:
- Allow broader compatibility with any model capable of following JSON-schema prompts.
- Address model-specific inconsistencies while leveraging `pydantic`'s full schema flexibility.
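
To illustrate the idea, here is a rough, hypothetical sketch of what such a mode could do internally. None of this is `pydantic-ai`'s actual API; `build_prompt` and `parse_response` are made-up helper names. The approach: embed the `pydantic`-generated JSON schema in the prompt, then extract and validate JSON from the model's raw text response.

```python
import json
import re
from datetime import date
from typing import TypedDict

from pydantic import TypeAdapter


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


adapter = TypeAdapter(UserProfile)


def build_prompt(user_message: str) -> str:
    """Prepend the JSON schema so the model replies with matching JSON."""
    schema = json.dumps(adapter.json_schema(), indent=2)
    return (
        f'{user_message}\n\n'
        'Respond ONLY with a JSON object inside a ```json fenced block '
        f'that matches this JSON schema:\n{schema}'
    )


def parse_response(text: str) -> UserProfile:
    """Extract the fenced JSON block and validate it against the schema."""
    match = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', text, re.DOTALL)
    payload = match.group(1) if match else text  # fall back to raw text
    return adapter.validate_json(payload)


# Example: validating a model's raw-text reply (hypothetical response string)
reply = '```json\n{"name": "Ada", "dob": "1815-12-10", "bio": "Mathematician"}\n```'
print(parse_response(reply))
```

Because validation happens on the text side with `pydantic`, the schema never has to pass through a provider's tool-declaration validator, which is exactly where the Gemini `format: date` error above originates.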
Thank you for considering this suggestion!