Description
I'm not entirely sure whether this is a bug or a missing feature, but I think it would make sense for tools to be able to return multimodal types. I believe types such as DocumentUrl currently get serialized as JSON.
For example:
@agent.tool_plain
def special_document() -> DocumentUrl:
    '''Retrieve a research paper for analysis.'''
    return DocumentUrl(url='https://arxiv.org/pdf/2504.07136')
Full example (DocumentUrl returned as tool)
import httpx
from google.colab import userdata
from pydantic_ai import Agent, BinaryContent, DocumentUrl
from pydantic_ai.models.gemini import GeminiModel
from pydantic_ai.providers.google_gla import GoogleGLAProvider
model = GeminiModel(
    'gemini-2.5-pro-preview-03-25',
    provider=GoogleGLAProvider(api_key=userdata.get('GEMINI_API_KEY'))
)
agent = Agent(model)
documentUrl = DocumentUrl(url='https://arxiv.org/pdf/2504.07136')

@agent.tool_plain
def special_document() -> DocumentUrl:
    '''Retrieve a research paper for analysis.'''
    return documentUrl

result = await agent.run(
    [
        'I need to read a research paper. Please use the special_document tool to get the paper and tell me its title.'
    ]
)
print('Agent response:')
print(result.output)
Agent response:
Okay, I have retrieved the research paper using the special_document tool.
However, the tool only provided a URL to the paper's PDF file: https://arxiv.org/pdf/2504.07136
It did not return the content or the title of the paper itself. Therefore, I cannot tell you the title based on the information provided by the tool. You can access the paper at the URL above to read it and find its title.
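The response above is what I would expect if the tool's return value reaches the model as JSON text rather than as a document part. Here is a minimal sketch of that suspected behavior. It uses only the standard library: `DocumentUrlStandIn` and `serialize_tool_return` are hypothetical names made up for illustration, not pydantic_ai's actual classes or internals.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DocumentUrlStandIn:
    """Hypothetical stand-in for pydantic_ai.DocumentUrl (not the real class)."""
    url: str


def serialize_tool_return(value: object) -> str:
    """Assumed behavior: strings pass through, anything else is dumped to JSON text."""
    if isinstance(value, str):
        return value
    return json.dumps(asdict(value))


tool_return = serialize_tool_return(
    DocumentUrlStandIn(url='https://arxiv.org/pdf/2504.07136')
)
print(tool_return)  # {"url": "https://arxiv.org/pdf/2504.07136"}
```

If this assumption holds, the model only ever sees the URL as a string inside the tool result, which would explain why it cannot read the paper's title.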
Full example (DocumentUrl passed in agent.run())
import httpx
from google.colab import userdata
from pydantic_ai import Agent, BinaryContent, DocumentUrl
from pydantic_ai.models.gemini import GeminiModel
from pydantic_ai.providers.google_gla import GoogleGLAProvider
model = GeminiModel(
    'gemini-2.5-pro-preview-03-25',
    provider=GoogleGLAProvider(api_key=userdata.get('GEMINI_API_KEY'))
)
agent = Agent(model)
documentUrl = DocumentUrl(url='https://arxiv.org/pdf/2504.07136')

result = await agent.run(
    [
        'I need to read a research paper. Please use the special_document tool to get the paper and tell me its title.',
        documentUrl  # Directly pass in documentUrl.
    ]
)
print('Agent response:')
print(result.output)
Agent response:
Okay, I have accessed the research paper using the special_document tool.
The title of the paper is: The spectrum of magnetized turbulence in the interstellar medium