File Upload

File upload lets agents process documents, images, code files, and data formats alongside a text query. With files attached, an agent can analyze visual content, extract information or structured data from documents, review code, and answer questions that combine file contents with natural-language instructions.

Supported File Types

Agents can process a wide variety of file formats:

Category	Supported Formats	Use Cases
Documents	PDF, TXT, DOCX, MD, RTF	Reports, articles, documentation
Images	PNG, JPG, JPEG, GIF, WEBP	Photos, diagrams, screenshots, charts
Audio	MP3, WAV, M4A, OGG	Speech transcription, audio analysis
Video	MP4, MOV, AVI, WEBM	Video analysis, frame extraction

Supported file types vary by model. Check your model provider's documentation for specific file type support and limitations.
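
Because support varies by model, it can help to check a file's extension against the table above before sending it. The following is a minimal sketch using only the Python standard library; `SUPPORTED_FORMATS` and `file_category` are hypothetical names introduced here for illustration and are not part of the hypertic API. It only mirrors the table, so it says nothing about what a particular model actually accepts.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extension-to-category map transcribed from the table above (assumption:
# the table is the only source; a given model may accept fewer formats).
SUPPORTED_FORMATS = {
    "document": {"pdf", "txt", "docx", "md", "rtf"},
    "image": {"png", "jpg", "jpeg", "gif", "webp"},
    "audio": {"mp3", "wav", "m4a", "ogg"},
    "video": {"mp4", "mov", "avi", "webm"},
}


def file_category(path_or_url: str) -> Optional[str]:
    """Return the table category for a local path or URL, or None if unlisted."""
    # For a URL, .path drops the host and query string; a plain relative
    # path passes through unchanged.
    path = urlparse(path_or_url).path
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if ext in extensions:
            return category
    return None
```

A caller could, for example, skip or warn about any entry for which `file_category` returns `None` before building the `files` list.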

Non-Streaming

Upload and process files using the files parameter with run() or arun(). Files can be local paths or URLs:

from hypertic.agents import Agent
from hypertic.models import OpenAIChat

model = OpenAIChat(model="gpt-5.2")

agent = Agent(
    model=model
)

# Non-streaming with files
response = agent.run(
    query="What's in this image and the document?",
    files=[
        "https://yavuzceliker.github.io/sample-images/image-1021.jpg",
        "data/index.pdf"
    ]
)
print(f"Response: {response.content}")
print(f"Metadata: {response.metadata}")
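
Since the files list can mix local paths and URLs, a quick pre-flight check for missing local files can surface path mistakes before the request is made. This is a sketch under stated assumptions: `missing_local_files` is a hypothetical helper, not a hypertic function, and it treats only `http` and `https` entries as remote.

```python
from pathlib import Path
from typing import List
from urllib.parse import urlparse


def missing_local_files(files: List[str]) -> List[str]:
    """Return the local paths in `files` that do not exist on disk."""
    missing = []
    for entry in files:
        # Assumption: anything with an http(s) scheme is a remote file
        # and is not checked here.
        if urlparse(entry).scheme in ("http", "https"):
            continue
        if not Path(entry).is_file():
            missing.append(entry)
    return missing
```

Calling it on the same list that will be passed as `files` and stopping early when the result is non-empty keeps the error close to its cause.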

Streaming

Stream responses when processing files using stream() or astream():

from hypertic.agents import Agent
from hypertic.models import OpenAIChat

model = OpenAIChat(model="gpt-5.2")

agent = Agent(
    model=model
)

# Streaming with files
for event in agent.stream(
    query="What is in the image and the document?",
    files=["data/image.jpg", "https://www.berkshirehathaway.com/letters/2024ltr.pdf"]
):
    if event.type == "content":
        print(event.content, end="", flush=True)
    elif event.type == "tool_calls":
        print(f"\nTool Calls: {event.tool_calls}")
    elif event.type == "tool_outputs":
        print(f"\nTool Outputs: {event.tool_outputs}")
    elif event.type == "metadata":
        print(f"\nMetadata: {event.metadata}")
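
When the full text is needed after streaming finishes, the content events can be accumulated while they arrive. The sketch below assumes only what the example above shows, namely that each event exposes a `type` attribute and that content events carry a `content` string. `collect_content` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real stream events so the snippet runs without a model call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_content(events: Iterable[Any]) -> str:
    """Join the text of all content events, ignoring other event types."""
    parts = []
    for event in events:
        if event.type == "content":
            parts.append(event.content)
    return "".join(parts)


# Stand-in events shaped like those in the streaming example (assumption).
fake_events = [
    SimpleNamespace(type="content", content="The image shows "),
    SimpleNamespace(type="metadata", metadata={}),
    SimpleNamespace(type="content", content="a chart."),
]

print(collect_content(fake_events))  # The image shows a chart.
```

In real use, the iterable returned by `agent.stream(...)` would take the place of `fake_events`.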
