Vision
Vision lets you pass images, PDFs, and other media to vision-capable models. Use the multiModal() helper to combine text prompts with one or more image sources.
ts
import {
  multiModal,
  imageUrl,
  imageFile,
  imageBuffer,
} from 'confused-ai';

Pass a remote image
ts
import { createAgent, multiModal, imageUrl } from 'confused-ai';

const agent = createAgent({
  name: 'vision-agent',
  instructions: 'Analyse the images provided by the user.',
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY!,
});

const result = await agent.run(
  multiModal(
    'What is in this image?',
    imageUrl('https://example.com/photo.jpg'),
  ),
);

Pass a local file
ts
import { multiModal, imageFile } from 'confused-ai';

// Reuses the agent created in the previous example.
// imageFile() is async: it loads and base64-encodes the file.
const result = await agent.run(
  multiModal(
    'Describe this chart.',
    await imageFile('./chart.png'),
  ),
);

Pass a buffer (canvas, upload, fetch response)
ts
import { multiModal, imageBuffer } from 'confused-ai';

// Download the image and keep the raw bytes.
const response = await fetch('https://example.com/diagram.png');
const buffer = await response.arrayBuffer();

// The MIME type is passed explicitly alongside the bytes.
const result = await agent.run(
  multiModal(
    'What does this architecture diagram show?',
    imageBuffer(buffer, 'image/png'),
  ),
);

Multiple images in one message
ts
// beforeUrl and afterUrl are image URL strings defined elsewhere.
const result = await agent.run(
  multiModal(
    'Compare these two screenshots and explain the differences.',
    imageUrl(beforeUrl),
    imageUrl(afterUrl),
  ),
);

Image detail level
Control quality vs. speed with the detail option:
ts
imageUrl('https://example.com/photo.jpg', { detail: 'high' })
imageUrl('https://example.com/thumbnail.jpg', { detail: 'low' })
// 'auto' (default): the model decides

ImageSource types
| Type | Factory | Description |
|---|---|---|
| ImageUrl | imageUrl(url, opts?) | HTTPS URL or data URI |
| ImageFile | await imageFile(path, opts?) | Local file, loaded at call time (Node.js only) |
| ImageBuffer | imageBuffer(data, mimeType, opts?) | Raw ArrayBuffer / Uint8Array |
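
The table notes that imageUrl() accepts a data URI as well as an HTTPS URL. As a rough illustration of what such a URI contains, the sketch below base64-encodes raw bytes and prefixes them with a MIME type. Both toBase64 and toDataUri are hypothetical helpers written for this example; they are not part of confused-ai, and the library's own encoding may differ.

```typescript
// Hypothetical helpers for illustration only; not part of confused-ai.
const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Standard base64 (RFC 4648) with '=' padding, written without runtime-specific APIs.
function toBase64(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;

    // Three input bytes become four 6-bit groups.
    out += BASE64_ALPHABET[b0 >> 2];
    out += BASE64_ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += hasSecond ? BASE64_ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : '=';
    out += hasThird ? BASE64_ALPHABET[b2 & 0x3f] : '=';
  }
  return out;
}

// Builds a "data:<mime>;base64,<payload>" URI from raw bytes.
function toDataUri(data: ArrayBuffer | Uint8Array, mimeType: string): string {
  const bytes = data instanceof Uint8Array ? data : new Uint8Array(data);
  return 'data:' + mimeType + ';base64,' + toBase64(bytes);
}
```

Assuming the signature shown in the table, the result could then be passed as imageUrl(toDataUri(bytes, 'image/png')). For raw bytes, imageBuffer() is the more direct route, since it takes the data and MIME type as they are.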
Supported formats
Images: jpg, jpeg, png, gif, webp, bmp, svg, tiff, heic
Audio: mp3, wav, ogg, m4a, flac, webm
Video: mp4, webm, mov, avi, mkv
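
It can be useful to check a file name against these lists before building a message. The following is a minimal sketch of such a check, with the extension lists copied from above. SUPPORTED_FORMATS and mediaKindsFor are hypothetical names introduced for this example and are not part of confused-ai. Because webm appears under both Audio and Video, the helper returns every matching kind rather than a single one.

```typescript
// Hypothetical helper for illustration only; not part of confused-ai.
type MediaKind = 'image' | 'audio' | 'video';

// Extension lists copied from the "Supported formats" section.
const SUPPORTED_FORMATS: Record<MediaKind, readonly string[]> = {
  image: ['jpg', 'jpeg', 'png', 'gif', 'webp', 'bmp', 'svg', 'tiff', 'heic'],
  audio: ['mp3', 'wav', 'ogg', 'm4a', 'flac', 'webm'],
  video: ['mp4', 'webm', 'mov', 'avi', 'mkv'],
};

const MEDIA_KINDS: MediaKind[] = ['image', 'audio', 'video'];

// Returns every media kind whose list contains the file's extension.
// An empty array means the extension is missing or not listed.
function mediaKindsFor(fileName: string): MediaKind[] {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0 || dot === fileName.length - 1) {
    return [];
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return MEDIA_KINDS.filter((kind) => SUPPORTED_FORMATS[kind].indexOf(ext) !== -1);
}
```

The check looks only at the extension, not at the file's contents, so it is best treated as a quick pre-flight guard rather than real validation.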