MetaVision API Documentation
Introduction
MetaVision is an advanced, high-performance Multi-Modal AI analysis engine.
Unlike traditional computer vision APIs that return static labels, MetaVision allows you to define flexible Analysis Categories or use Custom Prompts to extract structured intelligence from media (Images, Videos, and Audio).
Authentication
All API requests require authentication using a Bearer token. Include your API key in the Authorization header of every request:
Authorization: Bearer YOUR_API_KEY
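As a minimal sketch of the header format above, the helper below builds the headers for an authenticated JSON request. The function name `build_headers` is hypothetical and not part of MetaVision.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return headers for an authenticated JSON request."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer scheme, as described in the Authentication section.
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")
```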
Core Analysis
The primary endpoint handles image, video, and audio analysis.
POST /api/v1/analyze
Request Structure
Send a JSON body with the media content and configuration.
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| media | string | The content to analyze. Must be a valid HTTP URL or a Base64-encoded string (image, video, or audio/mp3). |
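The two accepted forms of media could be prepared as in the sketch below. It passes URLs through unchanged and Base64-encodes local files. The helper name `media_from_source` is hypothetical, and the sketch assumes the API expects a raw Base64 string without a data-URI prefix, which this documentation does not state.

```python
import base64
from pathlib import Path


def media_from_source(source: str) -> str:
    """Return a value for the `media` field.

    URLs are returned unchanged; anything else is treated as a local
    file path and Base64-encoded (raw Base64 is an assumption).
    """
    if source.startswith(("http://", "https://")):
        return source
    data = Path(source).read_bytes()
    return base64.b64encode(data).decode("ascii")


media = media_from_source("https://example.com/image.jpg")
```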
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| categories | array[string] | (All) | List of Category IDs to execute (e.g., ["title", "objects"]). |
| custom_prompt | string | null | Provide a custom analysis instruction (max 2000 chars). The result will appear under the key custom_analysis. |
| model | string | null | Force the use of a specific Agent. Must use the Model UUID found in the Models section. |
| agent_strategy | string | "random" | If no model is specified: random (load balancing) or best (highest reliability score). |
| detail | string | "high" | Analysis depth: low (faster, less token usage) or high (more detailed analysis). |
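The constraints in the two parameter tables can be checked on the client before sending a request. The following is a sketch only: `build_payload` is a hypothetical helper, and the choice to omit agent_strategy when a model is given is an assumption based on the table saying the strategy applies only when no model is specified.

```python
from typing import Optional


def build_payload(
    media: str,
    categories: Optional[list[str]] = None,
    custom_prompt: Optional[str] = None,
    model: Optional[str] = None,
    agent_strategy: str = "random",
    detail: str = "high",
) -> dict:
    """Validate parameters locally and return the JSON body as a dict."""
    if not media:
        raise ValueError("media is required")
    if custom_prompt is not None and len(custom_prompt) > 2000:
        raise ValueError("custom_prompt exceeds 2000 characters")
    if agent_strategy not in ("random", "best"):
        raise ValueError("agent_strategy must be 'random' or 'best'")
    if detail not in ("low", "high"):
        raise ValueError("detail must be 'low' or 'high'")

    payload: dict = {"media": media, "detail": detail}
    if categories:
        payload["categories"] = list(categories)
    if custom_prompt:
        payload["custom_prompt"] = custom_prompt
    if model:
        # A forced model makes the strategy irrelevant (assumption: omit it).
        payload["model"] = model
    else:
        payload["agent_strategy"] = agent_strategy
    return payload


payload = build_payload(
    "https://example.com/image.jpg",
    categories=["title", "objects"],
    agent_strategy="best",
)
```

Validating locally avoids a round trip that would otherwise end in a 400 response.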
Response Structure
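The API returns a JSON object with two top-level keys: meta (request metadata) and data (one entry per executed category, plus custom_analysis when a custom prompt is supplied).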
{
  "meta": {
    "timestamp": "2024-03-20T10:00:00.000Z",
    "model_id": "8f32e9...",
    "model_name": "general-v2",
    "execution_time": 2.45,
    "successful_categories": 2,
    "failed_categories": 0,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.0045
  },
  "data": {
    "title": {
      "result": "Sunset over a mountain range"
    },
    "custom_analysis": {
      "result": "The mood is serene and peaceful."
    }
  }
}
Meta Object
| Field | Type | Description |
|---|---|---|
| model_id | string | The UUID of the model that processed the request. |
| model_name | string | The display name of the model. |
| execution_time | float | Total processing time in seconds. |
| total_tokens_in | integer | Total input tokens consumed by the agents. |
| total_tokens_out | integer | Total output tokens generated. |
| estimated_cost | float | Estimated cost of the request in USD. |
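To illustrate how the meta and data objects fit together, the sketch below flattens a response into a short summary. The sample dictionary copies the example response shown earlier; `summarize_response` is a hypothetical helper, and it assumes every entry under data is an object with a result key, as in that example.

```python
def summarize_response(body: dict) -> dict:
    """Collect the main meta fields and the per-category results."""
    meta = body.get("meta", {})
    data = body.get("data", {})
    results = {
        name: entry.get("result")
        for name, entry in data.items()
        if isinstance(entry, dict)
    }
    return {
        "model": meta.get("model_name"),
        "seconds": meta.get("execution_time"),
        "total_tokens": meta.get("total_tokens_in", 0) + meta.get("total_tokens_out", 0),
        "cost_usd": meta.get("estimated_cost"),
        "failed_categories": meta.get("failed_categories", 0),
        "results": results,
    }


sample = {
    "meta": {
        "model_name": "general-v2",
        "execution_time": 2.45,
        "successful_categories": 2,
        "failed_categories": 0,
        "total_tokens_in": 1500,
        "total_tokens_out": 300,
        "estimated_cost": 0.0045,
    },
    "data": {
        "title": {"result": "Sunset over a mountain range"},
        "custom_analysis": {"result": "The mood is serene and peaceful."},
    },
}

summary = summarize_response(sample)
print(summary["total_tokens"], summary["results"]["title"])
```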
Analysis Categories
Categories define specific tasks for the engine. You can select specific categories in your request to tailor the analysis output.
Available Models
Use the UUID below in the model parameter to force a specific agent.
Advanced Options
Agent Consistency
MetaVision ensures consistency by using a single Agent for all categories within a single request. This prevents conflicting interpretations of the media across different analysis tasks.
Audio Analysis
MetaVision supports audio files (MP3, WAV, OGG). The engine analyzes speech, tone, and background audio. Maximum file size: 10 MB.
Video Analysis
MetaVision supports video files (MP4, MOV, WebM). The engine samples frames from the video to perform the analysis.
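A client could run a quick pre-flight check against the audio and video formats listed above before uploading. This is a sketch under assumptions: the helper names are hypothetical, detection is by file extension only, and 10MB is interpreted as 10 × 1024 × 1024 bytes, which the documentation does not specify. Image formats are not listed in this document, so a None result here does not mean a file is unsupported.

```python
from pathlib import Path
from typing import Optional

# Assumption: "10MB" is treated as 10 MiB.
MAX_MEDIA_BYTES = 10 * 1024 * 1024
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def media_kind(filename: str) -> Optional[str]:
    """Return "audio" or "video" for a listed extension, else None."""
    ext = Path(filename).suffix.lower()
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None


def within_size_limit(num_bytes: int) -> bool:
    """Check a payload size against the documented maximum."""
    return 0 < num_bytes <= MAX_MEDIA_BYTES


print(media_kind("interview.MP3"), within_size_limit(2_500_000))
```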
Error Handling
Errors can occur at the request level (4xx/5xx status codes) or at the individual category level.
API Errors
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed JSON, missing 'media', or invalid 'model' UUID. |
| 400 | invalid_categories | Requested categories do not exist or are disabled. |
| 401 | invalid_api_key | Missing or incorrect Authorization header. |
| 403 | account_disabled | The user account has been disabled by an admin. |
| 413 | payload_too_large | Media exceeds the maximum allowed size (10MB). |
| 500 | internal_error | Unexpected server or upstream provider error. |
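The table above could be mirrored on the client so that failures carry a readable message. The sketch below is illustrative only: `MetaVisionError` and `error_from` are hypothetical names, the shape of the error response body is not documented here (so the error code is passed in explicitly), and treating 5xx responses as retryable is an assumption.

```python
from typing import Optional

# Error code -> (HTTP status, description), copied from the table above.
KNOWN_ERRORS = {
    "invalid_request": (400, "Malformed JSON, missing 'media', or invalid 'model' UUID."),
    "invalid_categories": (400, "Requested categories do not exist or are disabled."),
    "invalid_api_key": (401, "Missing or incorrect Authorization header."),
    "account_disabled": (403, "The user account has been disabled by an admin."),
    "payload_too_large": (413, "Media exceeds the maximum allowed size (10MB)."),
    "internal_error": (500, "Unexpected server or upstream provider error."),
}


class MetaVisionError(Exception):
    """Hypothetical client-side exception for a failed request."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        # Assumption: only server-side (5xx) failures are worth retrying.
        return self.status >= 500


def error_from(status: int, code: Optional[str] = None) -> MetaVisionError:
    """Build an exception from an HTTP status and an optional error code."""
    known = KNOWN_ERRORS.get(code or "")
    if known and known[0] == status:
        return MetaVisionError(status, code, known[1])
    return MetaVisionError(status, code or "unknown", f"Unexpected error (HTTP {status})")


err = error_from(413, "payload_too_large")
print(err, err.retryable)
```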
Code Examples
Python
import requests

API_KEY = "your_api_key"
API_URL = "https://metavision.umbrosus.com/api/v1/analyze"

payload = {
    "media": "https://example.com/image.jpg",
    "categories": ["title", "description"],
    "custom_prompt": "Describe the emotion.",
    "model": "8f32e9...",  # Optional UUID
    "agent_strategy": "best"  # Applies only when "model" is omitted
}
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

response = requests.post(API_URL, json=payload, headers=headers)
print(response.json())
JavaScript
const API_KEY = 'your_api_key';
const API_URL = 'https://metavision.umbrosus.com/api/v1/analyze';

async function analyzeMedia() {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      media: "https://example.com/audio.mp3", // Audio supported
      categories: ["title"],
      custom_prompt: "Transcribe the speech and describe tone.",
      model: "8f32e9..." // Optional UUID
    })
  });
  const result = await response.json();
  console.log(result);
}

analyzeMedia();
cURL
curl -X POST "https://metavision.umbrosus.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media": "https://example.com/image.jpg",
    "custom_prompt": "List all colors present.",
    "agent_strategy": "best"
  }'