MetaVision API Documentation

Introduction

MetaVision is an advanced, high-performance Multi-Modal AI analysis engine. It orchestrates a swarm of Vision-Language Model (VLM) agents to analyze images and videos in parallel.

Unlike traditional computer vision APIs that return static labels, MetaVision allows you to define flexible Analysis Categories (e.g., "Generate Title", "Describe Lighting", "Safety Check"). The engine executes these tasks simultaneously and aggregates the structured results into a single JSON response.

Authentication

All API requests require authentication using a Bearer token. Include your API key in the Authorization header of every request:

Authorization: Bearer YOUR_API_KEY
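
As a minimal sketch of the header format above, the helper below builds the request headers in Python. The name `build_headers` is hypothetical and not part of the API; the Content-Type value assumes a JSON request body, as the analyze endpoint expects.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token.

    The helper name is illustrative only; it is not part of the MetaVision API.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```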

Core Analysis

The primary endpoint handles both image and video analysis.

POST /api/v1/analyze

Request Structure

Send a JSON body with the media content and configuration.

Required Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| media | string | The content to analyze. Must be a valid HTTP URL or a Base64-encoded string (including the data URI scheme). |
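
For local files, the media value can be built as a Base64 data URI. The sketch below shows one way to do that in Python. The helper name `to_data_uri` is hypothetical, and the size check is an assumption: it reads the 10MB limit listed under API Errors as 10 * 1024 * 1024 bytes of raw media, which the documentation does not specify.

```python
import base64

# Assumption: the documented 10MB limit is interpreted as raw bytes, 10 * 1024 * 1024.
MAX_MEDIA_BYTES = 10 * 1024 * 1024


def to_data_uri(raw: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw media bytes as a Base64 data URI for the `media` field."""
    if len(raw) > MAX_MEDIA_BYTES:
        raise ValueError(
            f"media is {len(raw)} bytes; assumed limit is {MAX_MEDIA_BYTES} bytes"
        )
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

A typical call would read the file first, for example `to_data_uri(open("photo.jpg", "rb").read())`, and pass the result as the media value.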

Optional Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| categories | array[string] | (All) | List of Category IDs to execute (e.g., ["title", "objects"]). If omitted, the system defaults are used. |
| agent_strategy | string | "random" | Strategy for selecting VLM agents: random (load balancing), fastest (lowest latency history), or best (highest reliability score). |
| detail | string | "high" | Analysis depth: low (faster, less token usage) or high (more detailed vision analysis). |
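
The parameters above can be assembled and checked client-side before sending. The following is a sketch only: `build_payload` is a hypothetical helper, and the local validation simply mirrors the documented values rather than reproducing any server-side logic.

```python
from typing import Optional, Sequence

# Allowed values as listed in the optional parameters.
AGENT_STRATEGIES = {"random", "fastest", "best"}
DETAIL_LEVELS = {"low", "high"}


def build_payload(
    media: str,
    categories: Optional[Sequence[str]] = None,
    agent_strategy: str = "random",
    detail: str = "high",
) -> dict:
    """Build the JSON body for POST /api/v1/analyze (helper name is hypothetical)."""
    if not media:
        raise ValueError("media is required")
    if agent_strategy not in AGENT_STRATEGIES:
        raise ValueError(f"agent_strategy must be one of {sorted(AGENT_STRATEGIES)}")
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"detail must be one of {sorted(DETAIL_LEVELS)}")

    payload = {"media": media, "agent_strategy": agent_strategy, "detail": detail}
    # Leave `categories` out entirely so the documented default behavior applies.
    if categories is not None:
        payload["categories"] = list(categories)
    return payload
```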

Response Structure

The API returns a JSON object with two top-level keys: meta (information about the request) and data (the analysis results).

{
  "meta": {
    "timestamp": "2024-03-20T10:00:00.000Z",
    "execution_time": 2.45,
    "successful_categories": 2,
    "failed_categories": 0
  },
  "data": {
    "title": {
      "result": "Sunset over a mountain range"
    },
    "objects": [ "sun", "mountain", "tree", "river" ]
  }
}

Meta Object

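| Field | Type | Description |
| --- | --- | --- |
| timestamp | string | ISO 8601 timestamp in UTC, as shown in the example response. |
| execution_time | float | Total processing time in seconds. |
| successful_categories | integer | Count of analysis tasks that completed successfully. |
| failed_categories | integer | Count of analysis tasks that failed or timed out. |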
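
As an illustration of how these fields might be used, the sketch below turns a meta object into a one-line summary. The function name `summarize_meta` is hypothetical; it relies only on the three fields described above.

```python
def summarize_meta(meta: dict) -> str:
    """Summarize a response's meta object (helper name is hypothetical)."""
    succeeded = meta["successful_categories"]
    total = succeeded + meta["failed_categories"]
    return f"{succeeded}/{total} categories succeeded in {meta['execution_time']:.2f}s"
```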

Data Object

Each key in the data object corresponds to the output_key defined for a category. Its value is the structured JSON returned by the AI agent for that category.

Analysis Categories

Categories define specific tasks for the vision engine. List the category IDs you want in the categories parameter of your request to tailor the analysis output.

Advanced Options

Agent Strategy

Optimize your request based on your priority:

  • Random: Distributes load evenly across available agents. Good for general batch processing.
  • Fastest: Prioritizes agents with historically low latency. Ideal for real-time user-facing apps.
  • Best: Prioritizes agents with the highest reliability score (1-100). Best for complex reasoning tasks.

Video Analysis

MetaVision supports video files (MP4, MOV, WebM). The engine samples frames from the video to perform the analysis.

Performance Note: Video analysis is significantly more computationally expensive than image analysis. Execution times may range from 15 to 45 seconds, depending on video length and complexity.
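
Because video requests can run much longer than image requests, a client may want a longer HTTP timeout for them. The sketch below picks a timeout from the media URL's file extension. The helper name `pick_timeout` and both default timeout values are assumptions chosen for illustration; the documentation does not recommend specific client timeouts.

```python
import os
from urllib.parse import urlparse

# Video formats listed as supported above.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def pick_timeout(
    media_url: str,
    image_timeout: float = 30.0,  # assumed value, not documented
    video_timeout: float = 90.0,  # assumed value, leaves headroom over 45 s
) -> float:
    """Return a client-side timeout in seconds based on the URL's extension."""
    path = urlparse(media_url).path
    extension = os.path.splitext(path)[1].lower()
    return video_timeout if extension in VIDEO_EXTENSIONS else image_timeout
```

The returned value can be passed as the timeout argument of an HTTP client call, such as `requests.post(..., timeout=pick_timeout(url))`.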

Error Handling

Errors can occur at the request level (4xx/5xx status codes) or at the individual category level.

Category-Level Errors

If a specific category fails (e.g., the AI agent times out), the request as a whole does not fail. Instead, the error is reported under that category's key in the data object:

{
  "data": {
    "title": "A nice image",
    "objects": {
      "error": "Analysis timed out after 15000ms"
    }
  }
}
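
A client can separate successful results from category-level errors by inspecting each entry in the data object. The sketch below assumes, based on the example above, that a failed category is an object containing an "error" key; `split_results` is a hypothetical helper name.

```python
def split_results(data: dict) -> tuple[dict, dict]:
    """Split the data object into (succeeded, failed) dictionaries.

    Assumption: a failed category is a JSON object with an "error" key,
    and successful categories never use that key at the top level.
    """
    succeeded, failed = {}, {}
    for key, value in data.items():
        if isinstance(value, dict) and "error" in value:
            failed[key] = value["error"]
        else:
            succeeded[key] = value
    return succeeded, failed
```

The failed dictionary maps each category key to its error message, which makes it straightforward to log failures or re-request only those categories.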

API Errors

| Status | Code | Description |
| --- | --- | --- |
| 400 | invalid_request | Malformed JSON or missing 'media' field. |
| 400 | invalid_categories | Requested categories do not exist or are disabled. |
| 401 | invalid_api_key | Missing or incorrect Authorization header. |
| 403 | account_disabled | The user account has been disabled by an admin. |
| 413 | payload_too_large | Media exceeds the maximum allowed size (10MB). |
| 500 | internal_error | Unexpected server or upstream provider error. |
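
The table above can be mirrored in client code as a lookup keyed by status and error code. This is a sketch under two assumptions: the documentation does not show where the error code appears in the response body, so the code is passed in as a plain argument, and treating only 500 as retryable is a common convention rather than documented behavior. The names `describe_error` and `is_retryable` are hypothetical.

```python
# (status, code) pairs and descriptions copied from the API Errors table.
API_ERRORS = {
    (400, "invalid_request"): "Malformed JSON or missing 'media' field.",
    (400, "invalid_categories"): "Requested categories do not exist or are disabled.",
    (401, "invalid_api_key"): "Missing or incorrect Authorization header.",
    (403, "account_disabled"): "The user account has been disabled by an admin.",
    (413, "payload_too_large"): "Media exceeds the maximum allowed size (10MB).",
    (500, "internal_error"): "Unexpected server or upstream provider error.",
}

# Assumption: only server-side failures are worth retrying.
RETRYABLE_STATUSES = {500}


def describe_error(status: int, code: str) -> str:
    """Return the documented description, or a generic message if unknown."""
    return API_ERRORS.get((status, code), f"Unrecognized error {status} ({code})")


def is_retryable(status: int) -> bool:
    """Report whether a retry is plausible for this HTTP status."""
    return status in RETRYABLE_STATUSES
```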

Code Examples

Python

import requests

API_KEY = "your_api_key"
API_URL = "https://metavision.umbrosus.com/api/v1/analyze"

payload = {
    "media": "https://example.com/image.jpg",
    "categories": ["title", "description", "objects"],
    "agent_strategy": "best"
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

response = requests.post(API_URL, json=payload, headers=headers)
print(response.json())

JavaScript

const API_KEY = 'your_api_key';
const API_URL = 'https://metavision.umbrosus.com/api/v1/analyze';

async function analyzeMedia() {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      media: "https://example.com/image.jpg",
      categories: ["title", "description", "objects"],
      agent_strategy: "best"
    })
  });

  const result = await response.json();
  console.log(result);
}

analyzeMedia();

cURL

curl -X POST "https://metavision.umbrosus.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media": "https://example.com/image.jpg",
    "categories": ["title", "description", "objects"],
    "agent_strategy": "best"
  }'

PHP

<?php

$url = 'https://metavision.umbrosus.com/api/v1/analyze';
$apiKey = 'your_api_key';

$data = [
    'media' => 'https://example.com/image.jpg',
    'categories' => ['title', 'description', 'objects'],
    'agent_strategy' => 'best'
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Authorization: Bearer ' . $apiKey,
    'Content-Type: application/json'
]);

$response = curl_exec($ch);
curl_close($ch);

echo $response;
?>