
Introduction

naralens is the video understanding API. Turn a landscape video into a 9:16 vertical clip that follows the active speaker, with one API call.

What is naralens?

naralens runs a serverless GPU pipeline that ingests a video, detects scene cuts, tracks faces, identifies the active speaker, and auto-crops the footage to vertical. You can run the whole thing with a single reframe call, or drive each step yourself with the pipeline endpoints.

Key Features

  • Speaker-aware reframe - Auto-crops landscape video to a 9:16 frame that follows whoever is talking
  • Active speaker detection - Per-track speaking scores via TalkNet-ASD over tracked faces
  • Scene-aware - Scene-cut detection lets the crop re-center cleanly at each shot change
  • Async REST - Submit a URL, poll one endpoint, download the result. No SDK required
  • Pay as you go - 1 credit = 1 second of video; 300 free credits (5 minutes of video) to start

How It Works

  1. Submit - POST your video URL to /api/v2/reframe
  2. Process - naralens runs the full pipeline for you
  3. Poll - GET /api/v2/reframe/{job_id} until the job has completed
  4. Download - Fetch the 9:16 clip from the returned download_url
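
The poll and download steps above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the official client: it assumes the poll response is flat JSON with "status" and "download_url" string fields, that "completed" and "failed" are the terminal status values, and that a 5-second poll interval is acceptable. The names wait_for_job, json_field, and NARA_API_BASE are hypothetical.

```bash
# Sketch: poll a reframe job until it finishes, then print its download_url.
# Assumptions (not confirmed by these docs): flat JSON responses, terminal
# statuses "completed" and "failed", and a default 5-second poll interval.

API_BASE="${NARA_API_BASE:-https://naralens.com/api/v2}"

# Extract a top-level string field from flat JSON read on stdin.
# Naive sed-based parsing; a real JSON tool such as jq is more robust.
json_field() {
  sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# wait_for_job JOB_ID [INTERVAL_SECONDS] [MAX_TRIES]
# Prints the download_url on success; returns non-zero on failure or timeout.
wait_for_job() {
  local job_id="$1" interval="${2:-5}" max_tries="${3:-120}"
  local body status i=0
  while [ "$i" -lt "$max_tries" ]; do
    body="$(curl -sS -H "Authorization: Bearer $NARA_API_KEY" \
      "$API_BASE/reframe/$job_id")" || return 1
    status="$(printf '%s\n' "$body" | json_field status)"
    case "$status" in
      completed)
        printf '%s\n' "$body" | json_field download_url
        return 0 ;;
      failed)
        echo "job $job_id failed" >&2
        return 1 ;;
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  echo "timed out waiting for job $job_id" >&2
  return 1
}

# Usage (makes real API calls):
#   url="$(wait_for_job "$JOB_ID")" && curl -sS -L -o vertical.mp4 "$url"
```

Capping the number of attempts keeps a stuck job from looping forever; tune the interval and cap to the length of your videos.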

Quick Example

Bash
# Reframe a landscape video to 9:16, following the active speaker
curl -X POST https://naralens.com/api/v2/reframe \
  -H "Authorization: Bearer $NARA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/video.mp4"}'

# Response: { "job_id": "9f1c...", "status": "processing" }
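
To poll the job you need the job_id from that response. A minimal sketch of capturing it in a script follows; it assumes the response is flat JSON shaped like the sample above, and submit_reframe is a hypothetical helper name. The sed expression is a naive stand-in for a proper JSON parser such as jq.

```bash
# Sketch: submit a video and capture the job_id from the JSON response.
# Assumption: the response is flat JSON like the sample shown above.

# submit_reframe VIDEO_URL
# Prints the job_id; returns non-zero if the request fails or no job_id is found.
# The URL is interpolated into JSON as-is, so it must not contain double quotes.
submit_reframe() {
  local video_url="$1" body job_id
  body="$(curl -sS -X POST "https://naralens.com/api/v2/reframe" \
    -H "Authorization: Bearer $NARA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"$video_url\"}")" || return 1
  job_id="$(printf '%s\n' "$body" |
    sed -n 's/.*"job_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
  if [ -z "$job_id" ]; then
    echo "no job_id in response: $body" >&2
    return 1
  fi
  printf '%s\n' "$job_id"
}

# Usage (makes a real API call):
#   job_id="$(submit_reframe "https://example.com/video.mp4")"
```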

Next Steps