Aurora: Unified Video Editing with a Tool-Using Agent

This repository will host the official implementation of Aurora, an agentic video editing framework that pairs a tool-augmented vision-language model (VLM) agent with a unified video diffusion transformer. The VLM agent rewrites a raw user request into a typed edit plan (instruction, task label, image-search query, mask phrase) and dispatches it to the video DiT, resolving textual and visual underspecification before generation.
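
Because the code is not yet released, the following is only an illustrative sketch of how the four-field edit plan described above could be represented as a typed structure. The names `EditPlan` and `EditTask`, and every field name, are assumptions made for this example rather than Aurora's actual schema. The task labels simply mirror the edit types named in this README.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EditTask(Enum):
    # Hypothetical identifiers for the edit types named in this README.
    REPLACEMENT = "replacement"
    REMOVAL = "removal"
    STYLE_TRANSFER = "style_transfer"
    REFERENCE_INSERTION = "reference_insertion"


@dataclass(frozen=True)
class EditPlan:
    """Hypothetical four-field edit plan; not the official schema."""

    instruction: str                           # rewritten edit instruction
    task: EditTask                             # task label
    image_search_query: Optional[str] = None   # None if no reference image is needed
    mask_phrase: Optional[str] = None          # None if no mask is needed

    def __post_init__(self):
        # Reject plans that carry no usable instruction.
        if not self.instruction.strip():
            raise ValueError("instruction must be non-empty")


# Example plan for a replacement edit (values invented for illustration).
plan = EditPlan(
    instruction="Replace the red car with a vintage blue convertible.",
    task=EditTask.REPLACEMENT,
    image_search_query="vintage blue convertible",
    mask_phrase="the red car",
)
```

Making the two tool-related fields optional is one possible design choice: a style-transfer request, for instance, may need neither a reference image nor a mask.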

Features

🎬 Unified video editing - replacement, removal, style transfer, and reference-driven insertion under one set of weights
🤖 Tool-using VLM agent - rewrites a raw user request into a four-field edit plan
🔍 Resolves underspecification - fills in missing reference images via web image search and missing masks via grounded segmentation
📊 AgentEdit-Bench - evaluates agent-enhanced video editing under textual and visual underspecification
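
The underspecification step in the list above can be sketched roughly as follows: a missing reference image is filled in by an image search, and a missing mask by a segmentation call. This is a minimal sketch under stated assumptions, not Aurora's implementation. The function `resolve_inputs` and the `search_image` and `segment` callables are hypothetical placeholders, and images and masks are represented as plain strings to keep the example self-contained. A real grounded-segmentation tool would also need the video itself.

```python
from typing import Callable, Optional, Tuple


def resolve_inputs(
    reference_image: Optional[str],
    mask: Optional[str],
    image_search_query: Optional[str],
    mask_phrase: Optional[str],
    search_image: Callable[[str], str],
    segment: Callable[[str], str],
) -> Tuple[Optional[str], Optional[str]]:
    """Fill in whichever visual inputs the user did not supply.

    All names are hypothetical. User-supplied inputs are never overwritten;
    a tool is called only when its input is missing and a query is available.
    """
    if reference_image is None and image_search_query is not None:
        reference_image = search_image(image_search_query)
    if mask is None and mask_phrase is not None:
        mask = segment(mask_phrase)
    return reference_image, mask


# Stand-in tools for demonstration only; they return tagged strings.
def fake_search(query: str) -> str:
    return "search://" + query


def fake_segment(phrase: str) -> str:
    return "mask://" + phrase


resolved_image, resolved_mask = resolve_inputs(
    reference_image=None,
    mask=None,
    image_search_query="vintage blue convertible",
    mask_phrase="the red car",
    search_image=fake_search,
    segment=fake_segment,
)
```

Passing the tools in as callables keeps the sketch independent of any particular search or segmentation backend.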

TODO

[ ] The code is being prepared for release. ETA: late May 2026.
