
Code2Video: Video Generation via Code

Code2Video: A Code-centric Paradigm for Educational Video Generation

Yanzhe Chen, Kevin Qinghong Lin, Mike Zheng Shou
Show Lab @ National University of Singapore

πŸ“„ Paper | πŸ€— Daily Paper | πŸ€— Dataset | 🌐 Project Website | πŸ’¬ X (Twitter)

(Demo video: code2video_light.mp4)

(Side-by-side demo videos comparing Veo3, Wan2.2, and Code2Video (Ours) on three learning topics: Hanoi Problem, Large Language Model, and Pure Fourier Series.)


πŸ”₯ Update

  • [2025.10.6] We updated the ground-truth human-made videos and metadata for the MMMC dataset.
  • [2025.10.3] Thanks to @_akhaliq for sharing our work on Twitter!
  • [2025.10.2] We released the arXiv paper, code, and dataset.
  • [2025.9.22] Code2Video has been accepted to the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025.


🌟 Overview

(Overview figure)

Code2Video is an agentic, code-centric framework that generates high-quality educational videos from knowledge points.
Unlike pixel-based text-to-video models, our approach leverages executable Manim code to ensure clarity, coherence, and reproducibility.

Key Features:

  • 🎬 Code-Centric Paradigm — executable code as the unified medium for both temporal sequencing and spatial organization of educational videos.
  • πŸ€– Modular Tri-Agent Design — Planner (storyboard expansion), Coder (debuggable code synthesis), and Critic (layout refinement with anchors) work together for structured generation.
  • πŸ“š MMMC Benchmark — the first benchmark for code-driven video generation, covering 117 curated learning topics inspired by 3Blue1Brown and spanning diverse areas.
  • πŸ§ͺ Multi-Dimensional Evaluation — systematic assessment of efficiency, aesthetics, and end-to-end knowledge transfer.

πŸš€ Try Code2Video

(Approach overview figure)

1. Requirements

cd src/
pip install -r requirements.txt

Follow the official installation guide for Manim Community v0.19.0 to set up the environment correctly.
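
If requirements.txt does not already install Manim, the community edition can be installed and verified directly. A minimal sketch, assuming a standard pip setup and the pinned version from the guide above:

pip install "manim==0.19.0"    # Manim Community edition, pinned to v0.19.0
manim --version                # should report Manim Community v0.19.0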

2. Configure LLM API Keys

Fill in your API credentials in api_config.json (an illustrative sketch of the file's layout follows the list below).

  • LLM API:

    • Required for Planner & Coder.
    • Best Manim code quality achieved with Claude-4-Opus.
  • VLM API:

    • Required for the Planner & Critic.
    • Provide a Gemini API key for layout and aesthetics optimization.
    • Best quality achieved with gemini-2.5-pro-preview-05-06.
  • Visual Assets API:

    • To enrich videos with icons, set ICONFINDER_API_KEY from IconFinder.
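
The comment block below sketches a hypothetical layout of api_config.json; the key names llm_api_key and vlm_api_key are assumptions (only ICONFINDER_API_KEY is named above), so defer to the template shipped with the repository. The command checks that the file parses as valid JSON.

# Hypothetical example of api_config.json (key names are illustrative):
#   {
#     "llm_api_key": "sk-...",          # LLM key for Planner & Coder
#     "vlm_api_key": "...",             # Gemini key for the Critic
#     "ICONFINDER_API_KEY": "..."       # icon retrieval via IconFinder
#   }
# Sanity check: confirm the file is valid JSON before running the agents.
python3 -c "import json; json.load(open('api_config.json')); print('api_config.json OK')"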

3. Run Agents

We provide two shell scripts for different generation modes:

(a) Any Query

Script: run_agent_single.sh

Generates a video from a single knowledge point specified in the script.

sh run_agent_single.sh --knowledge_point "Linear transformations and matrices"

Important parameters inside run_agent_single.sh (illustrative values are sketched after this list):

  • API: specify which LLM to use.
  • FOLDER_PREFIX: output folder prefix (e.g., TEST-single).
  • KNOWLEDGE_POINT: target concept, e.g. "Linear transformations and matrices".
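
A minimal sketch, assuming Bash-style variable assignments; the actual defaults live in run_agent_single.sh:

# Illustrative values only; check run_agent_single.sh for the real defaults.
API="claude-4-opus"                                      # which LLM to use
FOLDER_PREFIX="TEST-single"                              # output folder prefix under CASES/
KNOWLEDGE_POINT="Linear transformations and matrices"    # target concept to teach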

(b) Full Benchmark Mode

Script: run_agent.sh

Runs all (or a subset of) learning topics defined in long_video_topics_list.json.

sh run_agent.sh

Important parameters inside run_agent.sh (illustrative values are sketched after this list):

  • API: specify which LLM to use.
  • FOLDER_PREFIX: name prefix for saving output folders (e.g., TEST-LIST).
  • MAX_CONCEPTS: number of concepts to include (-1 means all).
  • PARALLEL_GROUP_NUM: number of groups to run in parallel.
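
Again a minimal sketch with assumed values; the actual defaults live in run_agent.sh:

# Illustrative values only; check run_agent.sh for the real defaults.
API="claude-4-opus"          # which LLM to use
FOLDER_PREFIX="TEST-LIST"    # prefix for output folders under CASES/
MAX_CONCEPTS=-1              # -1 means run every topic in long_video_topics_list.json
PARALLEL_GROUP_NUM=4         # number of topic groups generated in parallel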

4. Project Organization

A suggested directory structure:

src/
β”œβ”€β”€ agent.py
β”œβ”€β”€ run_agent.sh
β”œβ”€β”€ run_agent_single.sh
β”œβ”€β”€ api_config.json
β”œβ”€β”€ ...
β”‚
β”œβ”€β”€ assets/
β”‚   β”œβ”€β”€ icons/          # visual assets downloaded and cached via the IconFinder API
β”‚   └── reference/      # reference images
β”‚
β”œβ”€β”€ json_files/         # JSON-based topic lists & metadata
β”œβ”€β”€ prompts/            # prompt templates for LLM calls
└── CASES/              # generated cases, organized by FOLDER_PREFIX
    β”œβ”€β”€ TEST-LIST/      # example multi-topic generation results
    └── TEST-single/    # example single-topic generation results

πŸ“Š Evaluation -- MMMC

We evaluate along three complementary dimensions:

  1. Knowledge Transfer (TeachQuiz)

    python3 eval_TQ.py
  2. Aesthetic & Structural Quality (AES)

    python3 eval_AES.py
  3. Efficiency Metrics (during generation)

    • Token usage
    • Execution time (a rough timing example follows this list)
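
Token usage is typically reported in the LLM API responses, while wall-clock time can be measured with standard shell tooling. The command below is only an illustration, not a script shipped with this repository:

# Rough end-to-end timing for a single topic (illustrative only).
time sh run_agent_single.sh --knowledge_point "Pure Fourier Series"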

πŸ‘‰ More data and evaluation scripts are available on HuggingFace: MMMC Benchmark


πŸ™ Acknowledgements

  • Video data is sourced from the 3Blue1Brown official lessons. These videos represent the upper bound of clarity and aesthetics in educational video design and inform our evaluation metrics.
  • We thank all the Show Lab @ NUS members for support!
  • This project builds upon open-source contributions from Manim Community and the broader AI research ecosystem.
  • High-quality visual assets (icons) used to enrich the educational videos are provided by IconFinder and Icons8.

πŸ“Œ Citation

If you find our work useful, please cite:

@misc{code2video,
      title={Code2Video: A Code-centric Paradigm for Educational Video Generation}, 
      author={Yanzhe Chen and Kevin Qinghong Lin and Mike Zheng Shou},
      year={2025},
      eprint={2510.01174},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.01174}, 
}

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.
