Code2Video: A Code-centric Paradigm for Educational Video Generation
Yanzhe Chen,
Kevin Qinghong Lin,
Mike Zheng Shou
Show Lab @ National University of Singapore
📄 Paper | 🤗 Daily Paper | 🤗 Dataset | 🌐 Project Website | 💬 X (Twitter)
Demo video: `code2video_light.mp4`
| Learning Topic | Veo3 | Wan2.2 | Code2Video (Ours) |
| --- | --- | --- | --- |
| Hanoi Problem | *(video)* | *(video)* | *(video)* |
| Large Language Model | *(video)* | *(video)* | *(video)* |
| Pure Fourier Series | *(video)* | *(video)* | *(video)* |
- [2025.10.6] We have updated the ground truth human-made videos and metadata for the MMMC dataset.
- [2025.10.3] Thanks @_akhaliq for sharing our work on Twitter!
- [2025.10.2] We release the arXiv paper, code, and dataset.
- [2025.9.22] Code2Video has been accepted to the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025.
- 📌 Overview
- 🚀 Quick Start: Code2Video
- 📊 Evaluation: MMMC
- 🙏 Acknowledgements
- 📖 Citation
## 📌 Overview

Code2Video is an agentic, code-centric framework that generates high-quality educational videos from knowledge points.
Unlike pixel-based text-to-video models, our approach leverages executable Manim code to ensure clarity, coherence, and reproducibility.
Key Features:

- 🎬 **Code-Centric Paradigm**: executable code as the unified medium for both temporal sequencing and spatial organization of educational videos (see the Manim sketch below).
- 🤖 **Modular Tri-Agent Design**: Planner (storyboard expansion), Coder (debuggable code synthesis), and Critic (layout refinement with anchors) work together for structured generation.
- 📊 **MMMC Benchmark**: the first benchmark for code-driven video generation, covering 117 curated learning topics inspired by 3Blue1Brown and spanning diverse areas.
- 🧪 **Multi-Dimensional Evaluation**: systematic assessment of efficiency, aesthetics, and end-to-end knowledge transfer.
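To make the paradigm concrete, here is a toy Manim Community scene of our own (an illustration, not output of the Code2Video pipeline) showing how executable code doubles as the video's timeline and layout:

```python
# Toy Manim Community scene illustrating the code-centric idea:
# animation calls define the temporal order, and object placement is
# explicit code. This example is ours, not Code2Video output.
from manim import Scene, Text, Write, FadeOut

class KnowledgePoint(Scene):
    def construct(self):
        title = Text("Linear transformations and matrices")
        self.play(Write(title))    # temporal sequencing: write the title first
        self.wait(1)               # hold for one second
        self.play(FadeOut(title))  # then fade out before the next segment
```

Rendering it (e.g., `manim -ql scene.py KnowledgePoint`) yields a deterministic, reproducible clip, which is exactly the property the pipeline builds on.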
## 🚀 Quick Start: Code2Video

```bash
cd src/
pip install -r requirements.txt
```
See the official installation guide for Manim Community v0.19.0 to set up the rendering environment correctly.
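A quick sanity check that Manim is installed and importable:

```python
# Verify the Manim Community install; this should print 0.19.0.
import manim
print(manim.__version__)
```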
Fill in your API credentials in `api_config.json`:

- **LLM API**
  - Required for the Planner and Coder.
  - Best Manim code quality is achieved with Claude-4-Opus.
- **VLM API**
  - Required for the Critic.
  - For layout and aesthetics optimization, provide a Gemini API key.
  - Best quality is achieved with gemini-2.5-pro-preview-05-06.
- **Visual Assets API**
  - To enrich videos with icons, set `ICONFINDER_API_KEY` from IconFinder.
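The exact schema of `api_config.json` is defined by the template in the repo. As a minimal sketch (the key names below are illustrative, not the authoritative schema), you can verify your credentials are in place before running the agents:

```python
# Hypothetical check that api_config.json is filled in; the real key
# names may differ, so consult the template shipped with the repository.
import json

with open("api_config.json") as f:
    cfg = json.load(f)

# Illustrative key names only (not the authoritative schema):
for key in ("LLM_API_KEY", "GEMINI_API_KEY", "ICONFINDER_API_KEY"):
    if not cfg.get(key):
        print(f"Warning: {key} looks missing or empty in api_config.json")
```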
We provide two shell scripts for different generation modes:
### Script: `run_agent_single.sh`

Generates a video from a single knowledge point specified in the script.

```bash
sh run_agent_single.sh --knowledge_point "Linear transformations and matrices"
```

Important parameters inside `run_agent_single.sh`:

- `API`: specify which LLM to use.
- `FOLDER_PREFIX`: output folder prefix (e.g., `TEST-single`).
- `KNOWLEDGE_POINT`: target concept, e.g., `"Linear transformations and matrices"`.
### Script: `run_agent.sh`

Runs all (or a subset of) the learning topics defined in `long_video_topics_list.json`.

```bash
sh run_agent.sh
```

Important parameters inside `run_agent.sh`:

- `API`: specify which LLM to use.
- `FOLDER_PREFIX`: name prefix for saving output folders (e.g., `TEST-LIST`).
- `MAX_CONCEPTS`: number of concepts to include (`-1` means all).
- `PARALLEL_GROUP_NUM`: number of groups to run in parallel (see the sketch after this list).
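How `MAX_CONCEPTS` and `PARALLEL_GROUP_NUM` interact is easiest to see in a small sketch. The snippet below is illustrative only; it assumes the JSON file is a flat list of topics, and the actual batching lives in `run_agent.sh` / `agent.py`:

```python
# Illustrative partitioning of topics into parallel groups;
# not the actual implementation in run_agent.sh.
import json

MAX_CONCEPTS = -1        # -1 means "run all topics"
PARALLEL_GROUP_NUM = 4   # number of groups executed in parallel

with open("json_files/long_video_topics_list.json") as f:
    topics = json.load(f)  # assumed here to be a flat list of topics

selected = topics if MAX_CONCEPTS == -1 else topics[:MAX_CONCEPTS]
groups = [selected[i::PARALLEL_GROUP_NUM] for i in range(PARALLEL_GROUP_NUM)]
for i, group in enumerate(groups):
    print(f"group {i}: {len(group)} topics")
```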
A suggested directory structure:

```
src/
├── agent.py
├── run_agent.sh
├── run_agent_single.sh
├── api_config.json
├── ...
│
├── assets/
│   ├── icons/          # visual assets downloaded and cached via the IconFinder API
│   └── reference/      # reference images
│
├── json_files/         # JSON-based topic lists & metadata
├── prompts/            # prompt templates for LLM calls
└── CASES/              # generated cases, organized by FOLDER_PREFIX
    ├── TEST-LIST/      # example multi-topic generation results
    └── TEST-single/    # example single-topic generation results
```
## 📊 Evaluation: MMMC

We evaluate along three complementary dimensions:

- **Knowledge Transfer (TeachQuiz)**

  ```bash
  python3 eval_TQ.py
  ```

- **Aesthetic & Structural Quality (AES)**

  ```bash
  python3 eval_AES.py
  ```

- **Efficiency Metrics** (measured during generation; a minimal sketch follows this list)
  - Token usage
  - Execution time
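For the efficiency dimension, here is a minimal sketch of the bookkeeping involved; this is our illustration, and the real accounting in the pipeline may differ:

```python
# Minimal sketch of efficiency bookkeeping: accumulate token usage and
# measure wall-clock execution time per generation run. Illustrative only.
import time

class EfficiencyTracker:
    def __init__(self):
        self.total_tokens = 0
        self.start = time.time()

    def add_usage(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.total_tokens += prompt_tokens + completion_tokens

    def report(self) -> dict:
        return {
            "total_tokens": self.total_tokens,
            "execution_time_s": round(time.time() - self.start, 1),
        }

tracker = EfficiencyTracker()
tracker.add_usage(prompt_tokens=1200, completion_tokens=800)  # example numbers
print(tracker.report())
```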
📊 More data and evaluation scripts are available at: HuggingFace: MMMC Benchmark
## 🙏 Acknowledgements

- Video data is sourced from the official 3Blue1Brown lessons. These videos represent the upper bound of clarity and aesthetics in educational video design and inform our evaluation metrics.
- We thank all the Show Lab @ NUS members for support!
- This project builds upon open-source contributions from Manim Community and the broader AI research ecosystem.
- High-quality visual assets (icons) are provided by IconFinder and Icons8 and are used to enrich the educational videos.
## 📖 Citation

If you find our work useful, please cite:

```bibtex
@misc{code2video,
      title={Code2Video: A Code-centric Paradigm for Educational Video Generation},
      author={Yanzhe Chen and Kevin Qinghong Lin and Mike Zheng Shou},
      year={2025},
      eprint={2510.01174},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.01174},
}
```
If you like our project, please give us a star ⭐ on GitHub for the latest updates.