Chongjie Ye5 · Bohan Li6,7 · Zhiguo Cao3 · Wei Li1 · Hao Zhao4,2,* · Ziwei Liu1,*
TL;DR: Light-X is a video generation framework that jointly controls camera trajectory and illumination from monocular videos.
(Teaser video: teaser_compressed.mp4)
Recent advances in illumination control extend image-based methods to video, yet they still face a trade-off between lighting fidelity and temporal consistency. Moving beyond relighting, a key step toward generative modeling of real-world scenes is the joint control of camera trajectory and illumination, since visual dynamics are inherently shaped by both geometry and lighting. To this end, we present Light-X, a video generation framework that enables controllable rendering from monocular videos with both viewpoint and illumination control. 1) We propose a disentangled design that decouples geometry and lighting signals: geometry and motion are captured via dynamic point clouds projected along user-defined camera trajectories, while illumination cues are provided by a relit frame consistently projected into the same geometry. These explicit, fine-grained cues enable effective disentanglement and guide high-quality illumination. 2) To address the lack of paired multi-view and multi-illumination videos, we introduce Light-Syn, a degradation-based pipeline with inverse mapping that synthesizes training pairs from in-the-wild monocular footage. This strategy yields a dataset covering static, dynamic, and AI-generated scenes, ensuring robust training. Extensive experiments show that Light-X outperforms baseline methods in joint camera-illumination control and surpasses prior video relighting methods under both text- and background-conditioned settings.
git clone https://github.com/TQTQliu/Light-X.git
cd Light-X
conda create -n lightx python=3.10
conda activate lightx
pip install -r requirements.txt
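As an optional sanity check before running anything heavy, you can confirm that the environment resolves and that a CUDA device is visible. This assumes PyTorch is installed via requirements.txt, which is not stated explicitly above:

```bash
# Optional sanity check. Assumes PyTorch is pulled in by requirements.txt;
# prints the installed torch version and whether a CUDA GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```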
Pretrained models are hosted on Hugging Face and load automatically during inference.
If your environment cannot access Hugging Face, you may download them manually:
- Text-based / background-image lighting: tqliu/Light-X
- HDR / reference-image lighting (also supports text/bg): tqliu/Light-X-Uni
After downloading, specify the local model directory using --transformer_path in inference.py.
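For instance, a manual download via the Hugging Face CLI could look like the sketch below; the local directory names are placeholders, not paths required by the repo:

```bash
# Illustrative manual download; ./ckpts/... are arbitrary local directories.
huggingface-cli download tqliu/Light-X --local-dir ./ckpts/Light-X
huggingface-cli download tqliu/Light-X-Uni --local-dir ./ckpts/Light-X-Uni

# Then point inference at the local copy:
python inference.py --transformer_path ./ckpts/Light-X [other arguments]
```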
Run inference using the following script:
bash run.sh
All required models will be downloaded automatically.
We also provide EXAMPLE.md with commonly used commands and their corresponding visual outputs. Please refer to this file to better understand the purpose and effect of each argument.
The run.sh script executes inference.py with the following arguments:
python inference.py \
--video_path [INPUT_VIDEO_PATH] \
--stride [VIDEO_STRIDE] \
--out_dir [OUTPUT_DIR] \
--camera ['traj' | 'target'] \
--mode ['gradual' | 'bullet' | 'direct' | 'dolly-zoom'] \
--mask \
--target_pose [THETA PHI RADIUS X Y] \
--traj_txt [TRAJECTORY_TXT] \
--relit_txt [RELIGHTING_TXT] \
--relit_cond_type ['ic' | 'ref' | 'hdr' | 'bg'] \
[--relit_vd] \
[--relit_cond_img CONDITION_IMAGE] \
[--recam_vd]

🎥 Camera
- --camera: Camera control mode:
  - traj: Move the camera along a trajectory
  - target: Render from a fixed target view
- --mode: Style of camera motion when rendering along a trajectory:
  - gradual: Smooth and continuous viewpoint transition; suitable for natural, cinematic motion
  - bullet: Fast forward-shifting / orbit-like motion with stronger parallax
  - direct: Minimal smoothing; quickly moves from start to end pose
  - dolly-zoom: Hitchcock-style effect where the camera moves while adjusting the radius; the subject stays the same size while the background expands/compresses
- --traj_txt: Path to a trajectory text file (required when --camera traj is used)
- --target_pose: Target view <theta phi r x y> (required when --camera target is used)
- --recam_vd: Enable video re-camera mode
See here for more details on camera parameters.
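For illustration, a camera-only run along a trajectory might be invoked as sketched below; the input video and trajectory file are placeholder paths, not assets shipped with the repo:

```bash
# Illustrative trajectory-based re-rendering; all asset paths are placeholders.
python inference.py \
  --video_path assets/example.mp4 \
  --stride 1 \
  --out_dir outputs/recam \
  --camera traj \
  --mode gradual \
  --traj_txt assets/trajs/example_traj.txt \
  --recam_vd
```

Switching --mode (e.g., to dolly-zoom) keeps the same command shape and only changes the style of camera motion.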
💡 Relighting
- --relit_txt: Path to a relighting parameter text file
- --relit_vd: Enable video relighting
- --relit_cond_type: Choose the lighting condition source:
  - ic: IC-Light (text-based / background-based lighting)
  - ref: Reference image lighting
  - hdr: HDR environment map lighting
  - bg: Background image lighting
- --relit_cond_img: Path to the conditioning image (required for ref / hdr modes)
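As an example, HDR-conditioned relighting at a fixed target view could be run as sketched below (per the model list above, HDR / reference-image lighting uses the tqliu/Light-X-Uni weights). All asset paths and the pose values are placeholders for your own inputs:

```bash
# Illustrative HDR relighting from a fixed target view; paths and the
# target pose (theta phi radius x y) are placeholders.
python inference.py \
  --video_path assets/example.mp4 \
  --stride 1 \
  --out_dir outputs/relit \
  --camera target \
  --target_pose 0 30 0.6 0 0 \
  --relit_txt assets/relit_prompt.txt \
  --relit_cond_type hdr \
  --relit_cond_img assets/envmaps/example.hdr \
  --relit_vd
```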
Download the dataset.
Generate the metadata JSON file describing the training samples:
python tools/gen_json.py -r <DATA_PATH>
Then update the DATASET_META_NAME in train.sh to the path of the newly generated JSON file.
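Concretely, the two steps might look like the sketch below; the data path and the generated file name are placeholders (check the actual output of tools/gen_json.py in your setup):

```bash
# Illustrative paths only.
python tools/gen_json.py -r /data/light_syn

# In train.sh, point the metadata variable at the generated file, e.g.:
# DATASET_META_NAME=/data/light_syn/metadata.json
```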
bash train.sh
After training, convert the DeepSpeed ZeRO sharded checkpoint to a single fp32 file for inference.
Example (for step 16000):
python tools/zero_to_fp32.py train_outputs/checkpoint-16000 train_outputs/checkpoint-16000-out --safe_serialization
train_outputs/checkpoint-16000-out is the resulting fp32 checkpoint directory.
You can then pass this directory directly to the inference script:
python inference.py --transformer_path train_outputs/checkpoint-16000-out
If you find our work useful for your research, please consider citing our paper:
@article{liu2025light,
title={Light-X: Generative 4D Video Rendering with Camera and Illumination Control},
author={Liu, Tianqi and Chen, Zhaoxi and Huang, Zihao and Xu, Shaocong and Zhang, Saining and Ye, Chongjie and Li, Bohan and Cao, Zhiguo and Li, Wei and Zhao, Hao and others},
journal={arXiv preprint arXiv:2512.05115},
year={2025}
}
This work is built on many amazing open-source projects shared by TrajectoryCrafter, IC-Light, and VideoX-Fun. Thanks to all the authors for their excellent contributions!
If you have any questions, please feel free to contact Tianqi Liu (tq_liu at hust.edu.cn).