Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Tianqi Liu1,2,3  Zhaoxi Chen1  Zihao Huang1,2,3  Shaocong Xu2  Saining Zhang2,4
Chongjie Ye5  Bohan Li6,7  Zhiguo Cao3  Wei Li1  Hao Zhao4,2,*  Ziwei Liu1,*
1S-Lab, NTU  2BAAI  3HUST  4AIR, THU  5FNii, CUHKSZ  6SJTU  7EIT (Ningbo)


TL;DR: Light-X is a video generation framework that jointly controls camera trajectory and illumination from monocular videos.

teaser_compressed.mp4

🌟 Abstract

Recent advances in illumination control extend image-based methods to video, yet they still face a trade-off between lighting fidelity and temporal consistency. Moving beyond relighting, a key step toward generative modeling of real-world scenes is the joint control of camera trajectory and illumination, since visual dynamics are inherently shaped by both geometry and lighting. To this end, we present Light-X, a video generation framework that enables controllable rendering from monocular videos with both viewpoint and illumination control. 1) We propose a disentangled design that decouples geometry and lighting signals: geometry and motion are captured via dynamic point clouds projected along user-defined camera trajectories, while illumination cues are provided by a relit frame consistently projected into the same geometry. These explicit, fine-grained cues enable effective disentanglement and guide high-quality illumination. 2) To address the lack of paired multi-view and multi-illumination videos, we introduce Light-Syn, a degradation-based pipeline with inverse mapping that synthesizes training pairs from in-the-wild monocular footage. This strategy yields a dataset covering static, dynamic, and AI-generated scenes, ensuring robust training. Extensive experiments show that Light-X outperforms baseline methods in joint camera-illumination control and surpasses prior video relighting methods under both text- and background-conditioned settings.

πŸ› οΈ Installation

Clone Light-X

git clone https://github.com/TQTQliu/Light-X.git
cd Light-X

Setup environments

conda create -n lightx python=3.10
conda activate lightx
pip install -r requirements.txt

Download Pretrained Models

Pretrained models are hosted on Hugging Face and are loaded automatically during inference.
If your environment cannot access Hugging Face, you can download them manually and then point --transformer_path in inference.py to the local model directory.
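
For example, one way to script the manual download is with the huggingface-cli tool that ships with huggingface_hub. This is a minimal sketch only: the repository id below is an assumption and should be replaced with the actual model repository listed on the project page.

pip install -U "huggingface_hub[cli]"
# Repo id is hypothetical; substitute the real Light-X model repo.
huggingface-cli download TQTQliu/Light-X --local-dir ./checkpoints/Light-X

Afterwards, run inference with --transformer_path ./checkpoints/Light-X.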

πŸš€ Inference

Run inference using the following script:

bash run.sh

All required models will be downloaded automatically.

We also provide EXAMPLE.md with commonly used commands and their corresponding visual outputs. Please refer to this file to better understand the purpose and effect of each argument.

The run.sh script executes inference.py with the following arguments:

python inference.py \
    --video_path [INPUT_VIDEO_PATH] \
    --stride [VIDEO_STRIDE] \
    --out_dir [OUTPUT_DIR] \
    --camera ['traj' | 'target'] \
    --mode ['gradual' | 'bullet' | 'direct' | 'dolly-zoom'] \
    --mask \
    --target_pose [THETA PHI RADIUS X Y] \
    --traj_txt [TRAJECTORY_TXT] \
    --relit_txt [RELIGHTING_TXT] \
    --relit_cond_type ['ic' | 'ref' | 'hdr' | 'bg'] \
    [--relit_vd] \
    [--relit_cond_img CONDITION_IMAGE] \
    [--recam_vd]

Key Arguments:

πŸŽ₯ Camera

  • --camera: Camera control mode:

    • traj: Move the camera along a trajectory
    • target: Render from a fixed target view
  • --mode: Style of camera motion when rendering along a trajectory:

    • gradual: Smooth and continuous viewpoint transition; suitable for natural, cinematic motion
    • bullet: Fast forward-shifting / orbit-like motion with stronger parallax
    • direct: Minimal smoothing; quickly moves from start to end pose
    • dolly-zoom: Hitchcock-style effect where the camera moves while adjusting radius; the subject stays the same size while the background expands/compresses
  • --traj_txt: Path to a trajectory text file (required when --camera traj is used)

  • --target_pose: Target view <theta phi r x y> (required when --camera target is used)

  • --recam_vd: Enable video re-camera mode

See here for more details on camera parameters; an example invocation is sketched below.
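
For instance, a gradual trajectory render with video re-camera enabled might be invoked as follows. This is a sketch only: the video and trajectory paths are placeholders, and the exact flag combinations should be checked against EXAMPLE.md.

python inference.py \
    --video_path assets/example.mp4 \
    --stride 1 \
    --out_dir outputs/recam \
    --camera traj \
    --mode gradual \
    --traj_txt assets/trajs/example.txt \
    --recam_vd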

πŸ’‘ Relighting

  • --relit_txt: Path to a relighting parameter text file
  • --relit_vd: Enable video relighting
  • --relit_cond_type: Choose the lighting condition source:
    • ic: IC-Light (text-based / background-based lighting)
    • ref: Reference image lighting
    • hdr: HDR environment map lighting
    • bg: Background image lighting
  • --relit_cond_img: Path to the conditioning image (required for ref / hdr modes); see the example after this list
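
As a concrete illustration, a reference-image relighting run might look like the following. The paths are placeholders, and whether additional camera flags are required depends on the script defaults; consult EXAMPLE.md for verified combinations.

python inference.py \
    --video_path assets/example.mp4 \
    --stride 1 \
    --out_dir outputs/relit \
    --relit_vd \
    --relit_cond_type ref \
    --relit_cond_img assets/ref_lighting.png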

πŸ”₯ Training

1. Prepare Training Data

Download the dataset.

2. Generate Metadata

Generate the metadata JSON file describing the training samples:

python tools/gen_json.py -r <DATA_PATH>

Then update the DATASET_META_NAME variable in train.sh to the path of the newly generated JSON file.
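
For example, assuming gen_json.py writes its output as metadata.json under the data root (the actual file name may differ), the corresponding line in train.sh would look something like this hypothetical sketch:

DATASET_META_NAME=<DATA_PATH>/metadata.json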

3. Start Training

bash train.sh

4. Convert ZeRO Checkpoint to fp32

Convert the DeepSpeed ZeRO sharded checkpoint into an fp32 checkpoint usable for inference.

Example (for step 16000):

python tools/zero_to_fp32.py train_outputs/checkpoint-16000 train_outputs/checkpoint-16000-out --safe_serialization

train_outputs/checkpoint-16000-out is the resulting fp32 checkpoint directory.

You can then pass this directory directly to the inference script:

python inference.py --transformer_path train_outputs/checkpoint-16000-out

πŸ“š Citation

If you find our work useful for your research, please consider citing our paper:

@article{liu2025light,
  title={Light-X: Generative 4D Video Rendering with Camera and Illumination Control},
  author={Liu, Tianqi and Chen, Zhaoxi and Huang, Zihao and Xu, Shaocong and Zhang, Saining and Ye, Chongjie and Li, Bohan and Cao, Zhiguo and Li, Wei and Zhao, Hao and others},
  journal={arXiv preprint arXiv:2512.05115},
  year={2025}
}

β™₯️ Acknowledgement

This work builds on many amazing open-source projects, including TrajectoryCrafter, IC-Light, and VideoX-Fun. Thanks to all the authors for their excellent contributions!

πŸ“§ Contact

If you have any questions, please feel free to contact Tianqi Liu (tq_liu at hust.edu.cn).
