- [2025/12/04] OmniSVG @ NeurIPS 2025: Come see our poster!
  - Date: Thursday, December 4, 2025
  - Time: 11:00 AM to 2:00 PM PST
  - Location: Exhibit Hall C, D, E, Poster #5512
- [2025/12/02] We have released the OmniSVG1.1_8B weights and updated OmniSVG1.1_4B model weights! Check out OmniSVG1.1_8B and OmniSVG1.1_4B.
- [2025/12/02] We have released the MMSVGBench benchmark dataset and evaluation code! Check out MMSVGBench and Evaluation.
- [2025/09/18] OmniSVG is accepted to NeurIPS 2025! See you in San Diego!
- [2025/07/22] We have released the Huggingface Demo. Demo.
- [2025/07/22] We have released the inference code and model weight of MMSVG-Icon and MMSVG-Illustration dataset. Weight.
- [2025/04/09] Release MMSVG-Icon and MMSVG-Illustration Dataset.
- [2025/04/09] Upload paper and init project. Read
If you are developing / using OmniSVG in your projects, or you want to contribute to OmniSVG, please let us know.
- If you find data issues when using the MMSVG dataset, please drop an issue in this form.
- OmniSVG ComfyUI Plugin by @smthemex ComfyUI_OmniSVG.
- Project Page & Technical Report
- MMSVG-Icon and MMSVG-Illustration Dataset Release
- Inference Code & Model Weight of MMSVG-Icon and MMSVG-Illustration Dataset
- Online Demo (Gradio deployed on Huggingface)
- Model Weight of OmniSVG1.1_8B Release
- Model Weight of OmniSVG1.1_4B Release
- MMSVGBench Benchmark & Evaluation Code Release
OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from simple icons to intricate anime characters. We also introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks.
| Model | Download link | Size | Update date |
|---|---|---|---|
| OmniSVG1.1_8B | Huggingface | 17.2 GB | 2025-12-02 |
| OmniSVG1.1_4B | Huggingface | 7.69 GB | 2025-12-02 |
| OmniSVG-3B | Huggingface | 8.49 GB | 2025-07-22 |
Follow the instructions below to set up an environment for inference. First, clone the repository:
git clone https://github.com/OmniSVG/OmniSVG.git
cd OmniSVG
Create and activate a new conda environment with Python 3.10:
conda create -n omnisvg python=3.10
conda activate omnisvg
Before installing the Python packages, install the Cairo library, which is required by CairoSVG in our dependencies:
macOS:
brew install cairo
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install libcairo2 libcairo2-dev
Note: Installing the Cairo system library beforehand helps prevent potential build errors when installing CairoSVG via pip.
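
Before running pip, it can help to confirm that the Cairo shared library is discoverable. The following is a minimal sketch that uses only the Python standard library. The helper name `cairo_available` is hypothetical and not part of OmniSVG. `ctypes.util.find_library` only reports whether a shared library named `cairo` can be located through the platform's default search mechanism, so it may miss installations under non-default prefixes and says nothing about the development headers.

```python
from ctypes.util import find_library


def cairo_available() -> bool:
    """Return True if a shared library named 'cairo' can be located.

    Hypothetical helper for illustration; not part of the OmniSVG repository.
    """
    # find_library returns a library name or path string, or None if not found.
    return find_library("cairo") is not None


if __name__ == "__main__":
    if cairo_available():
        print("Cairo found:", find_library("cairo"))
    else:
        print("Cairo not found; install it with brew or apt as described above.")
```

A `False` result is only a hint that the installation step above may have been skipped; a failing `pip install` of CairoSVG remains the authoritative signal.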
We have tested our environment with CUDA 12.1. You can install CUDA 12.1 by following the CUDA Toolkit installation guide.
Install PyTorch with CUDA 12.1 support:
pip install torch==2.3.0+cu121 torchvision==0.18.0+cu121 --index-url https://download.pytorch.org/whl/cu121
Install remaining dependencies:
pip install -r requirements.txt
| Model | GPU Memory Usage | Time per 256/512/1024/2048/4096 tokens |
|---|---|---|
| OmniSVG1.1_8B | 26G | 5.38/9.02/20.11/40.34/98.11 seconds |
| OmniSVG1.1_4B | 17G | 4.08/8.68/18.07/37.51/82.70 seconds |
| OmniSVG-3B | 17G | 4.08/8.68/18.07/37.51/82.70 seconds |
Note: The inference times shown here are measured in OmniSVG SVG tokens, while the inference times reported in our paper are measured in XML code tokens for fair comparison with baseline methods.
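
To get a rough feel for generation time at token counts between the measured points, the table above can be interpolated. The sketch below copies the timings from the table and applies piecewise-linear interpolation. The linear assumption and the function name `estimate_seconds` are illustrative choices made here, not something the OmniSVG authors specify, and actual times will depend on hardware.

```python
from bisect import bisect_left

# Token counts and per-model timings (seconds), copied from the table above.
TOKEN_COUNTS = [256, 512, 1024, 2048, 4096]
TIMINGS = {
    "OmniSVG1.1_8B": [5.38, 9.02, 20.11, 40.34, 98.11],
    "OmniSVG1.1_4B": [4.08, 8.68, 18.07, 37.51, 82.70],
    "OmniSVG-3B": [4.08, 8.68, 18.07, 37.51, 82.70],
}


def estimate_seconds(model: str, tokens: int) -> float:
    """Linearly interpolate the measured time for `tokens` SVG tokens.

    Assumption: time varies linearly between adjacent measured points.
    """
    times = TIMINGS[model]
    if not TOKEN_COUNTS[0] <= tokens <= TOKEN_COUNTS[-1]:
        raise ValueError("tokens outside the measured range 256-4096")
    # Index of the first measured token count that is >= tokens.
    i = bisect_left(TOKEN_COUNTS, tokens)
    if TOKEN_COUNTS[i] == tokens:
        return times[i]
    x0, x1 = TOKEN_COUNTS[i - 1], TOKEN_COUNTS[i]
    y0, y1 = times[i - 1], times[i]
    return y0 + (y1 - y0) * (tokens - x0) / (x1 - x0)


if __name__ == "__main__":
    for name in TIMINGS:
        print(name, round(estimate_seconds(name, 768), 3), "seconds for 768 tokens")
```

The function deliberately refuses to extrapolate outside 256 to 4096 tokens, since the table gives no evidence about behavior beyond that range.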
Download Model Weights
First, install the Hugging Face CLI tool:
pip install huggingface-hub
Download the model from Hugging Face:
# Download OmniSVG1.1-8B
huggingface-cli download OmniSVG/OmniSVG1.1_8B --local-dir /PATH/TO/OmniSVG1.1_8B
# Download OmniSVG1.1-4B
huggingface-cli download OmniSVG/OmniSVG1.1_4B --local-dir /PATH/TO/OmniSVG1.1_4B
# Download OmniSVG-3B (legacy)
huggingface-cli download OmniSVG/OmniSVG --local-dir /PATH/TO/OmniSVG-3B
Basic usage - Generate SVG from txt file:
python inference.py --task text-to-svg --input prompts.txt --output ./output_text --save-all-candidates
Use 4B model:
python inference.py --task text-to-svg --input prompts.txt --output ./output_text --model-size 4B --save-all-candidates
Generate more candidates and save PNG:
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--num-candidates 8 --save-png --save-all-candidates
Custom generation parameters:
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--temperature 0.5 --top-p 0.9 --top-k 50 --repetition-penalty 1.05
Use local model:
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--model-path /path/to/qwen --weight-path /path/to/omnisvg
Image-to-SVG:
python inference.py --task image-to-svg --input ./examples --output ./output_image --save-all-candidates
We provide an interactive generation interface using Gradio:
- Local Deployment
  python app.py
- Online Demo
  Try our live demo on Hugging Face Spaces
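
The `inference.py` invocations shown earlier in this section can also be assembled programmatically, for example when sweeping over model sizes or candidate counts. The sketch below only builds the argument list and uses only flags that appear in the commands above. The function `build_inference_command` is a hypothetical helper, not part of the repository, and restricting `task` to the two values shown above is an assumption.

```python
import shlex
from typing import List, Optional


def build_inference_command(
    task: str,
    input_path: str,
    output_dir: str,
    model_size: Optional[str] = None,
    num_candidates: Optional[int] = None,
    save_png: bool = False,
    save_all_candidates: bool = False,
) -> List[str]:
    """Assemble an argv list for inference.py from flags shown in this README."""
    # Assumption: only the two tasks demonstrated above are accepted.
    if task not in ("text-to-svg", "image-to-svg"):
        raise ValueError(f"unsupported task: {task}")
    cmd = ["python", "inference.py", "--task", task,
           "--input", input_path, "--output", output_dir]
    if model_size is not None:
        cmd += ["--model-size", model_size]
    if num_candidates is not None:
        cmd += ["--num-candidates", str(num_candidates)]
    if save_png:
        cmd.append("--save-png")
    if save_all_candidates:
        cmd.append("--save-all-candidates")
    return cmd


if __name__ == "__main__":
    cmd = build_inference_command(
        "text-to-svg", "prompts.txt", "./output_text",
        model_size="4B", save_all_candidates=True,
    )
    # Print the command; to execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell-quoting problems when paths contain spaces.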
We provide MMSVGBench for standardized evaluation of SVG generation models.
Download MMSVGBench:
huggingface-cli download OmniSVG/MMSVGBench --repo-type dataset --local-dir /PATH/TO/MMSVGBench
MMSVGBench is a purely synthetic benchmark in which all prompts and images are generated using GPT models, so the data is unseen during model training and supports a fair evaluation of generalization. The procedure used to generate MMSVGBench's prompts is logged; see, for example, the text2svg prompt log.
| Task | Complexity Level | Samples | Description |
|---|---|---|---|
| Text-to-SVG | Icon | 150 | Simple icons (1-2 elements) |
| Text-to-SVG | Illustration | 150 | Complex illustrations (1-3 interacting elements) |
| Image-to-SVG | Icon | 150 | GPT-4o generated icon images |
| Image-to-SVG | Illustration | 150 | GPT-4o generated illustration images |
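
As a quick sanity check on the composition in the table above, the split sizes can be tallied per task and in total. This is a small illustrative sketch: the rows are copied from the table, while the names `MMSVGBENCH_SPLITS` and `samples_per_task` are hypothetical and do not reflect the layout of the downloaded dataset.

```python
from collections import Counter

# (task, complexity level, sample count) rows copied from the table above.
MMSVGBENCH_SPLITS = [
    ("Text-to-SVG", "Icon", 150),
    ("Text-to-SVG", "Illustration", 150),
    ("Image-to-SVG", "Icon", 150),
    ("Image-to-SVG", "Illustration", 150),
]


def samples_per_task(splits):
    """Sum the sample counts for each task across complexity levels."""
    totals = Counter()
    for task, _level, count in splits:
        totals[task] += count
    return dict(totals)


if __name__ == "__main__":
    print(samples_per_task(MMSVGBENCH_SPLITS))
    print("total:", sum(count for _, _, count in MMSVGBENCH_SPLITS))
```

Each task therefore covers 300 samples, and the benchmark has 600 samples overall.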
Key Advantages of Synthetic Design:
- True generalization test: models cannot have seen these samples during training
- Controlled diversity: systematic coverage of styles and semantic categories
- Fairness: no model has an unfair advantage from training data overlap
The evaluation code is available in the metrics directory. For more details about MMSVGBench construction and evaluation metrics, please check MMSVGBench.
OmniSVG is licensed under the Apache License 2.0, while the MMSVG dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License. You can find the license files in the respective GitHub and Hugging Face repositories.
@article{yang2025omnisvg,
title={OmniSVG: A Unified Scalable Vector Graphics Generation Model},
author={Yiying Yang and Wei Cheng and Sijin Chen and Xianfang Zeng and Jiaxu Zhang and Liao Wang and Gang Yu and Xinjun Ma and Yu-Gang Jiang},
journal={arXiv preprint arXiv:2504.06263},
year={2025}
}
We thank the following excellent open-source works:
IconShop: the first advanced work that leverages LLMs to generate monochrome, icon-level SVGs. We referred to its parametric implementation.
Here is the list of highly related concurrent works:
LLM4SVG: treats SVG coordinates as number strings and predicts the decimal part for higher spatial accuracy.
StarVector: equips an LLM with an image encoder for Image-to-SVG generation.