This project is a prototype of an end-to-end pipeline that generates short, multi-scene videos from a single text prompt. It leverages a series of state-of-the-art, open-source AI models, each specialized for a different part of the creative process, all wrapped in a simple, user-friendly web interface.
The core workflow is designed to mimic a real production studio:
- The Director (LLM): Creates a script and storyboard.
- The Art Department (Text-to-Image): Generates concept art for each scene.
- The Animation Studio (Image-to-Video): Animates the concept art into video clips.
- Multi-Step Guided UI: A tabbed web interface (Gradio) guides the user through the video creation process.
- State-of-the-Art Models: Uses separate models for LLM, image generation, and video generation for best results.
- Modular Architecture: Clear folder/module separation so components can be swapped or upgraded easily.
- Automated Pipeline: Outputs from one model feed into the next with minimal manual overhead (see the sketch after this list).
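The hand-off between these stages can be pictured as a simple orchestration loop. The sketch below is illustrative only: the function and class names (`generate_storyboard`, `generate_scene_image`, `animate_image`, `Scene`) are hypothetical placeholders, not the actual API of this repo.

```python
# Conceptual sketch of the stage hand-off. All names here are hypothetical
# placeholders, not the actual API of this repo.
from dataclasses import dataclass


@dataclass
class Scene:
    description: str   # shot description written by the LLM
    camera_move: str   # recommended camera move / timing


def generate_storyboard(prompt: str) -> list[Scene]:
    """The Director: the LLM expands the prompt into scenes (stub)."""
    raise NotImplementedError


def generate_scene_image(description: str) -> str:
    """The Art Department: text-to-image returns a concept-art path (stub)."""
    raise NotImplementedError


def animate_image(image_path: str, camera_move: str) -> str:
    """The Animation Studio: image-to-video returns a clip path (stub)."""
    raise NotImplementedError


def create_video(prompt: str) -> list[str]:
    """Chain the three stages: storyboard -> images -> clips."""
    clips = []
    for scene in generate_storyboard(prompt):
        image_path = generate_scene_image(scene.description)
        clips.append(animate_image(image_path, scene.camera_move))
    return clips
```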
| Step 1: Storyboard Generation | Step 2: Image Generation |
|---|---|
| Step 3: Video Clip Generation | Final Output Example |
- Director LLM: `Meta-Llama-3.1-8B-Instruct`
- Image Generation: `black-forest-labs/FLUX.1-dev`
- Video Generation: `Wan-AI/Wan2.2-T2V-A14B`
- Web Framework: Gradio
- Core Libraries: `transformers`, `diffusers`, `torch` (see the loading sketch after this list)
- GPU Backend: AMD ROCm
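As a rough illustration of how the first two models plug into those libraries, the snippet below loads them with standard `transformers` and `diffusers` calls. It is a minimal sketch, assuming the models have already been downloaded to the local paths described in the setup steps; it is not the repo's exact loading code, and the Wan2.2 video model is driven through its own repo (see the download notes below), so it is omitted here.

```python
# Minimal loading sketch -- not the repo's exact code. Assumes the models
# live under ./models/ as shown in the setup section.
import torch
from diffusers import FluxPipeline
from transformers import pipeline

# Director LLM (Llama 3.1 8B Instruct) via the transformers text-generation pipeline.
director = pipeline(
    "text-generation",
    model="./models/llama-3.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Art department (FLUX.1-dev) via diffusers. ROCm builds of PyTorch expose
# the AMD GPU through the same "cuda" device name.
artist = FluxPipeline.from_pretrained(
    "./models/flux1",
    torch_dtype=torch.bfloat16,
).to("cuda")
```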
These steps assume you're working on a Linux machine with an AMD GPU and have `git` installed.
```bash
git clone https://github.com/YourUsername/ai-video-creator.git
cd ai-video-creator
```
Create and activate a virtual environment, then install dependencies:
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Tip: Use `conda` or `pipx` if you prefer those tools; adapt the steps accordingly.
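Whichever tool you use, a quick import check confirms the core libraries are available. This snippet only prints versions; the package names are taken from the tech stack listed above.

```python
# Sanity check: run inside the activated environment after installing requirements.
import diffusers
import gradio
import torch
import transformers

print("torch       :", torch.__version__)
print("transformers:", transformers.__version__)
print("diffusers   :", diffusers.__version__)
print("gradio      :", gradio.__version__)
```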
The models are not included in this repo. Run the provided scripts to download them:
```bash
# Download the Llama 3.1 LLM
python download_llm.py

# Download the FLUX.1 Image Model
python download_image_model.py

# Download the Wan2.2 Video Model
# Note: Wan2.2 may require cloning a separate repo first — see the Wan2.2 repo page.
```
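If you want to see roughly what such a download script does (or adapt it to a different cache location), a `huggingface_hub` snapshot download is the typical pattern. This is a hedged sketch, not the scripts' actual contents; both Llama 3.1 and FLUX.1-dev are gated on Hugging Face, so you must accept their licenses and authenticate (e.g. `huggingface-cli login`) first.

```python
# Rough equivalent of a download script -- not necessarily the exact contents
# of download_llm.py / download_image_model.py.
from huggingface_hub import snapshot_download

# Gated repos require an accepted license and a valid Hugging Face token.
snapshot_download(
    repo_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
    local_dir="./models/llama-3.1",
)
snapshot_download(
    repo_id="black-forest-labs/FLUX.1-dev",
    local_dir="./models/flux1",
)
```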
Place the downloaded model folders inside `./models/` (e.g., `./models/llama-3.1/`, `./models/flux1/`, `./models/wan2.2/`). The repo includes `.gitkeep` placeholders demonstrating the expected structure.
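A quick pre-flight check that the layout matches the example paths above (the folder names are the ones used throughout this README):

```python
# Optional pre-flight check for the example model folders.
from pathlib import Path

expected = ["llama-3.1", "flux1", "wan2.2"]
missing = [name for name in expected if not (Path("models") / name).is_dir()]
print("Missing model folders:", ", ".join(missing) if missing else "none")
```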
Wan2.2 has its own dependencies. Install them inside the same virtual environment (or a dedicated one):
```bash
cd Wan2.2
pip install -r requirements.txt
cd ..
```
Note: You may need system-level packages (FFmpeg, libsndfile, build tools). Follow Wan2.2 repo instructions if you encounter errors.
If you store models outside `./models`, create a `.env` or update `config/*.yaml` with the correct paths. Example `.env`:
```ini
MODEL_DIR=./models
LLM_PATH=./models/llama-3.1
IMAGE_MODEL_PATH=./models/flux1
VIDEO_MODEL_PATH=./models/wan2.2
```
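How the app consumes these values depends on `app.py` and the config files; the snippet below is only an assumed pattern using `python-dotenv`, with fallbacks matching the defaults above.

```python
# Hypothetical example of reading the .env values -- the actual app may read
# config/*.yaml instead. Requires the python-dotenv package.
import os

from dotenv import load_dotenv

load_dotenv()  # loads .env from the current working directory

LLM_PATH = os.getenv("LLM_PATH", "./models/llama-3.1")
IMAGE_MODEL_PATH = os.getenv("IMAGE_MODEL_PATH", "./models/flux1")
VIDEO_MODEL_PATH = os.getenv("VIDEO_MODEL_PATH", "./models/wan2.2")
```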
Start the app from the project root:
```bash
python app.py
```
The app will load the models (this may take several minutes). Gradio will print a local URL such as `http://localhost:7860` and, optionally, a public `*.gradio.live` sharing URL.
Important: Model loading is resource-intensive. On ROCm setups, ensure PyTorch/ROCm and drivers are installed and tested first.
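A quick way to confirm the ROCm build of PyTorch can see the GPU before launching:

```python
# Verify that the ROCm build of PyTorch sees the AMD GPU.
import torch

print("PyTorch version :", torch.__version__)
print("ROCm/HIP version:", getattr(torch.version, "hip", None))  # None on non-ROCm builds
print("GPU available   :", torch.cuda.is_available())  # ROCm GPUs use the CUDA API
if torch.cuda.is_available():
    print("Device          :", torch.cuda.get_device_name(0))
```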
- Welcome Tab — Overview + quick-start instructions.
- Step 1: Storyboard Tab — Enter a video idea and click Generate Storyboard. The LLM outputs scenes, shot descriptions, and recommended camera moves/timings.
- Step 2: Images Tab — Generate and review concept art for each scene. Regenerate or refine prompts as needed.
- Step 3: Videos Tab — Render short video clips from the images. Preview, download, and iterate. (A minimal wiring sketch of this tabbed layout follows below.)
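The tab structure maps directly onto Gradio's `Blocks` and `Tab` components. The sketch below shows only the general wiring; the handler functions are hypothetical stand-ins, not the callbacks actually defined in `app.py`.

```python
# Structural sketch of the tabbed UI. The handlers are hypothetical placeholders,
# not the actual callbacks in app.py.
import gradio as gr


def make_storyboard(idea: str) -> str:           # placeholder handler
    return f"Storyboard for: {idea}"


def make_images(storyboard_text: str) -> list:   # placeholder handler
    return []


def make_video() -> None:                        # placeholder handler
    # In the real app, the generated scene images feed this step.
    return None


with gr.Blocks(title="AI Video Creator") as demo:
    with gr.Tab("Welcome"):
        gr.Markdown("Overview and quick-start instructions go here.")

    with gr.Tab("Step 1: Storyboard"):
        idea = gr.Textbox(label="Video idea")
        storyboard = gr.Textbox(label="Storyboard", lines=10)
        gr.Button("Generate Storyboard").click(make_storyboard, inputs=idea, outputs=storyboard)

    with gr.Tab("Step 2: Images"):
        gallery = gr.Gallery(label="Concept art")
        gr.Button("Generate Images").click(make_images, inputs=storyboard, outputs=gallery)

    with gr.Tab("Step 3: Videos"):
        clip = gr.Video(label="Rendered clip")
        gr.Button("Generate Videos").click(make_video, inputs=None, outputs=clip)

demo.launch()  # prints the local URL and, with share=True, a *.gradio.live link
```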
Outputs are saved to `./outputs/` with a timestamped folder for each run.
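The exact naming scheme is defined by the app, but a timestamped run folder of roughly this form is what to expect (illustrative only):

```python
# Illustrative only: how a timestamped per-run output folder can be created.
from datetime import datetime
from pathlib import Path

run_dir = Path("outputs") / datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
run_dir.mkdir(parents=True, exist_ok=True)
print("Run outputs will be collected in", run_dir)
```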
If you'd like help setting this up for ROCm, creating `.env`/`config.sample.yaml`, or automating downloads with scripts, open an issue or reach out to ni3.singh.r@gmail.com (replace with your contact).
Generated with ❤️