One-click Windows installer for Z-Image Turbo AI image generation. Optimized for low-VRAM GPUs (4GB+). Features Gradio web UI, automatic setup, and GGUF model support.
Hierarchical RAG architecture scaling to 693K chunks on consumer hardware (4GB VRAM). Features 3-address routing, hybrid vector+graph fusion, and SetFit classification.
The most efficient one-page LoRA trainer for Anima 2B. Optimized for 6GB+ VRAM, featuring a smart dataset analyzer and real-time previews.
Adaptive Hybrid Quantization Framework for deploying 7B+ LLMs on low-VRAM devices (e.g., GTX 1050). Features surgical block alignment and Numba-accelerated inference.
A ComfyUI workflow for low-VRAM users.
Taiwanese Hokkien (Taigi) speech-to-text transcriber - MediaTek Breeze-ASR-26 with faster-whisper, tuned for RTX 3050 4GB low-VRAM GPUs. Gradio UI, CLI, Docker, SRT/VTT/TXT/JSON.
🎥 Generate high-quality videos on budget hardware with the Wan 2.2 14B Low-VRAM Workflow for ComfyUI, optimized for smooth performance and quick results.
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
Perkunas AI Training Platform is a memory-aware model training and serving system for serious language model experimentation under tight hardware limits. It combines streaming training, rich telemetry, guarded recovery, checkpoint export, and OpenAI-compatible serving.
Simple FP16 image upscaler supporting all GPUs, aimed at low- to mid-end hardware.
Notebooks and workflows configured to run Wan 2.2 Animate inference smoothly with ComfyUI on Kaggle T4 GPUs.
Lightweight Stable Diffusion engine with plugin-based pipelines, VRAM-safe execution, and full 4GB GPU support.
Audit local LLM function calling and agentic reliability. Visual tool-use benchmarking for quantized models on YOUR hardware.
A depth-guided, category-conditioned lightweight geometry completion network for indoor furniture reconstruction on consumer low-VRAM GPUs — 35M parameters, 88 MB, 0.5 s per object
Lightweight SDXL LoRA trainer optimized for 8GB VRAM GPUs. GUI with training, auto-captioning (Ollama) and image search (SearXNG).
A privacy-first Generative AI pipeline for prototyping 3D-style game assets on consumer hardware. Optimized for low-VRAM (4GB) GPUs using PyTorch, Diffusers, and Streamlit.
Tiny GPT-style training and local inference demo for consumer hardware.
A production-ready, frugal, sovereign AI system that orchestrates India's open-source language models to achieve state-of-the-art reasoning on consumer hardware through Test-Time Compute (TTC) and Cognitive Serialization.
Technical Showcase: 22B True-MoE Engine running on 6GB VRAM (GTX 1060). Demonstrates "Surgical" NF4 quantization, dynamic expert swapping, and the custom "Grace Hopper" pipeline.
🚀 Run modern 7B LLMs on legacy 4GB GPUs without crashes, breaking the VRAM barrier for developers facing GPU limitations.