Stars
FlowLens is an open-source MCP server that gives your coding agent (Claude Code, Cursor, Copilot, Codex) full browser context for in-depth debugging and regression testing.
A modern self-hosted media management system for your media library
All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.
A central hub for translating stuff into Arabic (Join our Discord Server, if you want to help)
A production-ready FastAPI boilerplate application with a comprehensive set of features for modern web backend development.
Let's make video diffusion practical!
Translate full books and large texts with LLM autonomously
🤗 smolagents: a barebones library for agents that think in code.
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Local media center for those who want to control what they watch
AirLLM 70B inference with single 4GB GPU
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
A general fine-tuning kit geared toward image/video/audio diffusion models.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Bulk-downloads images from Civitai that have at least a certain number of image reactions
An SDK/Python library for Automatic 1111 to run state-of-the-art diffusion models
Ethical shopping for Egyptians. Used by over 150,000 people to date!
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
PLVS is a real-time SLAM system with points, lines, volumetric mapping and 3D unsupervised incremental segmentation.
Compare the performance of different LLMs that can be deployed locally on consumer hardware. Run it yourself with Colab WebUI.
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
Sweep: AI coding assistant for JetBrains
Imaginate is a Gradio app for generating images from initial images and prompts
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation