Stars
On-device, real-time multimodal AI. Have natural voice and vision conversations with an AI that runs entirely on your machine. Powered by Gemma 4 E2B and Kokoro.
This is the official code for the paper "FashionM3: Multimodal, Multitask, and Multiround Fashion Assistant based on Unified Vision-Language Model"
harshitIIITD / fashion-ai
Forked from google-gemini/gemini-cli. An open-source AI agent that brings the power of Gemini directly into your terminal.
Uses the CUA model, the Responses API, and Playwright to search for specific trend information from sites like Pinterest.
A smart fashion web app built with Python (Flask) that recommends personalized outfits based on user profiling and AI analysis. It includes features like real-time virtual try-on with OOTDiffusion,…
An outfit recommendation local app.
👕 Open-source course on architecting, building and deploying a real-time personalized recommender for H&M fashion articles.
Research on Outfit Recommendation Model Based on CNN-Transformer Cross-Modal Fusion
Hierarchical Fashion Graph Network for Personalized Outfit Recommendation, SIGIR 2020
[CVPR 2025] Official implementation of "AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models"
End2End Virtual Try-on with Visual Reference, CVPR2026
A repository for organizing papers, codes and other resources related to Virtual Try-on Models
Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks
[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
Helios: Real Real-Time Long Video Generation Model
An open-source, privacy-focused alternative to NotebookLM for teams with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
PaperBanana: Automating Academic Illustration For AI Scientists
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Translate the video from one language to another and embed dubbing & subtitles.
Tongyi Deep Research, the Leading Open-source Deep Research Agent