Stars
VoiceHub: A Unified Inference Interface for TTS Models
An AI-powered tool for summarizing YouTube videos by generating scene descriptions, translating them, and creating subtitled videos with text-to-speech narration
oztrkoguz / smol-course
Forked from huggingface/smol-course
A course on aligning smol models.
OCR, layout analysis, reading order, table recognition in 90+ languages
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
"MGPT-Langchain-ChatBot-Multi-Functionality-Ollama" is a versatile chatbot framework that uses LangChain to build a context-aware chatbot in Python. This project integrates with Ollama, enabling mu…
This project aims to compare different Retrieval-Augmented Generation (RAG) frameworks in terms of speed and performance.
Agentic components of the Llama Stack APIs
Image Upscaler with Tile ControlNet, Fully Integrated in Hugging Face Diffusers
Dream Interpreter inside ComfyUI
Famous Vision Language Models and Their Architectures
It automatically describes images in PDF files and generates questions from these descriptions. With its advanced RAG structure, it directs these questions directly to PDF text content, providing c…
This project is an automated research and summarization tool that lets users research a specific question, summarize the information found, and present it as a blog post.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
gokayfem / HunyuanDiT
Forked from Tencent-Hunyuan/HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
This project offers a user-friendly interface that allows users to easily create stories and enrich them with visuals. It supports creativity with story creation and visualisation features.
Image identification with Kosmos2 model, drawing and cutting bbox with object detection
Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation