Stars
Stable Diffusion web UI
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Robust Speech Recognition via Large-Scale Weak Supervision
A latent text-to-image diffusion model
A high-throughput and memory-efficient inference and serving engine for LLMs
Ghidra is a software reverse engineering (SRE) framework
Official Electron build of draw.io
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
The definitive Web UI for local AI, with powerful features and easy setup.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Visualizer for neural network, deep learning and machine learning models
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
sketch + style = paints 🎨 (TOG2018/SIGGRAPH2018ASIA)
WebUI extension for ControlNet
Let's make video diffusion practical!
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
An open source implementation of CLIP.
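Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, MNN, and TNN inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); the total model size is only 4.7M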
A native gRPC client & server implementation with async/await support.