Stars
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
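A preprocessing tool built specifically for PDF financial reports of Chinese listed companies. It parses complex PDF documents into structured "parent-child chunks + table groups" JSON data, ready for downstream vector retrieval, question-answering systems, risk-control analysis, and similar uses.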
Wallpaper Engine PKG extractor and TEX-to-image converter
A large-scale face dataset for face parsing, recognition, generation and editing.
Spriteworld: a flexible, configurable Python-based reinforcement learning environment
Burgess et al. "MONet: Unsupervised Scene Decomposition and Representation"
An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"
(CVPR 2025) Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
LaTeX code for making neural network diagrams
Implementation of Slot Attention from GoogleAI
Multi-object image datasets with ground-truth segmentation masks and generative factors.
A modern Fluent Design replacement for the old Metro themed flyouts present in Windows.
Beautiful icons for your favourite terminal apps like Hyper and iTerm2
Play and synthesize MIDI to audio - easy to use Python/CLI API to FluidSynth.
ElevenClock: Customize Windows 11 taskbar clock
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
Large Concept Models: Language modeling in a sentence representation space
A heavily modified lyric editor with a fancy (maybe) WPF-based UI
Kandinsky 2 — multilingual text2image latent diffusion model
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.