Stars
Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.
🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.
Renderer for the harmony response format to be used with gpt-oss
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Qwen3.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
PyFlightProfiler: A diagnostic toolbox for Python applications that provides non-intrusive, low-overhead capabilities for online analysis.
fault-tolerant Python3 package for searching, navigating, and modifying LaTeX documents
💫 Toolkit to help you get started with Spec-Driven Development
Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community
Pure Python library for LaTeX to MathML conversion
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.
Find, verify, and analyze leaked credentials
Tutel MoE: Optimized Mixture-of-Experts Library, supports GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
DFloat11 [NeurIPS '25]: Lossless Compression of LLMs and DiTs for Efficient GPU Inference
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
Audio Dataset for training CLAP and other models
Lexbor is an open source HTML renderer library under development. https://lexbor.com
🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring cool arithmetic magic)
Contains code to scrape the scriptsonscreen scripts website, along with the scraped data
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
An open source platform for Python in the browser. https://pyscript.net Docs: https://docs.pyscript.net/ Try it: https://pyscript.com/ Community: https://discord.gg/HxvBtukrg2
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!