Alibaba
Beijing (UTC +08:00)
https://www.zhihu.com/people/wang-zhao-de-6
Stars
AI coding CLI conversation visualizer - supports Claude Code, Codex, Gemini CLI, OpenCode
Memory Sparse Attention - A scalable, end-to-end trainable latent-memory framework for 100M-token contexts.
Train the smallest LM you can that fits in 16MB. Best model wins!
A light-weight self-hosted AI API gateway that proxies requests to multiple backend providers (OpenAI, Anthropic, Gemini) with user management, quota control, and audit logging.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
JavaScript in-page GUI agent. Control web interfaces with natural language.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.
A lightweight, single-header C++11 Jinja2 template engine for LLM chat templates.
🚀🚀 Large language model: train a small 64M-parameter GPT completely from scratch in just 2 hours! 🌏
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
Curated list of datasets and tools for post-training.
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / U…
Hierarchical Reasoning Model Official Release
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
An open-source AI agent that brings the power of Gemini directly into your terminal.
A third-party MNN server supporting external calls, embedding models, TTS models, and ASR models.
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Open-source high-performance RISC-V processor
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
An LLM deployment project based on MNN. This project has been merged into MNN.