Stars
Models and examples built with TensorFlow
Fast and memory-efficient exact attention
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Muon is an optimizer for hidden layers in neural networks
Java Solutions to problems on LintCode/LeetCode
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Interactive visualization of 5 popular gradient descent methods, with step-by-step illustration and a hyperparameter tuning UI
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
A latent text-to-image diffusion model
Code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
A minimal Vue 2.0 admin template
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
High-Resolution Image Synthesis with Latent Diffusion Models
100+ Chinese Word Vectors: over a hundred kinds of pretrained Chinese word vectors
Code for ACL 2021 paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information"