wkcn
🐳
Tell Your World 🎵
  • China

Organizations

@apache @dmlc @MiraiTeam @SYSU-IARC

Stars

MLSys

34 repositories

Mesh TensorFlow: Model Parallelism Made Easier

Python 1,624 255 Updated Nov 17, 2023

Development repository for the Triton language and compiler

MLIR 18,843 2,730 Updated Apr 5, 2026

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,870 250 Updated Mar 25, 2026

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,686 2,248 Updated Dec 1, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,988 4,775 Updated Apr 3, 2026

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 19,756 3,806 Updated Apr 5, 2026

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 32,156 4,025 Updated Apr 5, 2026

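AutoKernel is an easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.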

C++ 747 82 Updated Sep 23, 2022

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

C++ 1,003 167 Updated Sep 19, 2024

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 9,391 1,015 Updated Dec 4, 2025

An Open Source Machine Learning Framework for Everyone

C++ 194,463 75,255 Updated Apr 5, 2026

A hyperparameter optimization framework

Python 13,852 1,291 Updated Apr 3, 2026

Bolt is a deep learning library with high performance and heterogeneous flexibility.

C++ 957 164 Updated Apr 11, 2025

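PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)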

C++ 23,805 5,981 Updated Apr 5, 2026

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 18,217 3,994 Updated Apr 4, 2026

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its…

C++ 4,631 772 Updated May 9, 2025

Reliable Allreduce and Broadcast Interface for distributed machine learning

C++ 513 180 Updated Nov 5, 2020

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 41,955 7,406 Updated Apr 5, 2026

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

Python 3,215 321 Updated Mar 27, 2026

C++ 291 73 Updated Jan 18, 2021

Tengine is a lite, high performance, modular inference engine for embedded devices

C++ 4,513 977 Updated Mar 6, 2025

Header-only, dependency-free deep learning framework in C++14

C++ 6,020 1,396 Updated Apr 17, 2022

dabnn is an accelerated binary neural network inference framework for mobile platforms

C++ 778 102 Updated Nov 12, 2019

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 28,226 8,863 Updated Mar 31, 2026

A lightweight parameter server interface

C++ 1,562 546 Updated Mar 2, 2026

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 35,303 3,504 Updated Apr 5, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 98,811 27,405 Updated Apr 5, 2026

Caffe: a fast open framework for deep learning.

C++ 34,751 18,529 Updated Jul 31, 2024

Open Machine Learning Compiler Framework

Python 13,249 3,848 Updated Apr 5, 2026

(New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet

C++ 351 94 Updated Nov 18, 2019