MLSys
Mesh TensorFlow: Model Parallelism Made Easier
Development repository for the Triton language and compiler
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
You like pytorch? You like micrograd? You love tinygrad! ❤️
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
An Open Source Machine Learning Framework for Everyone
Bolt is a deep learning library with high performance and heterogeneous flexibility.
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)
A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its…
Reliable Allreduce and Broadcast Interface for distributed machine learning
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
Header-only, dependency-free deep learning framework in C++14
dabnn is an accelerated binary neural network inference framework for mobile platforms
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Tensors and Dynamic neural networks in Python with strong GPU acceleration
(New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
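The JAX entry above names its three core composable transformations; a minimal sketch (assuming JAX is installed, with an illustrative quadratic function of my own choosing) of how they combine:

```python
import jax
import jax.numpy as jnp

def loss(w):
    # A toy quadratic "loss" used purely for illustration.
    return jnp.sum(w ** 2)

grad_loss = jax.grad(loss)        # differentiate
fast_grad = jax.jit(grad_loss)    # JIT-compile via XLA (CPU/GPU/TPU)
batched = jax.vmap(grad_loss)     # vectorize over a leading batch axis

w = jnp.array([1.0, 2.0, 3.0])
print(fast_grad(w))               # gradient of sum(w**2) is 2*w -> [2. 4. 6.]
print(batched(jnp.stack([w, w]))) # same gradient computed per batch row
```

The transformations take and return ordinary Python functions, so they nest freely, e.g. `jax.jit(jax.vmap(jax.grad(loss)))`.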