- Alibaba Group
- Hangzhou ⇌ Hong Kong
- UTC +08:00
- https://www.lingyunyang.com/
- https://orcid.org/0000-0002-3186-3189
- @stephenyang1999
- in/stephenyang1999
Starred repositories
An Open Source Machine Learning Framework for Everyone
Protocol Buffers - Google's data interchange format
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
Productive, portable, and performant GPU programming in Python.
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
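PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)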
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
FoundationDB - the open source, distributed, transactional key-value store
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Transformer-related optimization, including BERT and GPT
A flexible, high-performance serving system for machine learning models
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
🚀 A very efficient Texas Holdem GTO solver
Tutorial code on how to build your own Deep Learning System in 2k Lines
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
MSCCL++: A GPU-driven communication stack for scalable AI applications
heterogeneity-aware-lowering-and-optimization
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
An efficient GPU resource sharing system with fine-grained control for Linux platforms.
This repository is an archive. Refer to https://github.com/gvirtus/GVirtuS