
Organizations

@apache @awslabs @amzn @dmlc @data-apis

42 stars written in C++

Open Source Computer Vision Library

C++ 86,839 56,547 Updated Mar 27, 2026

Caffe: a fast open framework for deep learning.

C++ 34,765 18,535 Updated Jul 31, 2024

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 28,189 8,860 Updated Mar 25, 2026

Productive, portable, and performant GPU programming in Python.

C++ 28,111 2,379 Updated Jan 5, 2026

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

C++ 20,815 6,718 Updated Oct 25, 2023

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 19,691 3,798 Updated Mar 30, 2026

UPX - the Ultimate Packer for eXecutables

C++ 17,308 1,499 Updated Mar 29, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,544 1,007 Updated Feb 6, 2026

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,721 1,332 Updated Mar 29, 2026

Turi Create simplifies the development of custom machine learning models.

C++ 11,181 1,131 Updated Nov 1, 2023

cuDF - GPU DataFrame Library

C++ 9,590 1,034 Updated Mar 27, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,502 1,757 Updated Mar 24, 2026

A Python-embedded modeling language for convex optimization problems.

C++ 6,160 1,170 Updated Mar 29, 2026

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,655 660 Updated Mar 24, 2026

cuML - RAPIDS Machine Learning Library

C++ 5,163 620 Updated Mar 25, 2026

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation

C++ 4,805 549 Updated Oct 24, 2024

A tool for use with clang to analyze #includes in C and C++ source files

C++ 4,649 416 Updated Mar 29, 2026

A heap memory profiler for Linux

C++ 4,034 234 Updated Mar 25, 2026

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,680 871 Updated Mar 29, 2026

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,302 333 Updated May 16, 2023

🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.

C++ 2,985 243 Updated Dec 3, 2019

A C++ remake of the game 《金庸群侠传》 (Heroes of Jin Yong), now complete

C++ 2,859 401 Updated Mar 30, 2026

The C++ Standard Library for Parallelism and Concurrency

C++ 2,816 540 Updated Mar 29, 2026

An efficient video loader for deep learning with smart shuffling that's super easy to digest

C++ 2,453 221 Updated Jul 17, 2024

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,307 191 Updated Feb 7, 2024

Circuit IR Compilers and Tools

C++ 2,074 445 Updated Mar 28, 2026

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,773 670 Updated Mar 28, 2026
C++ 1,652 277 Updated Sep 11, 2018

C-Reduce, a C and C++ program reducer

C++ 1,647 139 Updated Jun 1, 2024

A lightweight parameter server interface

C++ 1,562 546 Updated Mar 2, 2026