Organizations

@apache @awslabs @amzn @dmlc @data-apis

42 stars written in C++

Open Source Computer Vision Library

C++ 85,347 56,407 Updated Dec 18, 2025

Caffe: a fast open framework for deep learning.

C++ 34,775 18,584 Updated Jul 31, 2024

Productive, portable, and performant GPU programming in Python.

C++ 27,809 2,374 Updated Oct 6, 2025

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,758 8,829 Updated Dec 18, 2025

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

C++ 20,826 6,746 Updated Oct 25, 2023

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 18,713 3,605 Updated Dec 19, 2025

UPX - the Ultimate Packer for eXecutables

C++ 16,885 1,472 Updated Dec 18, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,923 918 Updated Dec 15, 2025

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,518 1,314 Updated Dec 18, 2025

Turi Create simplifies the development of custom machine learning models.

C++ 11,194 1,136 Updated Nov 1, 2023

cuDF - GPU DataFrame Library

C++ 9,389 993 Updated Dec 19, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,984 1,586 Updated Dec 19, 2025

A Python-embedded modeling language for convex optimization problems.

C++ 6,032 1,135 Updated Dec 18, 2025

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,578 655 Updated Dec 18, 2025

cuML - RAPIDS Machine Learning Library

C++ 5,062 609 Updated Dec 19, 2025

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation

C++ 4,807 550 Updated Oct 24, 2024

A tool for use with clang to analyze #includes in C and C++ source files

C++ 4,572 408 Updated Dec 18, 2025

A heap memory profiler for Linux

C++ 3,892 233 Updated Dec 3, 2025

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 3,517 810 Updated Dec 19, 2025

LightSeq: A High Performance Library for Sequence Processing and Generation

C++ 3,301 333 Updated May 16, 2023

🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.

C++ 2,985 243 Updated Dec 3, 2019

A C++ remake of the game 《金庸群侠传》 (Heroes of Jin Yong); development is complete

C++ 2,787 393 Updated Sep 29, 2025

The C++ Standard Library for Parallelism and Concurrency

C++ 2,762 494 Updated Dec 18, 2025

An efficient video loader for deep learning with smart shuffling that's super easy to digest

C++ 2,383 215 Updated Jul 17, 2024

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,307 191 Updated Feb 7, 2024

Circuit IR Compilers and Tools

C++ 1,981 402 Updated Dec 19, 2025

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,699 628 Updated Dec 17, 2025
C++ 1,655 280 Updated Sep 11, 2018

C-Reduce, a C and C++ program reducer

C++ 1,629 137 Updated Jun 1, 2024

A lightweight parameter server interface

C++ 1,557 549 Updated Jan 11, 2023