bbshocking

11 followers · 30 following

Stars

nebuly-ai / optimate

A collection of libraries to optimise AI model performance

Python 8,366 632 Updated Jul 22, 2024

NVIDIA / libnvidia-container

NVIDIA container runtime library

C 1,031 247 Updated Nov 4, 2025

Uniswap / v3-core

🦄 🦄 🦄 Core smart contracts of Uniswap v3

TypeScript 4,860 2,992 Updated Nov 3, 2024

PersiaML / PERSIA

High performance distributed framework for training deep learning recommendation models based on PyTorch.

Rust 409 55 Updated Jun 14, 2025

EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,325 1,090 Updated Sep 26, 2025

F-Stack / f-stack

F-Stack is a high-performance user-space network development kit based on DPDK, the FreeBSD TCP/IP stack, and a coroutine API.

C 4,137 942 Updated Nov 5, 2025

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

Python 41,221 4,536 Updated Oct 13, 2025

ParCoreLab / ComScribe

ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system.

C++ 27 4 Updated Jul 6, 2023

milvus-io / milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Go 38,280 3,495 Updated Nov 5, 2025

NVIDIA / nvcomp

Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.

C++ 598 89 Updated Sep 11, 2024

l-nic / chipyard

Forked from ucb-bar/chipyard

An Agile Chisel-Based SoC Design Framework

Scala 26 2 Updated Dec 29, 2021

sohutv / hotcaffeine

Hot coffee

JavaScript 188 9 Updated Feb 2, 2023

kaiyuyue / torchshard

Slicing a PyTorch Tensor Into Parallel Shards

Python 301 15 Updated Jun 7, 2025

gabaker / TARUC_Bench

A benchmark for testing PCIe and host/device memory bandwidth and communication contention on multi-GPU and multi-CPU systems.

C++ 9 1 Updated Jun 9, 2016

intelxed / xed

The X86 Encoder Decoder (XED) is a software library for encoding and decoding X86 (IA32 and Intel64) instructions

Python 1,523 164 Updated Jun 11, 2025

flame / blis

BLAS-like Library Instantiation Software Framework

C 2,547 402 Updated Oct 21, 2025

curtisseizert / CUDA-uint128

A 128 bit unsigned integer class for CUDA

C++ 46 17 Updated Jan 3, 2025

ceph / cbt

The Ceph Benchmarking Tool

Python 299 146 Updated Oct 9, 2025

ververica / flink-sql-benchmark

Java 106 53 Updated Jul 20, 2023

onnx / onnx-tensorrt

ONNX-TensorRT: TensorRT backend for ONNX

C++ 3,165 543 Updated Sep 8, 2025

pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,468 674 Updated Nov 5, 2025

onnx / onnx-tensorflow

Tensorflow Backend for ONNX

Python 1,325 298 Updated Mar 28, 2024

NVIDIA / cuda-samples

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 8,388 2,170 Updated Sep 5, 2025

triton-inference-server / model_analyzer

Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.

Python 495 80 Updated Nov 5, 2025

BDHU / CUDA_Device_Attribute_Generation

Automatically generate a C++ header file including CUDA device-specific parameters

C++ 3 Updated Jul 1, 2020

uber / aresdb

A GPU-powered real-time analytics storage and query engine.

Go 3,067 235 Updated Jul 13, 2024

yuhc / gpu-rodinia

Rodinia benchmark

C 189 103 Updated Apr 14, 2023

bytedance / effective_transformer

Running BERT without Padding

C++ 475 53 Updated Mar 18, 2022

virtual-kubelet / virtual-kubelet

Virtual Kubelet is an open source Kubernetes kubelet implementation.

Go 4,439 651 Updated Nov 3, 2025

apache / brpc

brpc is an industrial-grade RPC framework using the C++ language, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" mea…

C++ 17,359 4,066 Updated Nov 5, 2025
