Stars
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
AI wearables. Put it on, speak, transcribe, automatically
Tutel MoE: Optimized Mixture-of-Experts Library, supporting GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
Optimizing Mobile Deep Learning on ARM GPU with TVM
Yahoo! Cloud Serving Benchmark in C++, a C++ version of YCSB (https://github.com/brianfrankcooper/YCSB/wiki)
Human activity recognition on the STM32F401C-DISCO using a convolutional neural network
A Microsoft-SEAL-compatible implementation of homomorphic encryption targeting Azure Sphere and other embedded devices.
Johnny Cache: the End of DRAM Cache Conflicts (in Tiered Main Memory Systems)
This is the official implementation of the paper "Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving" (MobiCom 2024).
yaoyaoding / cuda-samples
Forked from NVIDIA/cuda-samples. Samples for CUDA developers that demonstrate features in the CUDA Toolkit