- Sunnyvale, CA
-
FasterTransformer Public
Forked from NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
C++ Apache License 2.0 Updated Sep 22, 2025
-
unsloth Public
Forked from unslothai/unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Python Apache License 2.0 Updated Aug 22, 2025
-
open-r1 Public
Forked from huggingface/open-r1
Fully open reproduction of DeepSeek-R1
Python Apache License 2.0 Updated Aug 4, 2025
-
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Apr 27, 2025
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Feb 6, 2025
-
neural-speed Public
Forked from intel/neural-speed
An innovation library for efficient LLM inference via low-bit quantization and sparsity
C++ Apache License 2.0 Updated Feb 27, 2024
-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Dec 15, 2023
-
bitsandbytes Public
Forked from bitsandbytes-foundation/bitsandbytes
8-bit CUDA functions for PyTorch
Python MIT License Updated Sep 24, 2023
-
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Sep 22, 2023
-
llama Public
Forked from meta-llama/llama
Inference code for LLaMA models
Python Other Updated Aug 24, 2023
-
whisper Public
Forked from openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Python MIT License Updated Apr 17, 2023
-
optimum Public
Forked from huggingface/optimum
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools
Python Apache License 2.0 Updated Mar 15, 2023
-
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Nov 7, 2022
-
diffusers Public
Forked from huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
Python Apache License 2.0 Updated Sep 26, 2022
-
onnxruntime Public
Forked from microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance scoring engine for ML models
C++ MIT License Updated May 16, 2022
-
Open Neural Network Exchange
-
mmperf Public
Forked from mmperf/mmperf
MatMul Performance Benchmarks for a Single CPU Core comparing both hand-engineered and codegen kernels.
C++ Apache License 2.0 Updated Feb 2, 2022
-
tutorials Public
Forked from onnx/tutorials
Tutorials for creating and using ONNX models
Jupyter Notebook MIT License Updated Jan 30, 2020
-
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
C++ Other Updated Sep 26, 2019
-
Windows-Machine-Learning Public
Forked from microsoft/Windows-Machine-Learning
Samples for Windows ML.
MIT License Updated Jun 13, 2018