- llmc (Public)
  Forked from ModelTC/LightCompress. llmc is an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.
  Python · Apache License 2.0 · Updated Apr 17, 2025
- awesome-lm-system (Public)
  Forked from ModelTC/awesome-lm-system. Summary of system papers/frameworks/codes/tools on training or serving large model
  Apache License 2.0 · Updated Jul 4, 2023
- FasterTransformer (Public)
  Forked from NVIDIA/FasterTransformer. Transformer related optimization, including BERT, GPT
  C++ · Apache License 2.0 · Updated Feb 8, 2023
- DeepSpeedExamples (Public)
  Forked from deepspeedai/DeepSpeedExamples. Example models using DeepSpeed
  Python · MIT License · Updated Jan 14, 2023
- fairseq (Public)
  Forked from facebookresearch/fairseq. Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
  Python · MIT License · Updated Aug 22, 2022
- MQBench (Public)
  Forked from ModelTC/MQBench. Model Quantization Benchmark
  Shell · Apache License 2.0 · Updated Aug 12, 2022
- TensorRT (Public)
  Forked from NVIDIA/TensorRT. TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
  C++ · Apache License 2.0 · Updated Jun 24, 2022
- Model-Compression-Research-Package (Public)
  Forked from ofirzaf/Model-Compression-Research-Package. A library for researching neural networks compression and acceleration methods.
  Python · Apache License 2.0 · Updated Dec 13, 2021