- ByteDance Seed
- Singapore
Stars
Making large AI models cheaper, faster and more accessible
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
[EMNLP'23, ACL'24] Compresses the prompt and KV cache to speed up LLM inference and improve the model's perception of key information, achieving up to 20x compression with minimal performance loss.
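A Python module that extracts the province, city, and district from Simplified Chinese strings and supports mapping, validation, and simple plotting.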
ByteCheckpoint: A Unified Checkpointing Library for LFMs
PyTorch implementation and reproduction of 3DCvT for lip reading, with training, evaluation, and inference on LRW and LRW-1000.