Making large AI models cheaper, faster and more accessible

Python 41,298 4,546 Updated Dec 8, 2025

The official Meta Llama 3 GitHub site

Python 29,145 3,501 Updated Jan 26, 2025

Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.

Python 25,830 1,813 Updated Oct 13, 2025

A repository collecting the literature on long-context large language models, including methodologies and evaluation benchmarks

Jupyter Notebook 269 11 Updated Jul 30, 2024

Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]

Python 110 9 Updated Feb 20, 2025

ChatLaw: A Powerful LLM Tailored for Chinese Legal. A large language model for Chinese law.

7,395 596 Updated Jan 4, 2025

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

Python 1,463 85 Updated Nov 7, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 2 5 Updated Aug 26, 2024

allreduce benchmark using deepspeed

Shell 4 2 Updated Mar 21, 2025

Ongoing research training transformer models at scale

Python 394 51 Updated Aug 20, 2024

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,169 216 Updated Oct 8, 2024

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,116 31,502 Updated Dec 21, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,052 4,669 Updated Dec 19, 2025

Inference code for Llama models

Python 58,998 9,816 Updated Jan 26, 2025

Code for the paper "Evaluating Large Language Models Trained on Code"

Python 3,061 426 Updated Jan 17, 2025

An annotated implementation of the Transformer paper.

Jupyter Notebook 6,854 1,472 Updated Apr 7, 2024

🦜🔗 The platform for reliable agents.

Python 122,379 20,173 Updated Dec 20, 2025

A collection of some interesting code

Python 77 32 Updated Oct 29, 2023

http request/response parser for c

C 6,429 1,532 Updated Jun 19, 2022