Stars
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Neural Networks: Zero to Hero
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end-to-end problems using Llama mode…
StableLM: Stability AI Language Models
llama3 implementation one matrix multiplication at a time
Public-facing notes page
Official inference library for Mistral models
A series of large language models trained from scratch by developers @01-ai
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Implement a reasoning LLM in PyTorch from scratch, step by step
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
OLMoE: Open Mixture-of-Experts Language Models
Llama from scratch, or How to implement a paper without crying
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
LLM Chess - evaluating Large Language Models' reasoning and instruction-following abilities by simulating chess games
Train and run a small Llama 2 model from scratch on the TinyStories dataset.
Load larger models by offloading model layers to both GPU and CPU
Simple Single Neuron Neural Network
Example of Sentiment Analysis using TensorFlow and BERT