Stars
neural networks don't minimize loss [caution: probably due to batchnorm]
NanoGPT speedrun in JAX. Originally at https://nor-git.pages.dev/modded-nanogpt-jax/
RWKV / RWKV-LM
Forked from BlinkDL/RWKV-LM. RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it combines the best of RNNs and transformers - great performance, fast inference,…
BlinkDL / nanoRWKV
Forked from karpathy/nanoGPT. RWKV in nanoGPT style
Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting
A visual playground for agentic workflows: Iterate over your agents 10x faster
This is the official repository for Inheritune.
Implementation and evaluation of the "Scaling Embedding Layers in Language Models" research paper
Code for Bolmo: Byteifying the Next Generation of Language Models
wolfecameron / nanoMoE
Forked from karpathy/nanoGPT. An extension of the nanoGPT repository for training small MoE models.
From-scratch implementation of a sparse mixture-of-experts language model inspired by Andrej Karpathy's makemore :)
Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)
Official implementation for Text Generation Beyond Discrete Token Sampling
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
[CVPR2025] Official Implementation of ViStream: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
Data and code for the paper "Quantum Transformer: Accelerating model inference via quantum linear algebra"
The original code for the paper "Towards a Holistic Framework for Multimodal LLM in 3D Brain CT Radiology Report Generation"
Official repository of SpikeZIP-TF (ICML 2024)
NdLinear by Ensemble is a drop-in PyTorch module that shrinks your models with no accuracy loss. It powers the Ensemble Platform—upload any model and get back a smaller, faster version, ready to de…
An experiment that applies Google Research's `ReasoningBank` technique to Small Language Models. This experiment hopes to show that the same gains from the ReasoningBank paper also apply to much …