neural networks don't minimize loss [caution: probably due to batchnorm]

Python 3 Updated Aug 28, 2024

NanoGPT speedrun in JAX. Originally at https://nor-git.pages.dev/modded-nanogpt-jax/

Python 8 3 Updated Aug 28, 2025
Python 12 Updated Mar 20, 2025

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…

Python 58 5 Updated Mar 17, 2025

RWKV in nanoGPT style

Python 197 11 Updated Jun 9, 2024
Python 230 16 Updated Dec 2, 2024

Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting

Python 5 3 Updated Feb 12, 2025

Stick-breaking attention

Python 62 5 Updated Jul 1, 2025

A visual playground for agentic workflows: Iterate over your agents 10x faster

TypeScript 5,624 420 Updated Jul 20, 2025

This is the official repository for Inheritune.

Python 117 10 Updated Feb 10, 2025

Gumini 1B - 1.5B Benchmark Report

HTML 1 Updated Dec 17, 2025

Implementation and evaluation of Scaling Embedding Layers in Language Models research paper

Python 8 2 Updated Mar 1, 2025

Code for Bolmo: Byteifying the Next Generation of Language Models

Python 81 8 Updated Dec 15, 2025

An extension of the nanoGPT repository for training small MOE models.

Python 219 26 Updated Mar 9, 2025

From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Jupyter Notebook 778 91 Updated Oct 30, 2024
Python 2 Updated Oct 12, 2025

Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)

21 2 Updated Jun 2, 2025

Official implementation for Text Generation Beyond Discrete Token Sampling

Python 20 2 Updated Aug 11, 2025

s1: Simple test-time scaling

Python 6,615 764 Updated Jun 25, 2025
Python 25 Updated Dec 10, 2025

📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.

315 6 Updated Nov 5, 2025

OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.

Python 174 23 Updated Jan 16, 2025

[CVPR2025] Official Implementation of ViStream: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network

Python 4 Updated Oct 27, 2025

Data and code for paper Quantum Transformer: Accelerating model inference via quantum linear algebra

Jupyter Notebook 1 Updated Jun 16, 2025

The original code for paper "Towards a Holistic Framework for Multimodal LLM in 3D Brain CT Radiology Report Generation"

Python 45 4 Updated Apr 24, 2025

Official repository of SpikeZIP-TF in ICML2024

Python 47 13 Updated Dec 4, 2024

NdLinear by Ensemble is a drop-in PyTorch module that shrinks your models with no accuracy loss. It powers the Ensemble Platform—upload any model and get back a smaller, faster version, ready to de…

Python 299 19 Updated Jun 4, 2025

An experiment that applies Google Research's `ReasoningBank` technique to Small Language Models. This experiment hopes to show that the same gains from the ReasoningBank paper also apply to much …

Python 77 9 Updated Oct 14, 2025
Python 302 40 Updated Aug 7, 2025