💻
Improving the web

Sponsors

@generaltranslation

Stars

groovy AI

17 repositories

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,986 2,216 Updated Jul 29, 2024

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

Python 1,479 97 Updated May 1, 2025

Jupyter Notebook 8,756 626 Updated Oct 25, 2025

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,905 368 Updated Dec 7, 2024

Structured Outputs

Python 13,143 658 Updated Dec 12, 2025

🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data

TypeScript 70,508 5,537 Updated Dec 22, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 81,437 12,178 Updated Dec 21, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

70,853 8,110 Updated Dec 22, 2025

A guidance language for controlling large language models.

Jupyter Notebook 21,017 1,129 Updated Dec 17, 2025

A Bulletproof Way to Generate Structured JSON from Language Models

Jupyter Notebook 4,860 187 Updated Feb 24, 2024

LLM(😽)

Python 1,699 96 Updated Feb 3, 2025

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).

Python 9,128 879 Updated Dec 22, 2025

Training Sparse Autoencoders on Language Models

Python 1,124 207 Updated Dec 22, 2025

Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript

Go 606 20 Updated Jul 2, 2024

A JavaScript library like PyTorch, with GPU acceleration.

JavaScript 1,220 56 Updated Nov 15, 2024

train with kittens!

Python 63 4 Updated Oct 25, 2024

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

C++ 10,048 1,265 Updated Dec 18, 2025