RobinQrtz

Rich is a Python library for rich text and beautiful formatting in the terminal.

Python 56,661 2,194 Updated Jun 15, 2026

Style guides for Google-originated open-source projects

HTML 39,390 12,966 Updated Jun 3, 2026

A framework for graph-based dependency parsing.

Python 19 5 Updated Feb 9, 2022

Speech recognition with word-level timestamps, optimized for batch inference.

Python 31 1 Updated Jun 16, 2026

Supplementary code for the NORA large language models

Python 9 Updated Feb 3, 2025

The Enhanced Edition versions of Baldur's Gate, Baldur's Gate II, Planescape: Torment and Icewind Dale come with missing dependencies on Linux. Here are the missing files and instructions.

Shell 10 5 Updated Aug 1, 2025

Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

Python 8,121 307 Updated Jun 4, 2026

A library for open-source data processing tools to create language model training datasets

5 Updated Jan 9, 2025

Bringing BERT into modernity via both architecture changes and scaling

Python 1,694 146 Updated Mar 1, 2026

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 11,648 1,725 Updated Apr 20, 2026

Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.

Python 12 3 Updated Jun 19, 2024

Efficient Triton Kernels for LLM Training

Python 6,445 543 Updated Jun 17, 2026

AllenAI's post-training codebase

Python 3,758 548 Updated Jun 20, 2026

Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models

Python 267 32 Updated Apr 23, 2024

A PyTorch native platform for training generative AI models

Python 5,452 868 Updated Jun 22, 2026

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,950 373 Updated Jun 3, 2026

Rust 39 9 Updated Apr 17, 2024

Library supporting NLP and CV research on scientific papers

Python 797 64 Updated Nov 8, 2024

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,700 137 Updated Apr 4, 2026

Benchmark for Scandinavian Tokenizers

Python 8 Updated Apr 23, 2024

Python 1,571 229 Updated Mar 25, 2026

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 2 1 Updated May 22, 2024

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

Python 26 5 Updated May 3, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 3,098 273 Updated May 26, 2026

Scaling Data-Constrained Language Models

Jupyter Notebook 342 18 Updated Jun 28, 2025

HPLT Analytics

JavaScript 15 4 Updated Feb 23, 2026

A fast implementation of T5/UL2 in PyTorch using Flash Attention

Python 116 9 Updated Oct 30, 2025

LTG-Bert

Python 34 4 Updated Jan 8, 2024

Minimalistic large language model 3D-parallelism training

Python 2,720 318 Updated May 26, 2026

Machine Learning Engineering Open Book

Python 18,156 1,152 Updated May 18, 2026