View ljubomirj's full-sized avatar

41 results for forked starred repositories

A fork of OpenCode for local AI models.

TypeScript 4 Updated Jun 8, 2026

DeepSeek 4 Flash local inference engine for Metal

C 4 Updated Jun 12, 2026

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 28 1 Updated May 21, 2026

LLAMA Turboquant implementation with CUDA support

C++ 657 71 Updated Jun 4, 2026

DeepSeek 4 Flash local inference engine for Metal and CUDA with M5 optimizations.

C 18 Updated May 24, 2026

Local AI app and inference engine for agents. Run open-weight LLMs locally — private, 100% offline on your computer.

TypeScript 910 81 Updated Jun 15, 2026

llama.cpp fork with TurboQuant WHT-rotated KV cache & weight compression + Gemma 4 MTP and Qwen 3.6 NextN speculative decoding (+30-50% throughput).

C++ 263 37 Updated Jun 15, 2026

LLM inference in C/C++

C++ 192 42 Updated Jun 12, 2026

llama.cpp fork with TQ3_1S/4S CUDA kernels — 3.5-bit WHT quantization achieving Q4s quality at 10% smaller size. Based on RaBitQ-inspired Walsh-Hadamard transform. Enables 27B models on 16GB GPUs w…

C++ 191 11 Updated Jun 14, 2026

Provider-agnostic, open-source evaluation infrastructure for language models

Python 12 Updated Mar 15, 2026

AI agents running research on single-GPU nanochat training automatically

Python 490 29 Updated Mar 13, 2026

AI agents running research on single-GPU nanochat training automatically, adapted for macOS

Python 2,237 330 Updated Mar 17, 2026

LLM training on Apple's Neural Engine — native Obj-C, private APIs, zero GPU. Dynamic weight pipeline for training without kernel recompilation.

Python 55 5 Updated Mar 17, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 1 Updated Feb 1, 2026

Training neural network potentials

Python 1 Updated Jan 11, 2026

infinite coding agent

Rust 86 5 Updated Jun 11, 2026

Every Code - push frontier AI to its limits. A fork of the Codex CLI with validation, automation, browser integration, multi-agents, theming, and much more. Orchestrate agents from OpenAI, Claude, G…

Rust 3,798 233 Updated Jun 14, 2026

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 2 1 Updated Feb 9, 2026

HRM Agent repo

Python 10 1 Updated Oct 29, 2025

Tora: Torchtune-LoRA for RL

Python 87 7 Updated Dec 2, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 1 Updated Oct 7, 2025

REverse-Engineered Reasoning for Open-Ended Generation

Python 97 7 Updated Sep 10, 2025

A curated list of awesome platforms, tools, practices and resources that helps run LLMs locally

1 Updated Aug 22, 2025

The official GitHub repo for "Diffusion Language Models are Super Data Learners".

1 Updated Aug 9, 2025

LLM text generation and fine-tuning with MLX

Python 5 1 Updated May 11, 2026

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 6 3 Updated May 16, 2025

The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Python 43 Updated Dec 29, 2025