😌
Focusing

Organizations

@acug @sofastack

Starred repositories

Context7 Platform -- Up-to-date code documentation for LLMs and AI code editors

TypeScript 57,653 2,713 Updated Jun 18, 2026

The API to search, scrape, and interact with the web at scale. 🔥

TypeScript 134,730 7,855 Updated Jun 18, 2026

Markdown Architectural Decision Records

Markdown 2,273 462 Updated Jun 12, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,404 79,423 Updated Jun 18, 2026

A simple, performant and scalable Jax LLM!

Python 2,329 539 Updated Jun 18, 2026

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

Rust 49,776 5,273 Updated Jun 18, 2026

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one

Rust 93,330 4,714 Updated Jun 18, 2026

A high-performance inference engine for LLM, VLM, DiT and REC models, optimized for diverse AI accelerators.

C++ 1,348 232 Updated Jun 18, 2026

Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…

Rust 6,690 724 Updated Jun 18, 2026

Public repository for Agent Skills

Python 152,524 17,973 Updated Jun 9, 2026

The best ChatGPT that $100 can buy.

Python 55,209 7,585 Updated May 5, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 4,440 715 Updated Jun 18, 2026

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Python 133,225 21,538 Updated Jun 18, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 897 260 Updated Jun 18, 2026

Data driven agentic landscapes and insights. Produced by Ant Open Source and inclusionAI.

TypeScript 498 35 Updated Jun 17, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,655 971 Updated Jun 17, 2026

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,563 316 Updated Jul 17, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 35,847 3,641 Updated Jun 18, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,032 4,095 Updated Jun 18, 2026

iTerm2 is a terminal emulator for Mac OS X that does amazing things.

Objective-C 17,704 1,411 Updated Jun 18, 2026

[HPCA 2026] AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.

Python 365 130 Updated Apr 22, 2026

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,319 523 Updated Jun 17, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 7,391 1,053 Updated Jun 4, 2026

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,919 1,916 Updated Jun 18, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,705 1,062 Updated Apr 30, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 17,308 1,318 Updated Jun 18, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,175 6,605 Updated Jun 18, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,281 8,846 Updated Jun 17, 2026

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 336 22 Updated Apr 24, 2025

Module, Model, and Tensor Serialization/Deserialization

Python 313 52 Updated Apr 30, 2026