xaskasdf

Pinned

  1. ntransformer (Public)

    High-efficiency LLM inference engine in C++/CUDA. Run Llama 70B on RTX 3090.

    C++, 462 stars, 20 forks

  2. gpu-nvme-direct (Public)

    GPU-initiated NVMe I/O via PCIe BAR MMIO — CUDA kernels directly issue NVMe commands, eliminating CPU from the storage data path

    Cuda, 22 stars, 1 fork

  3. brandon-tiny (Public)

    Ultra-small instruction-following language models (10M-110M params) that run on a PlayStation 2. BLiMP 73.3%, HellaSwag 32.4% at just 10.7M parameters.

    Python, 2 stars

  4. ps2-llm (Public)

    Running a large language model on a PlayStation 2

    C, 42 stars, 1 fork

  5. osito-k (Public)

    Bare-metal preemptive kernel for ESP8266 — 3D wireframe Elite demo, zForth interpreter, OsitoFS, all in 24KB of IRAM

    C++, 2 stars

  6. ter (Public)

    Ternary computing transformer inference

    Cuda