View loretoparisi's full-sized avatar
🐍
Pythoning

Organizations

@Musixmatchdev @musixmatchresearch


A vector index built on TurboQuant, written in Rust with Python bindings

Rust 1,009 89 Updated May 16, 2026

SU-01: Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Python 70 4 Updated May 17, 2026

DeepSeek 4 Flash local inference engine for Metal and CUDA

C 10,268 838 Updated May 17, 2026

TokenSpeed is a speed-of-light LLM inference engine.

Python 1,035 94 Updated May 17, 2026

Utilities to obtain, generate, and post-process TV listings data in XMLTV format

Perl 407 109 Updated May 10, 2026

An open source template for building cloud agents.

TypeScript 5,462 692 Updated May 15, 2026

A hardware-aware, efficient implementation of "Mixture-of-Depths Attention".

Python 261 9 Updated May 6, 2026
Lean 23 Updated Apr 16, 2026

Open-source healthcare AI

Python 1,211 147 Updated May 17, 2026

DFlash: Block Diffusion for Flash Speculative Decoding

Python 4,615 330 Updated May 10, 2026

mac code — Claude Code, but it runs on your Mac for free. 35B AI agent at 30 tok/s via Apple Silicon flash-paging. $0/month.

Python 983 108 Updated Apr 9, 2026
Python 575 57 Updated May 13, 2026

Hundreds of models & providers. One command to find what runs on your hardware.

Rust 26,311 1,576 Updated May 17, 2026

Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

Python 1,462 120 Updated May 13, 2026
Python 189 6 Updated Jan 26, 2026

🖥 Neural Computers' Data Engine

Python 194 26 Updated Apr 13, 2026

Flash-MoE sidecar slot-bank runtime for large GGUF MoE models on Apple Silicon — llama.cpp fork

C++ 100 11 Updated May 16, 2026
Python 176 9 Updated May 5, 2026

Lucebox: LLM inference server built for speed for specific consumer hardware.

C++ 2,130 200 Updated May 17, 2026

Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders.

Python 1,439 103 Updated May 12, 2026

Paper repo for “Coherence-Guided Dead-Head Identification in Frozen Transformers,” including manuscript sources, figures, frozen result artifacts, and verification scripts.

Python 48 3 Updated Apr 8, 2026

TriAttention — Efficient long reasoning with trigonometric KV cache compression. Enables OpenClaw local deployment on memory-constrained GPUs.

Python 741 66 Updated Apr 23, 2026
Python 762 60 Updated Apr 16, 2026

The repo is finally unlocked. enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.

Rust 191,767 109,913 Updated May 16, 2026

DGX Spark setup and vLLM deployment scripts for Qwen, GPT-OSS, and Nemotron 3.

Shell 9 2 Updated May 14, 2026

Universal Ethernet Direct Connect (UniEDC)

Shell 32 4 Updated Apr 1, 2026

An independent Python feature port akin to Claude Code, rewritten entirely from scratch using oh-my-codex. For educational purposes only.

Rust 26 2 Updated Apr 1, 2026

vLLM TurboQuant

Python 588 101 Updated Apr 16, 2026
C++ 310 31 Updated May 15, 2026