Arvintian

Arvin Arvintian

60 followers · 42 following

Beijing, China
www.arvintian.cn

Lists (3)

Stars

hanxiao / knowledge-graph-extractor

Turn any document or a whole zip into an interactive knowledge graph, using a self-hosted Qwen3.6-35B-A3B-MTP on a single NVIDIA L4

Python 126 15 Updated Jun 12, 2026

sgl-project / sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 28,961 6,512 Updated Jun 13, 2026

ZhuLinsen / daily_stock_analysis

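LLM-powered stock analysis system for A-share, H-share, and US markets: multi-source market data, real-time news, an LLM decision dashboard, and multi-channel push notifications. Runs on a schedule at zero cost, entirely on free tiers.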

Python 42,393 40,186 Updated Jun 13, 2026

lfbear / chatbox-lite

One HTML file. Chat with OpenAI, Claude, Gemini, DeepSeek, Ollama and any OpenAI-compatible endpoint — streaming, reasoning, vision, fully client-side.

HTML 1 Updated Jun 10, 2026

nfs-ganesha / nfs-ganesha

NFS-Ganesha is an NFSv3,v4,v4.1 fileserver that runs in user mode on most UNIX/Linux systems

C 1,764 574 Updated Jun 12, 2026

zgwl / chinese-buy-us-stock-guide

A guide to US stocks

4,281 661 Updated Jun 11, 2026

maximhq / bifrost

Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS.

Go 5,728 750 Updated Jun 13, 2026

AI-FanGe / AI_DesktopCat_Qwen3.5Omni

An AI hardware device that uses Qwen3.5 Omni as its model.

Python 226 55 Updated May 1, 2026

maillab / cloud-mail

A Cloudflare-based email service | Cloudflare Email

JavaScript 11,127 15,266 Updated Jun 9, 2026

NVIDIA / Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downs…

Python 2,920 436 Updated Jun 13, 2026

z-lab / paroquant

[ICLR 2026] ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference

Python 306 29 Updated Jun 8, 2026

go-lark / lark

An easy-to-use SDK for Feishu and Lark Open Platform (Instant Messaging API only)

Go 244 37 Updated May 14, 2026

CloakHQ / CloakBrowser

Stealth Chromium that passes every bot detection test. Drop-in Playwright replacement with source-level fingerprint patches. 30/30 tests passed.

Python 25,876 2,051 Updated Jun 9, 2026

vllm-project / llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 3,392 545 Updated Jun 12, 2026

NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

C++ 4,808 1,295 Updated Jun 13, 2026

ggml-org / ggml

Tensor library for machine learning

C++ 14,805 1,673 Updated Jun 12, 2026

eugr / llama-benchy

llama-benchy - llama-bench style benchmarking tool for all backends

Python 464 42 Updated Jun 10, 2026

AmesianX / TurboQuant

TurboQuant KV Cache Compression for llama.cpp — 5.2x memory reduction with near-lossless quality | Implementation of Google DeepMind's TurboQuant (ICLR 2026)

C++ 82 13 Updated Jun 13, 2026

localai-org / apex-quant

Adaptive Precision for EXpert Models: MoE-aware mixed-precision quantization

Shell 348 26 Updated May 29, 2026

mudler / LocalAI

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

Go 46,826 4,134 Updated Jun 13, 2026

nyarime / NRUP

🥤 NRUP - A reliable encrypted UDP transport protocol built on DTLS

Go 143 13 Updated Apr 21, 2026

MHSanaei / 3x-ui

Xray panel supporting multi-protocol multi-user expire day & traffic & IP limit (Vmess, Vless, Trojan, ShadowSocks, Wireguard, Hysteria, Tunnel, Mixed, HTTP, Tun)

TypeScript 40,538 7,601 Updated Jun 13, 2026

PrefectHQ / fastmcp

🚀 The fast, Pythonic way to build MCP servers and clients.

Python 25,616 2,070 Updated Jun 6, 2026

NanmiCoder / cc-haha

Leaked Claude Code source: a locally runnable version, with a new cross-platform desktop app that adds Computer Use (includes an analysis of the core modules)

TypeScript 12,575 8,235 Updated Jun 13, 2026

EfficientContext / ContextPilot

Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, OpenClaw, RAG, and Agentic AI.

Python 115 5 Updated Jun 13, 2026

HKUDS / OpenHarness

"OpenHarness: Open Agent Harness with a Built-in Personal Agent--Ohmo!"

Python 13,810 2,258 Updated Jun 4, 2026

NousResearch / hermes-agent

The agent that grows with you

Python 192,540 33,575 Updated Jun 13, 2026

google-ai-edge / LiteRT-LM

LiteRT-LM is Google's production-ready, high-performance, open-source inference framework for deploying Large Language Models on edge devices.

C++ 5,574 575 Updated Jun 13, 2026

agentscope-ai / QwenPaw

Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.

Python 17,519 2,601 Updated Jun 12, 2026

eleiton / ollama-intel-arc

Make use of Intel Arc Series GPU to Run Ollama, StableDiffusion, Whisper and Open WebUI, for image generation, speech recognition and interaction with Large Language Models (LLM).

Dockerfile 357 45 Updated Jun 13, 2026

Lists (3)

Game-emulator

LLM

STORAGE
