
A version of CloverLeaf using NVIDIA's CUDA

Fortran 6 6 Updated Feb 15, 2023

Agentic framework for computational chemistry and materials science workflows

Python 96 34 Updated Apr 17, 2026

Paper list of agents for science

234 16 Updated Mar 12, 2026

Runtime provenance for AI and scientific workflows—capture, enrich, and query workflow data via observability adapters and code annotation across edge, cloud, and HPC.

Python 22 7 Updated Apr 3, 2026

KV-Direct: Bounded-Memory Transformer Inference via Residual Stream Checkpointing

3 1 Updated Mar 18, 2026

Run OpenClaw more securely inside NVIDIA OpenShell with managed inference

TypeScript 19,358 2,397 Updated Apr 17, 2026

A high-performance and light-weight router for vLLM large scale deployment

Rust 196 71 Updated Apr 16, 2026
Python 4 1 Updated Mar 6, 2025

Collection of benchmarks to measure basic GPU capabilities

C++ 513 82 Updated Oct 24, 2025

Harnessing distributed, tiered storage for context management

C++ 14 8 Updated Apr 17, 2026

The academic meta-prompting framework for AI agents like Claude Code, Gemini CLI, OpenCode. Features citation-aware drafting, hallucination checks, and rigorous structural planning. Built for PhDs …

JavaScript 13 2 Updated Feb 9, 2026
Python 145 29 Updated Jun 24, 2024

LLM KV cache compression made easy

Python 1,042 134 Updated Apr 14, 2026

Tile-Based Runtime for Ultra-Low-Latency LLM Inference

Python 708 42 Updated Mar 8, 2026

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

Cuda 106 6 Updated Jun 28, 2025

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 2,023 131 Updated Apr 17, 2026

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 424 58 Updated Mar 28, 2026

Official Implementation of APB (ACL 2025 main Oral) and Spava (ACL 2026 main).

C++ 37 5 Updated Apr 6, 2026

Open Machine Learning Compiler Framework

Python 13,277 3,857 Updated Apr 17, 2026

[ICLR'26] The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models"

Python 377 46 Updated Mar 13, 2026

Code, Data and Model for COLM 2025 Paper "E2-RAG: Towards Editable Efficient RAG by Editing Compressed KV Caches"

Python 6 Updated Sep 11, 2025

gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling

Python 56 4 Updated Apr 15, 2026

Demystify AI agents by building them yourself. Local LLMs, no black boxes, real understanding of function calling, memory, and ReAct patterns.

JavaScript 3,402 515 Updated Jan 21, 2026

agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.

Python 2,201 377 Updated Apr 1, 2026

A programming framework for agentic AI

Python 57,170 8,606 Updated Apr 15, 2026

Autonomous Agents (LLMs) research papers. Updated Daily.

1,224 95 Updated Apr 16, 2026
Python 158 24 Updated Oct 9, 2024

Lightweight coding agent that runs in your terminal

Rust 75,920 10,770 Updated Apr 17, 2026