Celonis
- New York
Stars
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Pre-indexed code knowledge graph, auto syncs on code changes, for Claude Code, Codex, Gemini, Cursor, OpenCode, AntiGravity, Kiro, and Hermes Agent — fewer tokens, fewer tool calls, 100% local
DeepSeek 4 Flash and PRO local inference engine for Metal, CUDA and ROCm
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
TokenSpeed is a speed-of-light LLM inference engine.
Ongoing research training transformer models at scale
Robin: A multi-agent system for automating scientific discovery
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.
Curated list of AutoResearch use cases with optimization traces and open source implementations
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents and a message gateway, it handles different levels of…
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
The design language that makes your AI harness better at design.
OpenViking is an open-source context database designed specifically for AI agents (such as openclaw). OpenViking unifies the management of context (memory, resources, and skills) that Agents need th…
A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes…
An agentic skills framework & software development methodology that works.
An open-source, AI-integrated, cross-platform terminal for seamless workflows
Fast and memory-efficient exact attention
SGLang is a high-performance serving framework for large language models and multimodal models.
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…