- San Francisco, CA
- dharmeshkakadia.com
- @dharmeshkakadia
Starred repositories
nanobind: tiny and efficient C++/Python bindings
FlagGems is an operator library for large language models implemented in the Triton Language.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
A kernel library written in tilelang
Desktop app to manage markdown knowledge bases
[CVPR 2026] Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.
🚀 Efficient implementations for emerging model architectures
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
Write HTML. Render video. Built for agents.
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
Create powerful Hydra applications without the yaml files and boilerplate code.
A workload for deploying LLM inference services on Kubernetes
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global cache management, inference simulation (HiSim), and more.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Distributed DuckDB - dual execution and differential storage
Native space switching on macOS with no animation
Native macOS micro-UI for scripts and agents — sub-50ms WKWebView windows with bidirectional JSON communication
I replicated Ng's RYS method and found that duplicating 3 specific layers in Qwen2.5-32B boosts reasoning by 17% and duplicating layers 12-14 in Devstral-24B improves logical deduction from 0.22→0.…
Sub-millisecond VM sandboxes for AI agents via copy-on-write forking
Index and overview of neuroimaging datasets for visual perception reconstruction.