- DaoCloud
- Shanghai
- https://carlory.github.io/
Starred repositories: 6 stars, written in C++
FlashMLA: Efficient Multi-head Latent Attention Kernels
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG
A CUDA API interception library that simulates GPU devices even in non-GPU environments.
A C++ template for decoupling the invocation of CUDA kernels from the nvcc compiler driver.