- Bangalore, India
- (UTC +05:30)
- cyber-machine.github.io
- @maazkarim_
- in/maaz-karim
Lists (9)
Computer Vision
Image and Video based models
Cool
CUDA
Graph Neural Networks
Some awesome things to explore here when learning about or exploring GNNs.
📚 LLM Repos
Collection of useful and unique repos for LLMs and their applications.
MCP Servers
ML Systems
Machine Learning Systems designs, resources and orchestration tools
System Design
Starred repositories
👨‍🎨 The ergonomic way to storyboard. Turns sketches and annotations into videos by drawing on a canvas.
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
Implement a reasoning LLM in PyTorch from scratch, step by step
A multi-agent LLM system for detecting and resolving cognitive dissonance.
My learning notes for ML SYS.
An MCP Multimodal AI Agent with eyes and ears!
The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."
Learn Low Level Design (LLD) and prepare for interviews using free resources.
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
Implementing DeepSeek R1's GRPO algorithm from scratch
Open source and self-hostable browser automation library for AI agents
[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning
A TTS model capable of generating ultra-realistic dialogue in one pass.
depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
Distributed Compiler based on Triton for Parallel Systems
🚀 The fast, Pythonic way to build MCP servers and clients
AWS MCP Servers — helping you get the most out of AWS, wherever you use MCP.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A high-throughput and memory-efficient inference and serving engine for LLMs
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
YT Navigator: AI-powered YouTube content explorer that lets you search and chat with channel videos using AI agents. Extract insights from hours of content in seconds with semantic search and preci…
Fully local web research and report writing assistant
[EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation