Starred repositories
Your AI agent army, commanded from Slack/Discord/WeChat/Lark. Stream Claude Code, OpenCode, or Codex in real time, from anywhere.
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code/Codex/Gemini agent will be an AI research agent with full horsepowe…
Official repository of Utonia: Toward One Encoder for All Point Clouds
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Blog Write multi agent AI is a custom multi-agent system designed to autonomously create high-quality, research-driven blogs. Using LangChain, Gemini 2.0-Flash-EXP, and Serper Web Search Tool, it a…
A paper list of recent work on token compression for ViT and VLMs
RM-R1: Unleashing the Reasoning Potential of Reward Models
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
The official GitHub page for the survey paper "Self-Supervised learning for Videos: A survey"
Unleashing Reasoning in Medical Large Language Models
PyTorch code and models for V-JEPA self-supervised learning from video.
A simple PyTorch implementation of influence functions.
Official Code for NeurIPS 2022 Paper: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
[NeurIPS 2023] Code for "DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models"
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
A comprehensive list of awesome contrastive self-supervised learning papers.
MixTeX: multimodal LaTeX, ZhEn, and table OCR. It performs efficient CPU-based inference locally and offline on Windows.
VFS Appointment Bot - This script automates checking for appointments at VFS Global offices in a specified country.