Stars
Simple and reliable way to track your USB devices directly in the macOS menu bar
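🥢 Cook like Laoxiangji (老乡鸡) 🐔. Dishes newly introduced in the "Laoxiangji Dish Traceability Report 2.0", released in 2026, have been added. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" and has been summarized, edited, and organized. CookLikeHOC.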
Task and time-tracking management with calendar integration for Obsidian
Kimi K2 is the large language model series developed by the Moonshot AI team
Official PyTorch implementation for "Large Language Diffusion Models"
Download market data from Yahoo! Finance's API
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images int…
OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications, integrates a unified training and evaluation benchmark, commercial-grade OCR and Document Parsing systems, and faithful re…
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tuning Optimizations
🔥Awesome Multimodal Large Language Models Paper List
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
[ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
A fork to add multimodal model training to open-r1
SGLang is a high-performance serving framework for large language models and multimodal models.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Fully open reproduction of DeepSeek-R1
[Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
The simplest, fastest repository for training/finetuning medium-sized GPTs.
aider is AI pair programming in your terminal
This repository contains demos I made with the Transformers library by HuggingFace.
A simple Mac app that simulates mouse clicks
Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation
A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
The Best Autofill Since Sliced Bread.