Stars
Tensors and dynamic neural networks in Python with strong GPU acceleration
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A hex editor for WeChat/QQ/TIM - anti-recall patch for the PC versions of WeChat/QQ/TIM (I have already seen it, so recalling it is pointless)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.
verl: Volcano Engine Reinforcement Learning for LLMs
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
COCO API - Dataset @ http://cocodataset.org/
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.
A Survey of Reinforcement Learning for Large Reasoning Models
This repository is a reference for researchers in multimodality and a starting point for exploring RL-based reasoning MLLMs.
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
[TPAMI 2023] LibFewShot: A Comprehensive Library for Few-shot Learning.
Awesome Unified Multimodal Models
The implementation of the technical report: "Customized Segment Anything Model for Medical Image Segmentation"
[CVPR 2023] Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation
The development and future prospects of large multimodal reasoning models.
A curated collection of papers, datasets, and resources on Scientific Datasets and Large Language Models (LLMs)
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
[ICCV 2019] Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss
[NeurIPS'25] Pixel-Level Reasoning Model trained with RL
Implementation of MedSegDiff in PyTorch - SOTA medical segmentation using DDPM and filtering of features in Fourier space