
[NeurIPS 2025] Official PyTorch implementation of the paper "BADiff: Bandwidth Adaptive Diffusion Model"

7 1 Updated Oct 24, 2025

A powerful toolkit for compressing large models including LLM, VLM, and video generation models.

Python 607 62 Updated Nov 4, 2025

**Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.

Python 301 5 Updated Nov 3, 2025

Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

Python 647 23 Updated Sep 24, 2025

This repository contains low-bit quantization papers from top conferences, 2020 to 2025.

65 2 Updated Sep 24, 2025

[NeurIPS 2025 Spotlight] VisualQuality-R1 is the first open-source NR-IQA model that can accurately describe and rate image quality.

Python 120 4 Updated Oct 15, 2025

Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents

Python 197 12 Updated May 5, 2025

Q-Insight is open-sourced at https://github.com/bytedance/Q-Insight. This repository will not receive further updates.

142 3 Updated May 30, 2025

Beyond Accuracy: What Matters in Designing Well-Behaved Models?

Python 12 1 Updated Oct 9, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,601 2,235 Updated Feb 1, 2025
6 Updated Dec 31, 2024

[Paper List'25] Paper List of Visual Data Coding for Machines, including Image/Video Coding for Machines, Feature Compression, Point Cloud Compression for Machines and Image/Video Coding for Machin…

29 Updated Aug 17, 2025

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 689 65 Updated Aug 14, 2025

Official codes for "Q-Ground: Image Quality Grounding with Large Multi-modality Models", ACM MM2024 (Oral)

42 Updated Oct 25, 2024

Official repo for "LMM-PCQA: Assisting Point Cloud Quality Assessment with LMM", ACM MM2024 (Oral)

Python 18 1 Updated Nov 21, 2024

[NeurIPS'24] Compare2Score

Python 4 Updated Nov 7, 2024

[NeurIPS'24 Spotlight] Training in Pairs + Inference on Single Image with Anchors

Python 45 3 Updated Feb 20, 2025

🔥Official PyTorch implementation for "LM4LV: A Frozen Large Language Model for Low-level Vision Tasks".

Python 53 2 Updated Jun 12, 2024

A curated list of recent diffusion models for video generation, editing, and various other applications.

5,169 318 Updated Oct 15, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

677 25 Updated Aug 22, 2025

[ICLR 2025] What do we expect from LMMs as AIGI evaluators and how do they perform?

139 4 Updated Feb 3, 2025

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

Python 196 5 Updated Sep 26, 2024

MambaOut: Do We Really Need Mamba for Vision? (CVPR 2025)

Python 2,570 48 Updated Mar 9, 2025

Paper list on multimodal and large language models; a personal record of papers I read from the daily arXiv feed.

743 42 Updated Nov 5, 2025

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

374 20 Updated Sep 22, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,255 419 Updated Nov 3, 2025

A collection of papers and code on using MLLMs for quality assessment tasks.

12 Updated Apr 18, 2024