
Highlights

  • Pro

Organizations

@OpenGVLab

Starred repositories


Text-based Sanguosha (三国杀) card game, implemented in 10,000+ lines of Java

Java 99 21 Updated Mar 29, 2023

[CVPR2020] A Dataset for SPAtial REasoning on Three-View Line Drawings

Python 57 9 Updated Jul 18, 2024

[ArXiv 2025] Co-Training Vision Language Models for Remote Sensing Multi-task Learning

Python 16 Updated Nov 30, 2025

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

Python 121 1 Updated Aug 12, 2025

A review for remote sensing vision language models

61 1 Updated Apr 12, 2025

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster]

Python 22 Updated Dec 10, 2025
Python 17 1 Updated Nov 13, 2025

ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding (书生·妙析, a multimodal large model for aesthetic understanding)

Python 104 3 Updated Oct 16, 2025
Python 20 1 Updated Dec 10, 2025

Processed / Cleaned Data for Paper Copilot

Python 786 36 Updated Dec 4, 2025

Echos is a headless, API-driven DAW engine. It’s the backend for building AI tools that automate the entire music production lifecycle.

Python 53 Updated Nov 10, 2025
3 Updated Nov 3, 2025

Native Multimodal Models are World Learners

Python 1,365 51 Updated Nov 28, 2025

Best Papers of Top Venues like CVPR, NeurIPS, ICLR, ICML, ICCV, ECCV, ...

253 12 Updated Dec 16, 2025

A visual interpretation tool for Deformable DETR

Python 12 1 Updated Dec 22, 2021
Python 30 1 Updated Aug 25, 2025

Official repository of InternSVG.

Python 85 1 Updated Oct 16, 2025

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

Python 604 51 Updated Oct 29, 2025

QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 467 44 Updated Nov 27, 2025
Python 41 3 Updated Oct 31, 2025

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

Python 36 Updated Dec 17, 2025

Full-stack AI booking platform with RAG retrieval, Multi-Agent collaboration & smart pricing engine

TypeScript 8 1 Updated Oct 14, 2025
Python 86 7 Updated Oct 10, 2025

Official Repository of paper MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Python 80 2 Updated Oct 19, 2025

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Python 30 Updated Nov 13, 2025

[AAAI 2026] Open-Source LLM-Based Data Analysis Agents

Python 56 4 Updated Nov 10, 2025

A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Language Models (LLMs).

85 7 Updated Dec 12, 2025

The SAIL-VL2 series model developed by the BytedanceDouyinContent Group

75 6 Updated Sep 18, 2025