Zhaokai Wang wzk1015

PhD candidate @ SJTU & Shanghai AI Lab; B.Eng @ BUAA

136 followers · 90 following

Shanghai Jiao Tong University
Shanghai
www.wzk.plus
https://scholar.google.com/citations?user=W0zVf-oAAAAJ

Starred repositories

wzk1015 / sanguosha

Text-based Sanguosha (the Three Kingdoms card game), implemented in 10,000+ lines of Java

Java · 99 stars · 21 forks · Updated Mar 29, 2023

meituan-longcat / LongCat-Image

Python · 468 stars · 34 forks · Updated Dec 16, 2025

appletea233 / EditThinker

68 stars · 2 forks · Updated Dec 8, 2025

ai4ce / SPARE3D

[CVPR2020] A Dataset for SPAtial REasoning on Three-View Line Drawings

Python · 57 stars · 9 forks · Updated Jul 18, 2024

VisionXLab / RSCoVLM

[ArXiv 2025] Co-Training Vision Language Models for Remote Sensing Multi-task Learning

Python · 16 stars · Updated Nov 30, 2025

better-chao / perceptual_abilities_evaluation

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

Python · 121 stars · 1 fork · Updated Aug 12, 2025

IRIP-BUAA / A-Survey-on-Remote-Sensing-Foundation-Models-From-Vision-to-Multimodality

A review of remote sensing vision-language models

61 stars · 1 fork · Updated Apr 12, 2025

MMMGBench / MMMG

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster]

Python · 22 stars · Updated Dec 10, 2025

qychen2001 / iclr-score

Python · 17 stars · 1 fork · Updated Nov 13, 2025

thunderbolt215 / ArtiMuse

ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding (Intern · ArtiMuse, a large multimodal model for aesthetics understanding)

Python · 104 stars · 3 forks · Updated Oct 16, 2025

Vchitect / Uni-MMMU

Python · 20 stars · 1 fork · Updated Dec 10, 2025

papercopilot / paperlists

Processed / Cleaned Data for Paper Copilot

Python · 786 stars · 36 forks · Updated Dec 4, 2025

Linzwcs / echos

Echos is a headless, API-driven DAW engine. It’s the backend for building AI tools that automate the entire music production lifecycle.

Python · 53 stars · Updated Nov 10, 2025

YihongT / Sparkle

3 stars · Updated Nov 3, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python · 1,365 stars · 51 forks · Updated Nov 28, 2025

SarahRastegar / Best-Papers-Top-Venues

Best Papers of Top Venues like CVPR, NeurIPS, ICLR, ICML, ICCV, ECCV, ...

253 stars · 12 forks · Updated Dec 16, 2025

xirui-li / Interpretation_DETR

A visual interpretation tool for Deformable DETR

Python · 12 stars · 1 fork · Updated Dec 22, 2021

pointarena / pointarena

Python · 30 stars · 1 fork · Updated Aug 25, 2025

hmwang2002 / InternSVG

Official repository of InternSVG.

Python · 85 stars · 1 fork · Updated Oct 16, 2025

NVlabs / OmniVinci

OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.

Python · 604 stars · 51 forks · Updated Oct 29, 2025

NVlabs / QeRL

QeRL enables RL for 32B LLMs on a single H100 GPU.

Python · 467 stars · 44 forks · Updated Nov 27, 2025

OpenGVLab / MetaCaptioner

Python · 41 stars · 3 forks · Updated Oct 31, 2025

OpenGVLab / Vlaser

Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

Python · 36 stars · Updated Dec 17, 2025

Rainiver / airbnb-ai-booking-platform

Full-stack AI booking platform with RAG retrieval, Multi-Agent collaboration & smart pricing engine

TypeScript · 8 stars · 1 fork · Updated Oct 14, 2025

OpenGVLab / NaViL

Python · 86 stars · 7 forks · Updated Oct 10, 2025

PhoenixZ810 / MM-HELIX

Official repository of the paper MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Python · 80 stars · 2 forks · Updated Oct 19, 2025

zhuole1025 / Structured-Visuals

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Python · 30 stars · Updated Nov 13, 2025

zjunlp / DataMind

[AAAI 2026] Open-Source LLM-Based Data Analysis Agents

Python · 56 stars · 4 forks · Updated Nov 10, 2025

rdi-berkeley / awesome-RLVR-boundary

A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Language Models (LLMs).

85 stars · 7 forks · Updated Dec 12, 2025

BytedanceDouyinContent / SAIL-VL2

The SAIL-VL2 model series developed by the BytedanceDouyinContent Group

75 stars · 6 forks · Updated Sep 18, 2025

Starred topics

Awesome Lists