  • BUPT -> SJTU -> NTU -> ECNU
  • China

Organizations

@Justice-Eternal

Build a personal blog site or data showcase using GitHub Issues and GitHub Pages. Adapts to multiple screen sizes.

Dart 1 1 Updated Oct 10, 2025

Official Repository of paper MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Python 73 1 Updated Oct 19, 2025

A GitHub Actions workflow for automatically counting open issues and their labels, and saving the statistics to a tag message for further request.

JavaScript 1 Updated Sep 29, 2025

je sheet music library, mobile client

Dart 9 Updated Oct 10, 2025

Uses Flutter or the Vue stack (Vue + VueRouter + Vuex + Axios) to fetch Issues from GitHub and, combined with GitHub Pages, build a personal blog site. Supports GitHub login and comments.

Dart 273 54 Updated Dec 10, 2024

Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration

Python 10 1 Updated Sep 15, 2025

ASANet: Asymmetric Semantic Aligning Network for RGB and SAR image land cover classification

Python 49 2 Updated Dec 5, 2024

Explore AMT from the perspective of timbre.

Jupyter Notebook 8 2 Updated Jun 26, 2025

Adapting VLMs to Bench2Drive.

Python 163 20 Updated Oct 12, 2025

[TGRS'25] AirSpatialBot: A Spatially-Aware Aerial Agent for Fine-Grained Vehicle Attribute Recognization and Retrieval

Python 21 Updated Aug 24, 2025

[IGARSS 2025 Oral] A Simple Aerial Detection Baseline of Multimodal Language Models.

Jupyter Notebook 85 6 Updated Jul 3, 2025

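[Numbered musical notation tools] je tools for processing numbered musical notation (jianpu), including transposition, playback, score creation, and MIDI extraction (conversion) and production.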

JavaScript 71 10 Updated Sep 20, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,671 366 Updated Oct 21, 2025

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

Python 84 6 Updated Jul 1, 2025

GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding

72 2 Updated May 10, 2025

[ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation

Python 149 3 Updated Sep 15, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,891 946 Updated Nov 5, 2025

Data from AutoHome (汽车之家) covering vehicle brands, series, and models.

3 1 Updated Sep 16, 2023

Monthly updated data from AutoHome, including comprehensive specifications and configurations for all vehicle models.

Python 90 25 Updated Aug 29, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,255 419 Updated Nov 3, 2025
Python 172 14 Updated May 6, 2024

Summary of Geoguesser Models / Agents

5 Updated Jun 27, 2024

TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. Published in Nature.

Python 3,063 253 Updated Jul 25, 2025

[NeurIPS 2024 Spotlight ⭐️ & TPAMI 2025] Parameter-Inverted Image Pyramid Networks (PIIP)

Python 105 5 Updated Aug 5, 2025

[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 400 6 Updated May 5, 2025

[TPAMI] Oriented object detection on STAR dataset.

Python 83 5 Updated Feb 3, 2025

A Survey on Vision-Language Geo-Foundation Models (VLGFMs)

175 7 Updated May 24, 2025

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,900 176 Updated May 26, 2025