Highlights

  • Pro

Organizations

@ZJUVAI

CLI for common Playwright actions. Record and generate Playwright code, inspect selectors and take screenshots.

TypeScript 3,700 135 Updated Feb 14, 2026
Python 40 3 Updated Aug 31, 2025

Automatic solver for plane geometry problems.

Jupyter Notebook 25 6 Updated Feb 5, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,234 1,404 Updated Feb 7, 2026

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,332 41 Updated Feb 3, 2026

[AAAI 2026] Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

HTML 45 3 Updated Jan 25, 2026

R1-onevision, a visual language model capable of deep CoT reasoning.

Python 575 15 Updated Apr 13, 2025
TypeScript 9 3 Updated Jun 20, 2024

Leveraging Multimodal Prompt for Visualization Authoring with LLMs

TypeScript 7 1 Updated Jan 29, 2026

Vega-Lite Chart Dataset and NL Generation Framework using LLMs

Python 135 17 Updated May 30, 2024

Code for BLT research paper

Python 2,028 190 Updated Nov 3, 2025

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 22,021 2,693 Updated Jan 23, 2026

PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Python 27 5 Updated Oct 10, 2024

Awesome-Paper-list: Visualization meets LLM

69 4 Updated Jan 21, 2026

A benchmark designed to evaluate visualization generation methods.

Python 57 12 Updated Nov 4, 2025

Graph Diffusion Policy Optimization

Python 42 5 Updated Mar 17, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 67,307 8,184 Updated Feb 12, 2026

General technology for enabling AI capabilities w/ LLMs and MLLMs

Python 4,283 367 Updated Dec 22, 2025
TypeScript 25 1 Updated Apr 19, 2024

[CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Python 279 16 Updated Apr 17, 2024

Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"

49 2 Updated Oct 21, 2023

Build AI Agents, Visually

TypeScript 49,147 23,719 Updated Feb 16, 2026
TypeScript 18 5 Updated Jul 19, 2023

Here is the official implementation of the model KD3A in paper "KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation".

Python 119 14 Updated Aug 30, 2022

OI / ACM-ICPC essays and learning materials

Rich Text Format 1,626 389 Updated Jun 1, 2025