- Zhejiang University
- https://person.zju.edu.cn/shengyuzhang
Stars
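An agent skill built on Hermes: it automatically fetches papers from arXiv every day, uses AI to generate Chinese summaries and author affiliations, pushes them to Feishu, and provides a local static reading website.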
[AAAI2026] AccKV: Towards Efficient Audio-Video LLMs Inference via Adaptive-Focusing and Cross-Calibration KV Cache Optimization
[CVPR'26] Official implementation for “Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs”
[AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic alignment bottlenecks in GUI agents through efficient, guided …
Official implementation for “HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Optimization”
Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners"
[AAAI2025] FedCFA: Alleviating Simpson’s Paradox in Model Aggregation with Counterfactual Federated Learning
[ICLR'24] AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation
Awesome GUI Agent Paper List
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Project for the paper "Instruction Tuning for Large Language Models: A Survey"
LLM deployment project based on MNN. This project has been merged into MNN.
Simple macOS menu bar application to view and interact with reminders. Developed with SwiftUI and using Apple Reminders as a source.
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
DeerSheep0314 / Re4-Learning-to-Re-contrast-Re-attend-Re-construct-for-Multi-interest-Recommendation
Official repo of Future-aware Diverse Trends Framework for Recommendation
The dataset for paper "Why Do We Click: Visual Impression-aware News Recommendation", ACM MM 2021
Audio Visual Instance Discrimination with Cross-Modal Agreement