| 2026.06.05 |
Yanfan Zhu (Agents) |
《LightMem: Lightweight and Efficient Memory-Augmented Generation》 (ICLR 2026) |
|
| 2026.06.05 |
Yanfan Zhu (Agents) |
《Continual Harness: Online Adaptation for Self-Improving Foundation Agents》 (arXiv) |
|
| 2026.06.05 |
Yanfan Zhu (Agents) |
《OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models》 (WWW 2026) |
|
| 2026.05.15 |
Marilyn Lionts (DNA Barcoding) |
《DNA barcoding increases the taxonomic resolution of shark diet analysis compared to morphological stomach contents identification》 (2026) |
|
| 2026.05.15 |
Marilyn Lionts (CT for Food Science) |
《Morphometric Characterization Workflows of Praline Chocolates using X-ray Computed Tomography》 (2026) |
|
| 2026.05.15 |
Marilyn Lionts (Spectroscopy for Food Science) |
《Analytical Chemistry Nutritional Insights: Exploring ED-XRF, LIBS, and Chemometric Techniques for Macronutrient Determination in Non-conventional Food Plants (PANC)》 (2026) |
|
| 2026.05.07 |
Zhengyi Lu (RL in LLM) |
《PretrainZero: Reinforcement Active Pretraining》 (ICML 2026) |
|
| 2026.05.07 |
Zhengyi Lu (RL in LLM) |
《Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning》 (ICML 2026) |
|
| 2026.05.07 |
Zhengyi Lu (RL in LLM) |
《Solving Physics Olympiad via Reinforcement Learning on Physics Simulators》 (ICML 2026) |
|
| 2026.04.29 |
Yuechen Yang (LLM) |
《Beyond Pixel Agreement: Large Language Models as Clinical Guardrails for Reliable Medical Image Segmentation》 (arXiv) |
|
| 2026.04.29 |
Yuechen Yang (VLM) |
《VLM-Guided Iterative Refinement for Surgical Image Segmentation with Foundation Models》 (arXiv) |
|
| 2026.04.29 |
Yuechen Yang (QC) |
《SegQC: a segmentation network-based framework for multi-metric segmentation quality control and segmentation error detection in volumetric medical images》 (arXiv) |
|
| 2026.04.16 |
Junchao Zhu (LLM) |
《When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems》 (ICLR 2026) |
|
| 2026.04.16 |
Junchao Zhu (Agents) |
《Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning》 (ICLR 2025) |
|
| 2026.04.16 |
Junchao Zhu (Agents) |
《scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery》 (NeurIPS 2025) |
|
| 2026.04.10 |
Junlin Guo (Deep Fake) |
《The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts》 (CVPR 2026) |
|
| 2026.04.10 |
Junlin Guo (VLM Foundation Models) |
《Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model》 (CVPR 2026) |
|
| 2026.04.10 |
Junlin Guo (VLM Hallucinations) |
《Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding》 (CVPR 2026) |
|
| 2026.04.03 |
Yanfan Zhu (Selective Classifier) |
《What Does It Take to Build a Performant Selective Classifier?》 (NeurIPS) |
|
| 2026.04.03 |
Yanfan Zhu (LLM Abstention) |
《MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions》 (AAAI 2026) |
|
| 2026.04.03 |
Yanfan Zhu (OOD Detection) |
《Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection》 (CVPR 2025) |
|
| 2026.03.27 |
Zhengyi Lu (Generation) |
《Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models》 (ICLR 2026) |
|
| 2026.03.27 |
Zhengyi Lu (Efficient Reasoning) |
《Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models》 (NeurIPS 2025) |
|
| 2026.03.27 |
Zhengyi Lu (Efficient Reasoning) |
《ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning》 (NeurIPS 2025) |
|
| 2026.03.20 |
Marilyn Lionts (Raman in Food Science) |
《Honey Differentiation Using Infrared and Raman Spectroscopy Analysis and the Employment of Machine-Learning-Based Authentication Models》 (2026) |
|
| 2026.03.20 |
Marilyn Lionts (Raman in Food Science) |
《Machine learning-assisted Raman spectroscopy for non-destructive analysis of crude palm oil quality》 (2026) |
|
| 2026.03.20 |
Marilyn Lionts (Raman in Food Science) |
《Raman on the palm: handheld Raman spectroscopy for enhanced traceability of palm oil》 (2025) |
|
| 2026.02.13 |
Junchao Zhu (LLM) |
《NAIPv2: Debiased Pairwise Learning for Efficient Paper Quality Estimation》 (ICLR 2026) |
|
| 2026.02.13 |
Junchao Zhu (Agents) |
《Open-World Reinforcement Learning over Long Short-Term Imagination》 (ICLR 2025) |
|
| 2026.02.13 |
Junchao Zhu (LLM) |
《LLM DNA: Tracing Model Evolution via Functional Representations》 (ICLR 2026) |
|
| 2026.02.06 |
Junlin Guo (Agents) |
《PaperBanana: Automating Academic Illustration for AI Scientists》 (Arxiv 2026) |
|
| 2026.02.06 |
Junlin Guo (CoT & VLM) |
《PathReasoner-R1: Instilling Structured Reasoning into Pathology Vision-Language Model via Knowledge-Guided Policy Optimization》 (Arxiv 2026) |
|
| 2026.02.06 |
Junlin Guo (Pretraining & Localization) |
《A multimodal vision–language model for generalizable annotation-free pathology localization》 (Nature Biomedical Engineering 2026) |
|
| 2026.01.16 |
Yanfan Zhu (Segmentation) |
《SAM 3: Segment Anything with Concepts》 (ICLR 2026) |
|
| 2026.01.16 |
Yanfan Zhu (Segmentation) |
《SAM-Veteran: An MLLM-Based Human-like SAM Agent for Reasoning Segmentation》 (ICLR 2026) |
|
| 2026.01.16 |
Yanfan Zhu (Segmentation) |
《LSP-DETR: Efficient and Scalable Nuclei Segmentation in Whole Slide Images》 (Arxiv) |
|
| 2026.01.08 |
Zhengyi Lu (Agent Evaluation) |
《Agent-as-a-Judge: Evaluate Agents with Agents》 (ICML 2025) |
|
| 2026.01.08 |
Zhengyi Lu (VLM Reasoning) |
《More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models》 (ArXiv) |
|
| 2026.01.08 |
Zhengyi Lu (MLLM Reasoning) |
《OneThinker: All-in-one Reasoning Model for Image and Video》 (ArXiv) |
|
| 2025.12.05 |
Yuechen Yang (Diffusion Model) |
《Back to Basics: Let Denoising Generative Models Denoise》 |
|
| 2025.12.05 |
Yuechen Yang (Vision Model) |
《ARC Is a Vision Problem》 |
|
| 2025.12.05 |
Yuechen Yang (LLMs) |
《Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)》 |
|
| 2025.11.14 |
Junchao Zhu (Large-language Model) |
《Glyph: Scaling Context Windows via Visual-Text Compression》 |
|
| 2025.11.14 |
Junchao Zhu (Vision-language Model) |
《VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection》 (CVPR 2025) |
|
| 2025.11.14 |
Junchao Zhu (Vision-language Model) |
《DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction》(ICCV 2025) |
|
| 2025.11.09 |
Yanfan Zhu (3D Reconstruction) |
《Sparse3Diff: A Diffusion Framework for 3D Reconstruction from Sparse 2D Slices in Volumetric Optical Imaging》 (MICCAI 2025) |
|
| 2025.11.09 |
Yanfan Zhu (3D Reconstruction) |
《Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild》 (CVPR2025) |
|
| 2025.11.09 |
Yanfan Zhu (3D Reconstruction) |
《Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image》 (Arxiv) |
|
| 2025.10.30 |
Junlin Guo (VLM & Reinforcement Learning) |
《Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models》 (IEEE TMI 2025) |
|
| 2025.10.30 |
Junlin Guo (LLM Agent & Reinforcement Learning) |
《Agent Learning via Early Experience》 (Arxiv) |
|
| 2025.10.30 |
Junlin Guo (3D point cloud & 2D-3D Foundation Models) |
《Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations》 (NeurIPS2025) |
|
| 2025.10.24 |
Zhengyi Lu (GUI Agent) |
《Less is More: Empowering GUI Agent with Context-Aware Simplification》 (ICCV 2025) |
|
| 2025.10.24 |
Zhengyi Lu (MRI Generation) |
《MRGen: Segmentation Data Engine for Underrepresented MRI Modalities》 (ICCV 2025) |
|
| 2025.10.24 |
Zhengyi Lu (MRI Segmentation) |
《Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior》 (ArXiv) |
|
| 2025.10.10 |
Marilyn Lionts (Vector Embeddings) |
《On the Theoretical Limitations of Embedding-Based Retrieval》 (Arxiv) |
|
| 2025.10.10 |
Marilyn Lionts (Environmental Impact) |
《Measuring the environmental impact of delivering AI at Google Scale》 (Arxiv) |
|
| 2025.10.10 |
Marilyn Lionts (Real-time Models) |
《Live Music Models》 (Arxiv) |
|
| 2025.10.03 |
Tianyuan Yao (Vision Transformer) |
《TransNeXt: Robust Foveal Visual Perception for Vision Transformers》 (CVPR 2024) |
|
| 2025.10.03 |
Tianyuan Yao (Transformer) |
《Agent Attention: On the Integration of Softmax and Linear Attention》 (ECCV 2024) |
|
| 2025.10.03 |
Tianyuan Yao (Transformer) |
《Permutation Equivariance of Transformers and Its Applications》 (CVPR 2024) |
|
| 2025.09.19 |
Yanfan Zhu (3D Object Reconstruction) |
《Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model》 (ICLR2025) |
|
| 2025.09.19 |
Yanfan Zhu (Generative Vision) |
《4K4DGen: Panoramic 4D Generation at 4K Resolution》 (ICLR2025) |
|
| 2025.09.19 |
Yanfan Zhu (Monocular Depth Estimation) |
《Depth Pro: Sharp Monocular Metric Depth in Less Than a Second》 (ICLR2025) |
|
| 2025.09.12 |
Junchao Zhu (Vision Language Model) |
《Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models》 (ICLR2025) |
|
| 2025.09.12 |
Junchao Zhu (Vision Language Model) |
《Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation》 (ICLR2025) |
|
| 2025.09.12 |
Junchao Zhu (Vision Language Model) |
《MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models》 (ICLR2025) |
|
| 2025.08.29 |
Chongyu Qu (Efficient Generative Model) |
《Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models》 (ICLR2025) |
|
| 2025.08.29 |
Chongyu Qu (Efficient Generative Model) |
《DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space》 (ICCV 2025) |
|
| 2025.08.29 |
Chongyu Qu (Efficient Generative Model) |
《DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer》 (ICCV 2025) |
|
| 2025.08.15 |
Junlin Guo (Vision Foundation Model) |
《DINOv3》 (ArXiv) |
|
| 2025.08.15 |
Junlin Guo (Vision Foundation Model) |
《Galileo: Learning Global & Local Features of Many Remote Sensing Modalities》 (ICLR2025) |
|
| 2025.08.15 |
Junlin Guo (Vision Foundation Model) |
《AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities》 (CVPR2025) |
|
| 2025.08.08 |
Zhengyi Lu (Image Refinement) |
《IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation》 (ArXiv) |
|
| 2025.08.08 |
Zhengyi Lu (Image Refinement) |
《Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment》 (ArXiv) |
|
| 2025.08.08 |
Zhengyi Lu (Image Refinement) |
《Type-R: Automatically Retouching Typos for Text-to-Image Generation》 (ArXiv) |
|
| 2025.08.01 |
Tianyuan Yao (LLM) |
《SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator》 (ICML2025) |
|
| 2025.08.01 |
Tianyuan Yao (diffusion, CNN) |
《DiC: Rethinking Conv3x3 Designs in Diffusion Models》 (CVPR2025) |
|
| 2025.08.01 |
Tianyuan Yao (LLM) |
《Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation》 (Google DeepMind) |
|
| 2025.07.25 |
Xindong Zheng (Semi-Supervised Segmentation) |
《Annotation Ambiguity Aware Semi-Supervised Medical Image Segmentation》 (CVPR2025) |
|
| 2025.07.25 |
Xindong Zheng (Diffusion) |
《Anatomical Consistency and Adaptive Prior-informed Transformation for Multi-contrast MR Image Synthesis via Diffusion Model》 (CVPR2025) |
|
| 2025.07.25 |
Xindong Zheng (Anomaly Detection) |
《PIAD: Pose and Illumination agnostic Anomaly Detection》 (CVPR2025) |
|
| 2025.07.18 |
Marilyn Lionts (VLM) |
《Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models》 (CVPR2025) |
|
| 2025.07.18 |
Marilyn Lionts (Computer Vision Video) |
《Navigation World Models》 (CVPR2025) |
|
| 2025.07.18 |
Marilyn Lionts (MLLM) |
《Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens》 (CVPR2025) |
|
| 2025.06.27 |
Yanfan Zhu (ReID) |
《From Poses to Identity: Training‑Free Person Re‑Identification via Feature Centralization》 (CVPR2025) |
|
| 2025.06.27 |
Yanfan Zhu (Reconstruction) |
《Reconstructing Humans with a Biomechanically Accurate Skeleton》 (CVPR2025) |
|
| 2025.06.27 |
Yanfan Zhu (Reconstruction) |
《Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass》 (CVPR2025) |
|
| 2025.06.20 |
Junlin Guo (Model Explanation & Visualization) |
《Interpreting Object-level Foundation Models via Visual Precision Search》 (CVPR2025) |
|
| 2025.06.20 |
Junlin Guo (3D Visual Grounding & Transformer) |
《VGGT: Visual Geometry Grounded Transformer》 (CVPR2025) |
|
| 2025.06.20 |
Junlin Guo (Pathology & Foundation Model) |
《A whole-slide foundation model for digital pathology from real-world data》 (Nature 2024) |
|
| 2025.05.16 |
Zhengyi Lu (3D Diffusion & Mesh) |
《MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation》 (ArXiv) |
|
| 2025.05.16 |
Zhengyi Lu (3D Diffusion & Mesh) |
《One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion》 (ArXiv) |
|
| 2025.05.16 |
Zhengyi Lu (3D Diffusion & Mesh) |
《One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization》 (ArXiv) |
|
| 2025.04.25 |
Yuechen Yang (Pathomics) |
《Comparison and Optimization of Cellular Neighbor Preference Methods for Quantitative Tissue Analysis》 |
|
| 2025.04.25 |
Yuechen Yang (Pathomics) |
《Clinical Relevance of Computational Pathology Analysis of Interplay Between Kidney Microvasculature and Interstitial Microenvironment》 (ArXiv) |
|
| 2025.04.25 |
Yuechen Yang (Pathomics) |
《Large-scale extraction of interpretable features provides new insights into kidney histopathology – A proof-of-concept study》 |
|
| 2025.04.25 |
Chongyu Qu (Quantization Method) |
《SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models》 (ArXiv) |
|
| 2025.04.25 |
Chongyu Qu (Quantization Method) |
《SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models》 (ArXiv) |
|
| 2025.04.25 |
Chongyu Qu (Quantization Method) |
《QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs》 (ArXiv) |
|
| 2025.04.18 |
Yanfan Zhu (TDA) |
《TopOC: Topological Deep Learning for Ovarian and Breast Cancer Diagnosis》 (ArXiv) |
|
| 2025.04.18 |
Yanfan Zhu (TDA) |
《PI-Att: Topology Attention for Segmentation Networks through Adaptive Persistence Image Representation》 (ArXiv) |
|
| 2025.04.18 |
Yanfan Zhu (TDA) |
《Topologically Faithful Multi-class Segmentation in Medical Images》 (ArXiv) |
|
| 2025.04.11 |
Junchao Zhu (Diffusion) |
《DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation》 (CVPR 2025) |
|
| 2025.04.11 |
Junchao Zhu (Generation Model) |
《Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step》 (ArXiv Jan 2025) |
|
| 2025.04.11 |
Junchao Zhu (Diffusion) |
《Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models》 (CVPR 2025) |
|
| 2025.04.04 |
Marilyn Lionts (Diffusion) |
《Unified Multimodal Discrete Diffusion》 (ArXiv March 2025) |
|
| 2025.04.04 |
Marilyn Lionts (AI Ethics) |
《Users Favor LLM-Generated Content—Until They Know It’s AI》 (ArXiv February 2025) |
|
| 2025.04.04 |
Marilyn Lionts (AI Ethics) |
《Position: Model Collapse Does Not Mean What You Think》 (ArXiv March 2025) |
|
| 2025.03.28 |
Tianyuan Yao (Transformer, PE) |
《RoFormer: Enhanced Transformer with Rotary Position Embedding》 (ArXiv) |
|
| 2025.03.28 |
Tianyuan Yao (Transformer, PE) |
《Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer》 (ArXiv) |
|
| 2025.03.28 |
Tianyuan Yao (Transformer, PE) |
《Length Generalization of Causal Transformers without Position Encoding》 (ArXiv) |
|
| 2025.03.07 |
Zhengyi Lu (Cinemagraph) |
《StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN》 (ArXiv) |
|
| 2025.03.07 |
Zhengyi Lu (GAN&Diffusion) |
《Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation》 (ArXiv) |
|
| 2025.03.07 |
Zhengyi Lu (GAN&Diffusion) |
《When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation》 (ArXiv) |
|
| 2025.02.28 |
Yanfan Zhu (LLM Sparsity) |
《Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters》 (ArXiv) |
|
| 2025.02.28 |
Yanfan Zhu (Framework Acceleration) |
《A Multi-Level Framework for Accelerating Training Transformer Models》 (ArXiv) |
|
| 2025.02.28 |
Yanfan Zhu (Hardware Acceleration) |
《FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision》 (ArXiv) |
|
| 2025.02.21 |
Yuechen Yang (Mesh Generation) |
《MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model》 (ArXiv) |
|
| 2025.02.21 |
Yuechen Yang (Mesh Generation) |
《MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers》 (ArXiv) |
|
| 2025.02.21 |
Yuechen Yang (Mesh Generation) |
《MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization》 (ArXiv) |
|
| 2025.01.17 |
Juming Xiong (PPT Agent) |
《PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides》 (ArXiv) |
|
| 2025.01.17 |
Juming Xiong (PPT Agent) |
《AUTOPRESENT: Designing Structured Visuals from Scratch》 (ArXiv) |
|
| 2025.01.17 |
Juming Xiong (PPT Agent) |
《Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach》 (ArXiv) |
|
| 2025.01.10 |
Tianyuan Yao (Large Language Model) |
《DeepSeek-V3 Technical Report》 (ArXiv) |
|
| 2025.01.10 |
Tianyuan Yao (Large Language Model) |
《DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model》 (ArXiv) |
|
| 2025.01.10 |
Tianyuan Yao (Large Language Model) |
《DeepSeek LLM: Scaling Open-Source Language Models with Longtermism》 (ArXiv) |
|
| 2024.12.13 |
Marilyn Lionts (Foundation Model) |
《Solaris: A Foundation Model of the Sun》 |
|
| 2024.12.13 |
Marilyn Lionts (LLM) |
《Star Attention: Efficient LLM Inference over Long Sequences》 |
|
| 2024.12.13 |
Marilyn Lionts (LLM) |
《Ring Attention with Blockwise Transformers for Near-Infinite Context》 (NeurIPS2024) |
|
| 2024.12.06 |
Junchao Zhu (Generative model) |
《Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction》 (NeurIPS2024) |
|
| 2024.12.06 |
Junchao Zhu (LLM) |
《RHO-1: Not All Tokens Are What You Need》 (NeurIPS2024) |
|
| 2024.12.06 |
Junchao Zhu (GNN) |
《Dynamic Graph Representation with Knowledge-Aware Attention for Histopathology Whole Slide Image Analysis》 (CVPR2024) |
|
| 2024.10.18 |
Junchao Zhu (GNN+Super-resolution) |
《Image Processing GNN: Breaking Rigidity in Super-Resolution》 (CVPR2024) |
|
| 2024.10.18 |
Junchao Zhu (GNN+Finetuning) |
《Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns》 (AAAI2024) |
|
| 2024.10.18 |
Junchao Zhu (Spatial Transcriptomics) |
《Accurate spatial gene expression prediction by integrating multi-resolution features》 (CVPR2024) |
|
| 2024.10.04 |
Yuechen Yang (Generative model) |
《Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization》 (CVPR2023) |
|
| 2024.10.04 |
Yuechen Yang (Generative model) |
《Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation》 (CVPR2023) |
|
| 2024.10.04 |
Yuechen Yang (Generative model) |
《Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction》 |
|
| 2024.09.27 |
Junlin Guo (Vision-language model) |
《Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning》 (Nature Communications) |
|
| 2024.09.27 |
Junlin Guo (Vision-language model) |
《Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding》 (Arxiv) |
|
| 2024.09.27 |
Junlin Guo (Segmentation) |
《Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding》 (CVPR2024) |
|
| 2024.09.20 |
Juming Xiong (Image Registration) |
《RegWSI: Whole slide image registration using combined deep feature-and intensity-based methods: Winner of the ACROBAT 2023 challenge》 (Computer Methods and Programs in Biomedicine) |
|
| 2024.09.20 |
Juming Xiong (Image Registration) |
《Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training》 (TMI) |
|
| 2024.09.20 |
Juming Xiong (Image Segmentation) |
《Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation》 (MICCAI) |
|
| 2024.09.13 |
Cathy Cui (Vision-language model) |
《Segment Everything Everywhere All at Once》 (NeurIPS 2023) |
|
| 2024.09.13 |
Cathy Cui (Vision-language model) |
《Semantic-SAM: Segment and Recognize Anything at Any Granularity》 (ArXiv) |
|
| 2024.09.13 |
Cathy Cui (Vision-language model) |
《BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once》 (ArXiv) |
|
| 2024.09.06 |
Ruining Deng (GAN-based application) |
《CP2Image: Generating high-quality single-cell images using CellProfiler representations》 (MIDL2023) |
|
| 2024.09.06 |
Ruining Deng (Image Registration) |
《Unsupervised Histological Image Registration Using Structural Feature Guided Convolutional Neural Network》 (IEEE TMI) |
|
| 2024.09.06 |
Ruining Deng (Vision-Language model) |
《ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification》 (CVPR2024) |
|
| 2024.08.30 |
Tianyuan Yao (Vision language Model) |
《BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models》 (ArXiv) |
|
| 2024.08.30 |
Tianyuan Yao (Vision language Model) |
《BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation》 (ArXiv) |
|
| 2024.08.30 |
Tianyuan Yao (Vision language Model) |
《Align before Fuse: Vision and Language Representation Learning with Momentum Distillation》 (ArXiv) |
|
| 2024.08.23 |
Marilyn Lionts (digital pathology virtual staining) |
《Virtual histological staining of unlabeled autopsy tissue》 (Nature Communications 2024) |
|
| 2024.08.23 |
Marilyn Lionts (LLM) |
《META-REWARDING LANGUAGE MODELS: Self-Improving Alignment with LLM-as-a-Meta-Judge》 (ArXiv 2024) |
|
| 2024.08.23 |
Marilyn Lionts (AI Safety) |
《Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?》 (ArXiv 2024) |
|
| 2024.07.26 |
Junchao Zhu (pseudo label + semi-supervised learning) |
《Co-training with High-Confidence Pseudo Labels for Semi-supervised Medical Image Segmentation》 (IJCAI 2023) |
|
| 2024.07.26 |
Junchao Zhu (pseudo label + semi-supervised learning) |
《Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation》 (CVPR2023) |
|
| 2024.07.26 |
Junchao Zhu (pseudo label + semi-supervised learning) |
《Mutual learning with reliable pseudo label for semi-supervised medical image segmentation》 (MEDIA) |
|
| 2024.07.19 |
Yuechen Yang (image analysis toolbox) |
《TIAToolbox as an end-to-end library for advanced tissue image analytics》 (Communications Medicine 2022) |
|
| 2024.07.19 |
Yuechen Yang (feature extraction + ML) |
《Classification of Citrus Type Based on Leaf Image Using Shape Extraction and GLCM with the Decision Tree Method》 (IEEE 2021) |
|
| 2024.07.19 |
Yuechen Yang (feature extraction + ML) |
《Sliding Window Based Support Vector Machine System for Classification of Breast Cancer Using Histopathological Microscopic Images》 (IETE 2019) |
|
| 2024.07.05 |
Ruining Deng (Multi-modal Learning) |
《Transcriptomics-guided Slide Representation Learning in Computational Pathology》 (CVPR2024) |
|
| 2024.07.05 |
Ruining Deng (Multi-rater Learning) |
《Stochastic In-Context Learning for Medical Image Segmentation》 (CVPR2024) |
|
| 2024.07.05 |
Ruining Deng (Continual Learning) |
《Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning》 (CVPR2024) |
|
| 2024.06.21 |
Juming Xiong (Image Stitching) |
《Unsupervised Deep Image Stitching: Reconstructing Stitched Features to Images》(IEEE TRANSACTIONS ON IMAGE PROCESSING) |
|
| 2024.06.21 |
Juming Xiong (Image Stitching) |
《Parallax-Tolerant Unsupervised Deep Image Stitching》 |
|
| 2024.06.21 |
Juming Xiong (Image Stitching) |
《Implicit Neural Image Stitching With Enhanced and Blended Feature Reconstruction》 |
|
| 2024.06.14 |
Tianyuan Yao (Time series foundation model) |
《Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting》 |
|
| 2024.06.14 |
Tianyuan Yao (Time series foundation model) |
《Spatial-Temporal Transformer Networks for Traffic Flow Forecasting》 |
|
| 2024.06.14 |
Tianyuan Yao (Time series foundation model) |
《Foundation Models for Time Series Analysis: A Tutorial and Survey》 |
|
| 2024.05.24 |
Marilyn Lionts (Transformers) |
《Improving Transformers Using Faithful Positional Encoding》 (ArXiv) |
|
| 2024.05.24 |
Marilyn Lionts (Transformers) |
《Zero-Shot Tokenizer Transfer》 (ArXiv) |
|
| 2024.05.24 |
Marilyn Lionts (Language Models) |
《Observational Scaling Laws and the Predictability of Language Model Performance》 (ArXiv) |
|
| 2024.05.03 |
Junlin Guo (RLHF + Large Language Model) |
《Aligning Large Multimodal Models with Factually Augmented RLHF》 (ArXiv) |
|
| 2024.05.03 |
Junlin Guo (RLHF + Diffusion Model) |
《Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model》 (CVPR2024) |
|
| 2024.04.26 |
Tianyuan Yao (Large language Model) |
《Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking》 (ArXiv) |
|
| 2024.04.26 |
Tianyuan Yao (Large language Model) |
《Mixture-of-Depths: Dynamically allocating compute in transformer-based language models》 (ArXiv) |
|
| 2024.04.26 |
Tianyuan Yao (Large language Model) |
《Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention》 (ArXiv) |
|
| 2024.04.19 |
Marilyn Lionts (Spatial Awareness LLMs) |
《BLINK: Multimodal Large Language Models Can See but Not Perceive》 (ArXiv) |
|
| 2024.04.19 |
Marilyn Lionts (Spatial Awareness LLMs) |
《Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models》 (ArXiv) |
|
| 2024.04.19 |
Marilyn Lionts (Adversarial LLMs) |
《Manipulating Large Language Models to Increase Product Visibility》 (ArXiv) |
|
| 2024.04.12 |
Quan Liu (Small Language Model) |
《Textbooks Are All You Need》 (ArXiv) |
|
| 2024.04.12 |
Quan Liu (Small Language Model) |
《Small Models are Valuable Plug-ins for Large Language Models》 (ArXiv) |
|
| 2024.04.12 |
Quan Liu (Small Language Model) |
《MobileVLM V2: Faster and Stronger Baseline for Vision Language Model》 (ArXiv) |
|
| 2024.04.05 |
Ruining Deng (Class-incremental Learning) |
《PLOP: Learning without Forgetting for Continual Semantic Segmentation》 (CVPR2021) |
|
| 2024.04.05 |
Ruining Deng (Class-incremental Learning) |
《Class Similarity Weighted Knowledge Distillation for Continual Semantic Segmentation》 (CVPR2022) |
|
| 2024.04.05 |
Ruining Deng (Class-incremental Learning) |
《CoMFormer: Continual Learning in Semantic and Panoptic Segmentation》 (CVPR2023) |
|
| 2024.03.29 |
Cathy Cui (Efficient Model) |
《PromptKD: Unsupervised Prompt Distillation for Vision-Language Models》 |
|
| 2024.03.29 |
Cathy Cui (Efficient Model) |
《Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts》 |
|
| 2024.03.29 |
Cathy Cui (Efficient Model) |
《EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything》 |
|
| 2024.03.22 |
Juming Xiong (Generative Model) |
《Endora: Video Generation Models as Endoscopy Simulators》 |
|
| 2024.03.22 |
Juming Xiong (Image Segmentation) |
《OMG-Seg: Is One Model Good Enough For All Segmentation》(CVPR 2024) |
|
| 2024.03.22 |
Juming Xiong (Image registration) |
《Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration》(CVPR 2024) |
|
| 2024.03.15 |
Yucheng Tang (Autoregressive Models) |
《Taming Transformers for High-Resolution Image Synthesis》(CVPR 2021) |
|
| 2024.03.15 |
Yucheng Tang (Autoregressive Models) |
《Sequential Modeling Enables Scalable Learning for Large Vision Models》 |
|
| 2024.03.15 |
Yucheng Tang (Autoregressive Models) |
《VILA: On Pre-training for Visual Language Models》(CVPR 2024) |
|
| 2024.03.01 |
Junlin Guo (Visual Language model + Dataset denoising) |
《Filtering, distillation, and hard negatives for vision-language pre-training》(CVPR 2023) |
|
| 2024.03.01 |
Junlin Guo (Foundation model + Weakly supervised learning) |
《Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation》(CVPR 2023) |
|
| 2024.03.01 |
Junlin Guo (Self-supervised Pre-training) |
《Geometric Visual Similarity Learning in 3D Medical Image Self-supervised Pre-training》(CVPR 2023) |
|
| 2024.02.23 |
Tianyuan Yao (Vision 'language' Model) |
《Images Speak in Images: A Generalist Painter for In-Context Visual Learning》 |
|
| 2024.02.23 |
Tianyuan Yao (Machine unlearning) |
《UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models》 |
|
| 2024.02.16 |
Marilyn Lionts (Unlearnable Datasets) |
《UNLEARNABLE EXAMPLES: MAKING PERSONAL DATA UNEXPLOITABLE》(ICLR2021) |
|
| 2024.02.16 |
Marilyn Lionts (Unlearnable Datasets) |
《CUDA: Convolution-based Unlearnable Datasets》(CVPR 2023) |
|
| 2024.02.16 |
Marilyn Lionts (Unlearnable Datasets) |
《Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples》(CVPR 2023) |
|
| 2024.02.09 |
Quan Liu (Multi-modal Large Language Models (MLLM)) |
《Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization》(ArXiv) |
|
| 2024.02.09 |
Quan Liu (MLLM) |
《GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest》(ArXiv) |
|
| 2024.02.09 |
Quan Liu (MLLM) |
《DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding》(ArXiv) |
|
| 2024.02.02 |
Ruining Deng (Hierarchical Semantic Segmentation) |
《Deep Hierarchical Semantic Segmentation》 (CVPR2022) |
|
| 2024.02.02 |
Ruining Deng (Hierarchical Semantic Segmentation) |
《Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers》 (CVPR2022) |
|
| 2024.02.02 |
Ruining Deng (Universal segmentation) |
《UniverSeg: Universal Medical Imaging Segmentation》 (ICCV2023) |
|
| 2024.01.26 |
Can(Cathy) Cui (Vision Language Model) |
《LISA: Reasoning Segmentation via Large Language Model》 (ArXiv) |
|
| 2024.01.26 |
Can(Cathy) Cui (Vision Language Model) |
《Making Large Multimodal Models Understand Arbitrary Visual Prompts》(ArXiv) |
|
| 2024.01.26 |
Can(Cathy) Cui (Network Structure) |
《U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation》(ArXiv) |
|
| 2024.01.12 |
Yucheng Tang (Efficient ViT) |
《EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction》 (ICCV 2023) |
|
| 2024.01.12 |
Yucheng Tang (Sparse ViT) |
《SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer》 (CVPR 2023) |
|
| 2024.01.12 |
Yucheng Tang (Open-Vocabulary SAM) |
《Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively》 |
|
| 2023.11.17 |
Dr. Huo (Spatial Transcriptomics) |
《Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics》 (Science 2016) |
|
| 2023.11.17 |
Dr. Huo (Spatial Transcriptomics) |
《Spatially Resolved Transcriptomes—Next Generation Tools for Tissue Exploration》 (BioEssays 2020) |
|
| 2023.11.17 |
Dr. Huo (Spatial Transcriptomics) |
《Alignment and Integration of Spatial Transcriptomics Data》 (Nature Methods 2022) |
|
| 2023.11.10 |
Quan Liu (Vision Language Foundation Model) |
《Multimodal Few-Shot Learning with Frozen Language Models》 (NeurIPS 2021) |
|
| 2023.11.10 |
Quan Liu (Vision Language Foundation Model) |
《Frozen Transformers in Language Models Are Effective Visual Encoder Layers》 (arxiv) |
|
| 2023.11.10 |
Quan Liu (Transformer CNN backbone comparison) |
《ConvNets Match Vision Transformers at Scale》 (DeepMind) |
|
| 2023.11.03 |
Junlin Guo (Long-Tailed Learning + Knowledge Distillation) |
《Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation》 (CVPR 2023) |
|
| 2023.11.03 |
Junlin Guo (Universal instance cell segmentation) |
《Cellpose: a generalist algorithm for cellular segmentation》 (Nature Methods 2021) |
|
| 2023.11.03 |
Junlin Guo (Universal instance cell segmentation + Harmony) |
《MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy》 (NeurIPS 2022) |
|
| 2023.10.27 |
Marilyn Lionts (Variational Autoencoders and Active Learning) |
《An Active Learning Method Based on Variational Autoencoder and DBSCAN Clustering》 (2021) |
|
| 2023.10.27 |
Marilyn Lionts (Variational Autoencoders and Active Learning) |
《The Power of Ensembles for Active Learning in Image Classification》 (CVPR 2018) |
|
| 2023.10.27 |
Marilyn Lionts (Variational Autoencoders and Active Learning) |
《Variational Adversarial Active Learning》 (ICCV 2019) |
|
| 2023.10.20 |
Can(Cathy) Cui (Anomaly Detection and Localization) |
《Anomaly Detection via Reverse Distillation from One-Class Embedding》 (CVPR2022) |
|
| 2023.10.20 |
Can(Cathy) Cui (Anomaly Detection and Localization) |
《Revisiting Reverse Distillation for Anomaly Detection》 (CVPR2023) |
|
| 2023.10.20 |
Can(Cathy) Cui (Anomaly Detection and Localization) |
《ReContrast: Domain-Specific Anomaly Detection via Contrastive Reconstruction》 (NeurIPS) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《Flamingo: a Visual Language Model for Few-Shot Learning》 (DeepMind) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《PaLM: Scaling Language Modeling with Pathways》 (Google) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《PaLM-E: An Embodied Multimodal Language Model》 (Google) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《GPT-4 Technical Report》 (OpenAI) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《LLaMA: Open and Efficient Foundation Language Models》 (Meta) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model) |
《LLaVA: Visual Instruction Tuning》 (Microsoft, UWM) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model --- Medical) |
《Med-PaLM: Large Language Models Encode Clinical Knowledge》 (Google) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model --- Medical) |
《BiomedCLIP: Large-Scale Domain-Specific Pretraining for Biomedical Vision-Language Processing》 (Microsoft) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model --- Medical) |
《LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day》 (Microsoft) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model --- Medical) |
《Med-Flamingo: A Multimodal Medical Few-shot Learner》 (Stanford) |
|
| 2023.10.13 |
Yucheng Tang (Vision Language Foundation Model --- Medical) |
《Towards Generalist Foundation Model for Radiology》 (Shanghai AI Lab) |
|
| 2023.10.06 |
Dr. Huo (Vision language model) |
《CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection》 (ArXiv) |
|
| 2023.10.06 |
Dr. Huo (Fast data curation) |
《Annotating 8,000 Abdominal CT Volumes for Multi-Organ Segmentation in Three Weeks》 (ICCV 2023) |
|
| 2023.10.06 |
Dr. Huo (Transformer backbone) |
《UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation》 (MedIA 2023) |
|
| 2023.09.22 |
Tianyuan Yao (Vision language model) |
《BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning》 (AAAI 2023) |
|
| 2023.09.22 |
Tianyuan Yao (Vision language model) |
《PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents》 (MICCAI 2023) |
|
| 2023.09.22 |
Tianyuan Yao (Representation disentanglement + segmentation) |
《Directional Connectivity-based Segmentation of Medical Images》 (CVPR 2023) |
|
| 2023.09.22 |
Tianyuan Yao (Semi-supervised Segmentation) |
《Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation》 (CVPR 2023) |
|
| 2023.09.15 |
Ruining Deng (Prompt-based Segmentation) |
《Incrementer: Transformer for Class-Incremental Semantic Segmentation with Knowledge Distillation Focusing on Old Class》 (CVPR2023) |
|
| 2023.09.15 |
Ruining Deng (Prompt-based Segmentation) |
《SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning》 (ICCV) |
|
| 2023.09.15 |
Ruining Deng (Prompt-based Segmentation) |
《ProSFDA: Prompt Learning based Source-free Domain Adaptation for Medical Image Segmentation》 (ArXiv) |
|
| 2023.09.08 |
Dr. Huo (Text-to-image Segmentation) |
《Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models》 (ArXiv) |
|
| 2023.09.08 |
Dr. Huo (Foundation Models) |
《DINOv2 from Meta AI – Finally a Foundational Model in Computer Vision》 (Web Site) (ArXiv) |
|
| 2023.09.08 |
Dr. Huo (Foundation Models) |
《SAM-Med2D》 (ArXiv) |
|
| 2023.08.25 |
Quan Liu (Self-supervised Learning) |
《EMP-SSL: Towards Self-Supervised Learning in One Training Epoch》 (CVPR 2023) |
|
| 2023.08.25 |
Quan Liu (Vision language model + zero-shot learning) |
《Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images》 (CVPR 2023) |
|
| 2023.08.25 |
Quan Liu (Image perturbation) |
《Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation》 (CVPR 2023) |
|