hrlblab/journal_club

Journal-Club

Time: Friday morning 10:00 - 10:30 AM, FGH 313

Paper-Reading-Group

Agenda

Date Speaker Paper Remark
2026.06.05 Yanfan Zhu
(Agents)
《LightMem: Lightweight and Efficient Memory-Augmented Generation》 (ICLR 2026)
2026.06.05 Yanfan Zhu
(Agents)
《Continual Harness: Online Adaptation for Self-Improving Foundation Agents》 (arXiv)
2026.06.05 Yanfan Zhu
(Agents)
《OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models》 (WWW 2026)
2026.05.15 Marilyn Lionts
(DNA Barcoding)
《DNA barcoding increases the taxonomic resolution of shark diet analysis compared to morphological stomach contents identification》 (2026)
2026.05.15 Marilyn Lionts
(CT for Food Science)
《Morphometric Characterization Workflows of Praline Chocolates using X-ray Computed Tomography》 (2026)
2026.05.15 Marilyn Lionts
(Spectroscopy for Food Science)
《Analytical Chemistry Nutritional Insights: Exploring ED-XRF, LIBS, and Chemometric Techniques for Macronutrient Determination in Non-conventional Food Plants (PANC)》 (2026)
2026.05.07 Zhengyi Lu
(RL in LLM)
《PretrainZero: Reinforcement Active Pretraining》 (ICML 2026)
2026.05.07 Zhengyi Lu
(RL in LLM)
《Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning》 (ICML 2026)
2026.05.07 Zhengyi Lu
(RL in LLM)
《Solving Physics Olympiad via Reinforcement Learning on Physics Simulators》 (ICML 2026)
2026.04.29 Yuechen Yang
(LLM)
《Beyond Pixel Agreement: Large Language Models as Clinical Guardrails for Reliable Medical Image Segmentation》 (arXiv)
2026.04.29 Yuechen Yang
(VLM)
《VLM-Guided Iterative Refinement for Surgical Image Segmentation with Foundation Models》 (arXiv)
2026.04.29 Yuechen Yang
(QC)
《SegQC: a segmentation network-based framework for multi-metric segmentation quality control and segmentation error detection in volumetric medical images》 (arXiv)
2026.04.16 Junchao Zhu
(LLM)
《When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems》 (ICLR 2026)
2026.04.16 Junchao Zhu
(Agents)
《Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning》 (ICLR 2025)
2026.04.16 Junchao Zhu
(Agents)
《scPilot: Large Language Model Reasoning Toward Automated Single-Cell Analysis and Discovery》 (Neurips 2025)
2026.04.10 Junlin Guo
(Deep Fake)
《The Coherence Trap: When MLLM-Crafted Narratives Exploit Manipulated Visual Contexts》 (CVPR 2026)
2026.04.10 Junlin Guo
(VLM Foundation Models)
《Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model》 (CVPR 2026)
2026.04.10 Junlin Guo
(VLM Hallucinations)
《Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding》 (CVPR 2026)
2026.04.03 Yanfan Zhu
(Selective Classifier)
《What Does It Take to Build a Performant Selective Classifier?》 (NeurIPS)
2026.04.03 Yanfan Zhu
(LLM Abstention)
《MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions》 (AAAI 2026)
2026.04.03 Yanfan Zhu
(OOD Detection)
《Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection》 (CVPR 2025)
2026.03.27 Zhengyi Lu
(Generation)
《Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models》 (ICLR 2026)
2026.03.27 Zhengyi Lu
(Efficient Reasoning)
《Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models》 (NeurIPS 2025)
2026.03.27 Zhengyi Lu
(Efficient Reasoning)
《ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning》 (NeurIPS 2025)
2026.03.20 Marilyn Lionts
(Raman in Food Science)
《Honey Differentiation Using Infrared and Raman Spectroscopy Analysis and the Employment of Machine-Learning-Based Authentication Models》 (2026)
2026.03.20 Marilyn Lionts
(Raman in Food Science)
《Machine learning-assisted Raman spectroscopy for non-destructive analysis of crude palm oil quality》 (2026)
2026.03.20 Marilyn Lionts
(Raman in Food Science)
《Raman on the palm: handheld Raman spectroscopy for enhanced traceability of palm oil》 (2025)
2026.02.13 Junchao Zhu
(LLM)
《NAIPv2: Debiased Pairwise Learning for Efficient Paper Quality Estimation》 (ICLR 2026)
2026.02.13 Junchao Zhu
(Agents)
《Open-World Reinforcement Learning over Long Short-Term Imagination》 (ICLR 2025)
2026.02.13 Junchao Zhu
(LLM)
《LLM DNA: Tracing Model Evolution via Functional Representations》 (ICLR 2026)
2026.02.06 Junlin Guo
(Agents)
《PaperBanana: Automating Academic Illustration for AI Scientists》 (Arxiv 2026)
2026.02.06 Junlin Guo
(CoT & VLM)
《PathReasoner-R1: Instilling Structured Reasoning into Pathology Vision-Language Model via Knowledge-Guided Policy Optimization》 (Arxiv 2026)
2026.02.06 Junlin Guo
(Pretraining & Localization)
《A multimodal vision–language model for generalizable annotation-free pathology localization》 (Nature Biomedical Engineering 2026)
2026.01.16 Yanfan Zhu
(Segmentation)
《SAM 3: Segment Anything with Concepts》 (ICLR 2026)
2026.01.16 Yanfan Zhu
(Segmentation)
《SAM-Veteran: An MLLM-Based Human-like SAM Agent for Reasoning Segmentation》 (ICLR 2026)
2026.01.16 Yanfan Zhu
(Segmentation)
《LSP-DETR: Efficient and Scalable Nuclei Segmentation in Whole Slide Images》 (Arxiv)
2026.01.08 Zhengyi Lu
(Agent Evaluation)
《Agent-as-a-Judge: Evaluate Agents with Agents》 (ICML 2025)
2026.01.08 Zhengyi Lu
(VLM Reasoning)
《More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models》 (ArXiv)
2026.01.08 Zhengyi Lu
(MLLM Reasoning)
《OneThinker: All-in-one Reasoning Model for Image and Video》 (ArXiv)
2025.12.05 Yuechen Yang
(Diffusion Model)
《Back to Basics: Let Denoising Generative Models Denoise》
2025.12.05 Yuechen Yang
(Vision Model)
《ARC Is a Vision Problem》
2025.12.05 Yuechen Yang
(LLMs)
《Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)》
2025.11.14 Junchao Zhu
(Large-language Model)
《Glyph: Scaling Context Windows via Visual-Text Compression》
2025.11.14 Junchao Zhu
(Vision-language Model)
《VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection》 (CVPR 2025)
2025.11.14 Junchao Zhu
(Vision-language Model)
《DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction》(ICCV 2025)
2025.11.09 Yanfan Zhu
(3D Reconstruction)
《Sparse3Diff: A Diffusion Framework for 3D Reconstruction from Sparse 2D Slices in Volumetric Optical Imaging》 (MICCAI 2025)
2025.11.09 Yanfan Zhu
(3D Reconstruction)
《Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild》 (CVPR2025)
2025.11.09 Yanfan Zhu
(3D Reconstruction)
《Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image》 (Arxiv)
2025.10.30 Junlin Guo
(VLM & Reinforcement Learning)
《Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models》 (IEEE TMI 2025)
2025.10.30 Junlin Guo
(LLM Agent & Reinforcement Learning)
《Agent Learning via Early Experience》 (Arxiv)
2025.10.30 Junlin Guo
(3D point cloud & 2D-3D Foundation Models)
《Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations》 (Neurips2025)
2025.10.24 Zhengyi Lu
(GUI Agent)
《Less is More: Empowering GUI Agent with Context-Aware Simplification》 (ICCV 2025)
2025.10.24 Zhengyi Lu
(MRI Generation)
《MRGen: Segmentation Data Engine for Underrepresented MRI Modalities》 (ICCV 2025)
2025.10.24 Zhengyi Lu
(MRI Segmentation)
《Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior》 (ArXiv)
2025.10.10 Marilyn Lionts
(Vector Embeddings)
《On the Theoretical Limitations of Embedding-Based Retrieval》 (Arxiv)
2025.10.10 Marilyn Lionts
(Environmental Impact)
《Measuring the environmental impact of delivering AI at Google Scale》 (Arxiv)
2025.10.10 Marilyn Lionts
(Real-time Models)
《Live Music Models》 (Arxiv)
2025.10.03 Tianyuan Yao
(Vision Transformer)
《TransNeXt: Robust Foveal Visual Perception for Vision Transformers》 (CVPR 2024)
2025.10.03 Tianyuan Yao
(Transformer)
《Agent Attention: On the Integration of Softmax and Linear Attention》 (ECCV 2024)
2025.10.03 Tianyuan Yao
(Transformer)
《Permutation Equivariance of Transformers and Its Applications》 (CVPR 2024)
2025.09.19 Yanfan Zhu
(3D Object Reconstruction)
《Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model》 (ICLR2025)
2025.09.19 Yanfan Zhu
(Generative Vision)
《4K4DGen: Panoramic 4D Generation at 4K Resolution》 (ICLR2025)
2025.09.19 Yanfan Zhu
(Monocular Depth Estimation)
《Depth Pro: Sharp Monocular Metric Depth in Less Than a Second》 (ICLR2025)
2025.09.12 Junchao Zhu
(Vision Language Model)
《Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models》 (ICLR2025)
2025.09.12 Junchao Zhu
(Vision Language Model)
《Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation》 (ICLR2025)
2025.09.12 Junchao Zhu
(Vision Language Model)
《MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models》 (ICLR2025)
2025.08.29 Chongyu Qu
(Efficient Generative Model)
《Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models》 (ICLR2025)
2025.08.29 Chongyu Qu
(Efficient Generative Model)
《DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space》 (ICCV 2025)
2025.08.29 Chongyu Qu
(Efficient Generative Model)
《DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer》 (ICCV 2025)
2025.08.15 Junlin Guo
(Vision Foundation Model)
《DINOv3》 (ArXiv)
2025.08.15 Junlin Guo
(Vision Foundation Model)
《Galileo: Learning Global & Local Features of Many Remote Sensing Modalities》 (ICLR2025)
2025.08.15 Junlin Guo
(Vision Foundation Model)
《AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities》 (CVPR2025)
2025.08.08 Zhengyi Lu
(Image Refinement)
《IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation》 (ArXiv)
2025.08.08 Zhengyi Lu
(Image Refinement)
《Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment》 (ArXiv)
2025.08.08 Zhengyi Lu
(Image Refinement)
《Type-R: Automatically Retouching Typos for Text-to-Image Generation》 (ArXiv)
2025.08.01 Tianyuan Yao
(LLM)
《SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator》 (ICML2025)
2025.08.01 Tianyuan Yao
(diffusion, CNN)
《DiC: Rethinking Conv3x3 Designs in Diffusion Models》 (CVPR2025)
2025.08.01 Tianyuan Yao
(LLM)
《Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation》 (Google DeepMind)
2025.07.25 Xindong Zheng
(Semi-Supervised Segmentation)
《Annotation Ambiguity Aware Semi-Supervised Medical Image Segmentation》 (CVPR2025)
2025.07.25 Xindong Zheng
(Diffusion)
《Anatomical Consistency and Adaptive Prior-informed Transformation for Multi-contrast MR Image Synthesis via Diffusion Model》 (CVPR2025)
2025.07.25 Xindong Zheng
(Anomaly Detection)
《PIAD: Pose and Illumination agnostic Anomaly Detection》 (CVPR2025)
2025.07.18 Marilyn Lionts
(VLM)
《Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models》 (CVPR2025)
2025.07.18 Marilyn Lionts
(Computer Vision Video)
《Navigation World Models》 (CVPR2025)
2025.07.18 Marilyn Lionts
(MLLM)
《Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens》 (CVPR2025)
2025.06.27 Yanfan Zhu
(ReID)
《From Poses to Identity: Training‑Free Person Re‑Identification via Feature Centralization》 (CVPR2025)
2025.06.27 Yanfan Zhu
(Reconstruction)
《Reconstructing Humans with a Biomechanically Accurate Skeleton》 (CVPR2025)
2025.06.27 Yanfan Zhu
(Reconstruction)
《Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass》 (CVPR2025)
2025.06.20 Junlin Guo
(Model Explanation & Visualization)
《Interpreting Object-level Foundation Models via Visual Precision Search》 (CVPR2025)
2025.06.20 Junlin Guo
(3D Visual Grounding & Transformer)
《VGGT: Visual Geometry Grounded Transformer》 (CVPR2025)
2025.06.20 Junlin Guo
(Pathology & Foundation Model)
《A whole-slide foundation model for digital pathology from real-world data》 (Nature 2024)
2025.05.16 Zhengyi Lu
(3D Diffusion & Mesh)
《MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation》 (ArXiv)
2025.05.16 Zhengyi Lu
(3D Diffusion & Mesh)
《One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion》 (ArXiv)
2025.05.16 Zhengyi Lu
(3D Diffusion & Mesh)
《One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization》 (ArXiv)
2025.04.25 Yuechen Yang
(Pathomics)
《Comparison and Optimization of Cellular Neighbor Preference Methods for Quantitative Tissue Analysis》
2025.04.25 Yuechen Yang
(Pathomics)
《Clinical Relevance of Computational Pathology Analysis of Interplay Between Kidney Microvasculature and Interstitial Microenvironment》 (ArXiv)
2025.04.25 Yuechen Yang
(Pathomics)
《Large-scale extraction of interpretable features provides new insights into kidney histopathology – A proof-of-concept study》
2025.04.25 Chongyu Qu
(Quantization Method)
《SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models》 (ArXiv)
2025.04.25 Chongyu Qu
(Quantization Method)
《SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models》 (ArXiv)
2025.04.25 Chongyu Qu
(Quantization Method)
《QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs》 (ArXiv)
2025.04.18 Yanfan Zhu
(TDA)
《TopOC: Topological Deep Learning for Ovarian and Breast Cancer Diagnosis》 (ArXiv)
2025.04.18 Yanfan Zhu
(TDA)
《PI-Att: Topology Attention for Segmentation Networks through Adaptive Persistence Image Representation》 (ArXiv)
2025.04.18 Yanfan Zhu
(TDA)
《Topologically Faithful Multi-class Segmentation in Medical Images》 (ArXiv)
2025.04.11 Junchao Zhu
(Diffusion)
《DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation》 (CVPR 2025)
2025.04.11 Junchao Zhu
(Generation Model)
《Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step》 (ArXiv Jan 2025)
2025.04.11 Junchao Zhu
(Diffusion)
《Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models》 (CVPR 2025)
2025.04.04 Marilyn Lionts
(Diffusion)
《Unified Multimodal Discrete Diffusion》 (ArXiv March 2025)
2025.04.04 Marilyn Lionts
(AI Ethics)
《Users Favor LLM-Generated Content—Until They Know It’s AI》 (ArXiv February 2025)
2025.04.04 Marilyn Lionts
(AI Ethics)
《Position: Model Collapse Does Not Mean What You Think》 (ArXiv March 2025)
2025.03.28 Tianyuan Yao
(Transformer, PE)
《RoFormer: Enhanced Transformer with Rotary Position Embedding》 (ArXiv)
2025.03.28 Tianyuan Yao
(Transformer, PE)
《Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer》 (ArXiv)
2025.03.28 Tianyuan Yao
(Transformer, PE)
《Length Generalization of Causal Transformers without Position Encoding》 (ArXiv)
2025.03.07 Zhengyi Lu
(Cinemagraph)
《StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN》 (ArXiv)
2025.03.07 Zhengyi Lu
(GAN&Diffusion)
《Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation》 (ArXiv)
2025.03.07 Zhengyi Lu
(GAN&Diffusion)
《When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation》 (ArXiv)
2025.02.28 Yanfan Zhu
(LLM Sparsity)
《Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters》 (ArXiv)
2025.02.28 Yanfan Zhu
(Framework Acceleration)
《A Multi-Level Framework for Accelerating Training Transformer Models》 (ArXiv)
2025.02.28 Yanfan Zhu
(Hardware Acceleration)
《FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision》 (ArXiv)
2025.02.21 Yuechen Yang
(Mesh Generation)
《MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model》 (ArXiv)
2025.02.21 Yuechen Yang
(Mesh Generation)
《MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers》 (ArXiv)
2025.02.21 Yuechen Yang
(Mesh Generation)
《MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization》 (ArXiv)
2025.01.17 Juming Xiong
(PPT Agent)
《PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides》 (ArXiv)
2025.01.17 Juming Xiong
(PPT Agent)
《AUTOPRESENT: Designing Structured Visuals from Scratch》 (ArXiv)
2025.01.17 Juming Xiong
(PPT Agent)
《Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach》 (ArXiv)
2025.01.10 Tianyuan Yao
(Large Language Model)
《DeepSeek-V3 Technical Report》 (ArXiv)
2025.01.10 Tianyuan Yao
(Large Language Model)
《DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model》 (ArXiv)
2025.01.10 Tianyuan Yao
(Large Language Model)
《DeepSeek LLM: Scaling Open-Source Language Models with Longtermism》 (ArXiv)
2024.12.13 Marilyn Lionts
(Foundation Model)
《Solaris: A Foundation Model of the Sun》
2024.12.13 Marilyn Lionts
(LLM)
《Star Attention: Efficient LLM Inference over Long Sequences》
2024.12.13 Marilyn Lionts
(LLM)
《Ring Attention with Blockwise Transformers for Near-Infinite Context》 (Neurips2024)
2024.12.06 Junchao Zhu
(Generative model)
《Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction》 (Neurips2024)
2024.12.06 Junchao Zhu
(LLM)
《RHO-1: Not All Tokens Are What You Need》 (Neurips2024)
2024.12.06 Junchao Zhu
(GNN)
《Dynamic Graph Representation with Knowledge-Aware Attention for Histopathology Whole Slide Image Analysis》 (CVPR2024)
2024.10.18 Junchao Zhu
(GNN+Super-resolution)
《Image Processing GNN: Breaking Rigidity in Super-Resolution》 (CVPR2024)
2024.10.18 Junchao Zhu
(GNN+Finetuning)
《Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns》 (AAAI2024)
2024.10.18 Junchao Zhu
(Spatial Transcriptomics)
《Accurate spatial gene expression prediction by integrating multi-resolution features》 (CVPR2024)
2024.10.04 Yuechen Yang
(Generative model)
《Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization》 (CVPR2023)
2024.10.04 Yuechen Yang
(Generative model)
《Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation》 (CVPR2023)
2024.10.04 Yuechen Yang
(Generative model)
《Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction》
2024.09.27 Junlin Guo
(Vision-language model)
《Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning》 (Nature Communications)
2024.09.27 Junlin Guo
(Vision-language model)
《Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding》 (Arxiv)
2024.09.27 Junlin Guo
(Segmentation)
《Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding》 (CVPR2024)
2024.09.20 Juming Xiong
(Image Registration)
《RegWSI: Whole slide image registration using combined deep feature-and intensity-based methods: Winner of the ACROBAT 2023 challenge》 (Computer Methods and Programs in Biomedicine)
2024.09.20 Juming Xiong
(Image Registration)
《Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training》 (TMI)
2024.09.20 Juming Xiong
(Image Segmentation)
《Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation》 (MICCAI)
2024.09.13 Cathy Cui
(Vision-language model)
《Segment Everything Everywhere All at Once》 (NeurIPS 2023)
2024.09.13 Cathy Cui
(Vision-language model)
《Semantic-SAM: Segment and Recognize Anything at Any Granularity》 (ArXiv)
2024.09.13 Cathy Cui
(Vision-language model)
《BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once》 (ArXiv)
2024.09.06 Ruining Deng
(GAN-based application)
《CP2Image: Generating high-quality single-cell images using CellProfiler representations》 (MIDL2023)
2024.09.06 Ruining Deng
(Image Registration)
《Unsupervised Histological Image Registration Using Structural Feature Guided Convolutional Neural Network》 (IEEE TMI)
2024.09.06 Ruining Deng
(Vision-Language model)
《ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification》 (CVPR2024)
2024.08.30 Tianyuan Yao
(Vision language Model)
《BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models》 (ArXiv)
2024.08.30 Tianyuan Yao
(Vision language Model)
《BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation》 (ArXiv)
2024.08.30 Tianyuan Yao
(Vision language Model)
《Align before Fuse: Vision and Language Representation Learning with Momentum Distillation》 (ArXiv)
2024.08.23 Marilyn Lionts
(digital pathology virtual staining)
《Virtual histological staining of unlabeled autopsy tissue》 (Nature Communications 2024)
2024.08.23 Marilyn Lionts
(LLM)
《META-REWARDING LANGUAGE MODELS: Self-Improving Alignment with LLM-as-a-Meta-Judge》 (ArXiv 2024)
2024.08.23 Marilyn Lionts
(AI Safety)
《Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?》 (ArXiv 2024)
2024.07.26 Junchao Zhu
(pseudo label + semi-supervised learning)
《Co-training with High-Confidence Pseudo Labels for Semi-supervised Medical Image Segmentation》 (IJCAI 2023)
2024.07.26 Junchao Zhu
(pseudo label + semi-supervised learning)
《Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation》 (CVPR2023)
2024.07.26 Junchao Zhu
(pseudo label + semi-supervised learning)
《Mutual learning with reliable pseudo label for semi-supervised medical image segmentation》 (MEDIA)
2024.07.19 Yuechen Yang
(image analysis toolbox)
《TIAToolbox as an end-to-end library for advanced tissue image analytics》 (Communications Medicine 2022)
2024.07.19 Yuechen Yang
(feature extraction + ML)
《Classification of Citrus Type Based on Leaf Image Using Shape Extraction and GLCM with the Decision Tree Method》 (IEEE 2021)
2024.07.19 Yuechen Yang
(feature extraction + ML)
《Sliding Window Based Support Vector Machine System for Classification of Breast Cancer Using Histopathological Microscopic Images》 (IETE 2019)
2024.07.05 Ruining Deng
(Multi-modal Learning)
《Transcriptomics-guided Slide Representation Learning in Computational Pathology》 (CVPR2024)
2024.07.05 Ruining Deng
(Multi-rater Learning)
《Stochastic In-Context Learning for Medical Image Segmentation》 (CVPR2024)
2024.07.05 Ruining Deng
(Continual Learning)
《Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning》 (CVPR2024)
2024.06.21 Juming Xiong
(Image Stitching)
《Unsupervised Deep Image Stitching: Reconstructing Stitched Features to Images》(IEEE TRANSACTIONS ON IMAGE PROCESSING)
2024.06.21 Juming Xiong
(Image Stitching)
《Parallax-Tolerant Unsupervised Deep Image Stitching》
2024.06.21 Juming Xiong
(Image Stitching)
《Implicit Neural Image Stitching With Enhanced and Blended Feature Reconstruction》
2024.06.14 Tianyuan Yao
(Time series foundation model)
《Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting》
2024.06.14 Tianyuan Yao
(Time series foundation model)
《Spatial-Temporal Transformer Networks for Traffic Flow Forecasting》
2024.06.14 Tianyuan Yao
(Time series foundation model)
《Foundation Models for Time Series Analysis: A Tutorial and Survey》
2024.05.24 Marilyn Lionts
(Transformers)
《Improving Transformers Using Faithful Positional Encoding》 (ArXiv)
2024.05.24 Marilyn Lionts
(Transformers)
《Zero-Shot Tokenizer Transfer》 (ArXiv)
2024.05.24 Marilyn Lionts
(Language Models)
《Observational Scaling Laws and the Predictability of Language Model Performance》 (ArXiv)
2024.05.03 Junlin Guo
(RLHF + Large Language Model)
《Aligning Large Multimodal Models with Factually Augmented RLHF》 (ArXiv)
2024.05.03 Junlin Guo
(RLHF + Diffusion Model)
《Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model》 (CVPR2024)
2024.04.26 Tianyuan Yao
(Large language Model)
《Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking》 (ArXiv)
2024.04.26 Tianyuan Yao
(Large language Model)
《Mixture-of-Depths: Dynamically allocating compute in transformer-based language models》 (ArXiv)
2024.04.26 Tianyuan Yao
(Large language Model)
《Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention》 (ArXiv)
2024.04.19 Marilyn Lionts
(Spatial Awareness LLMs)
《BLINK: Multimodal Large Language Models Can See but Not Perceive》 (ArXiv)
2024.04.19 Marilyn Lionts
(Spatial Awareness LLMs)
《Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models》 (ArXiv)
2024.04.19 Marilyn Lionts
(Adversarial LLMs)
《Manipulating Large Language Models to Increase Product Visibility》 (ArXiv)
2024.04.12 Quan Liu
(Small Language Model)
《Textbooks Are All You Need》 (ArXiv)
2024.04.12 Quan Liu
(Small Language Model)
《Small Models are Valuable Plug-ins for Large Language Models》 (ArXiv)
2024.04.12 Quan Liu
(Small Language Model)
《MobileVLM V2: Faster and Stronger Baseline for Vision Language Model》 (ArXiv)
2024.04.05 Ruining Deng
(Class-incremental Learning)
《PLOP: Learning without Forgetting for Continual Semantic Segmentation》 (CVPR2021)
2024.04.05 Ruining Deng
(Class-incremental Learning)
《Class Similarity Weighted Knowledge Distillation for Continual Semantic Segmentation》 (CVPR2022)
2024.04.05 Ruining Deng
(Class-incremental Learning)
《CoMFormer: Continual Learning in Semantic and Panoptic Segmentation》 (CVPR2023)
2024.03.29 Cathy Cui
(Efficient Model)
《PromptKD: Unsupervised Prompt Distillation for Vision-Language Models》
2024.03.29 Cathy Cui
(Efficient Model)
《Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts》
2024.03.29 Cathy Cui
(Efficient Model)
《EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything》
2024.03.22 Juming Xiong
(Generative Model)
《Endora: Video Generation Models as Endoscopy Simulators》
2024.03.22 Juming Xiong
(Image Segmentation)
《OMG-Seg: Is One Model Good Enough For All Segmentation》(CVPR 2024)
2024.03.22 Juming Xiong
(Image registration)
《Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration》(CVPR 2024)
2024.03.15 Yucheng Tang
(Autoregressive Models)
《Taming Transformers for High-Resolution Image Synthesis》(CVPR 2021)
2024.03.15 Yucheng Tang
(Autoregressive Models)
《Sequential Modeling Enables Scalable Learning for Large Vision Models》
2024.03.15 Yucheng Tang
(Autoregressive Models)
《VILA: On Pre-training for Visual Language Models》(CVPR 2024)
2024.03.01 Junlin Guo
(Visual Language model + Dataset denoising)
《Filtering, distillation, and hard negatives for vision-language pre-training》(CVPR 2023)
2024.03.01 Junlin Guo
(Foundation model + Weakly supervised learning)
《Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation》(CVPR 2023)
2024.03.01 Junlin Guo
(Self-supervised Pre-training)
《Geometric Visual Similarity Learning in 3D Medical Image Self-supervised Pre-training》(CVPR 2023)
2024.02.23 Tianyuan Yao
(Vision 'language' Model)
《Images Speak in Images: A Generalist Painter for In-Context Visual Learning》
2024.02.23 Tianyuan Yao
(Machine unlearning)
《UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models》
2024.02.16 Marilyn Lionts
(Unlearnable Datasets)
《UNLEARNABLE EXAMPLES: MAKING PERSONAL DATA UNEXPLOITABLE》(ICLR2021)
2024.02.16 Marilyn Lionts
(Unlearnable Datasets)
《CUDA: Convolution-based Unlearnable Datasets》(CVPR 2023)
2024.02.16 Marilyn Lionts
(Unlearnable Datasets)
《Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples》(CVPR 2023)
2024.02.09 Quan Liu
(Multi-modal Large Language Models (MLLM))
《Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization》(ArXiv)
2024.02.09 Quan Liu
(MLLM)
《GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest》(ArXiv)
2024.02.09 Quan Liu
(MLLM)
《DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding》(ArXiv)
2024.02.02 Ruining Deng
(Hierarchical Semantic Segmentation)
《Deep Hierarchical Semantic Segmentation》 (CVPR2022)
2024.02.02 Ruining Deng
(Hierarchical Semantic Segmentation)
《Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers 》 (CVPR2022)
2024.02.02 Ruining Deng
(Universal segmentation)
《UniverSeg: Universal Medical Imaging Segmentation》 (ICCV2023)
2024.01.26 Can(Cathy) Cui
(Vision Language Model)
《LISA: Reasoning Segmentation via Large Language Model》 (ArXiv)
2024.01.26 Can(Cathy) Cui
(Vision Language Model)
《Making Large Multimodal Models Understand Arbitrary Visual Prompts 》(ArXiv)
2024.01.26 Can(Cathy) Cui
(Network Structure)
《U-Mamba Enhancing Long-range Dependency for Biomedical Image Segmentation》(ArXiv)
2024.01.12 Yucheng Tang
(Efficient ViT)
《EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction》 (ICCV 2023)
2024.01.12 Yucheng Tang
(Sparse ViT)
《SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer》 (CVPR 2023)
2024.01.12 Yucheng Tang
(Open-Vocabulary SAM)
《Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively》
2023.11.17 Dr. Huo
(Spatial Transcriptomics)
《Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics》 (Science 2016)
2023.11.17 Dr. Huo
(Spatial Transcriptomics)
《Spatially Resolved Transcriptomes—Next Generation Tools for Tissue Exploration》 (BioEssays 2020)
2023.11.17 Dr. Huo
(Spatial Transcriptomics)
《Alignment and Integration of Spatial Transcriptomics Data》 (Nature Methods 2022)
2023.11.10 Quan Liu
(Vision Language Foundation Model)
《Multimodal Few-Shot Learning with Frozen Language Models》 (NeurIPS 2021)
2023.11.10 Quan Liu
(Vision Language Foundation Model)
《Frozen Transformers in Language Models Are Effective Visual Encoder Layers》 (arxiv)
2023.11.10 Quan Liu
(Transformer CNN backbone comparison)
《ConvNets Match Vision Transformers at Scale》 (DeepMind)
2023.11.03 Junlin Guo
(Long-Tailed Learning + Knowledge Distillation)
《Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation》 (CVPR 2023)
2023.11.03 Junlin Guo
(Universal instance cell segmentation)
《Cellpose: a generalist algorithm for cellular segmentation》 (Nature. 2021)
2023.11.03 Junlin Guo
(Universal instance cell segmentation + Harmony)
《MEDIAR: Harmony of Data-Centric and Model-Centric for Multi-Modality Microscopy》 (NeurIPS 2022)
2023.10.27 Marilyn Lionts
(Variational Autoencoders and Active Learning)
《An Active Learning Method Based on Variational Autoencoder and DBSCAN Clustering》 (2021)
2023.10.27 Marilyn Lionts
(Variational Autoencoders and Active Learning)
《The Power of Ensembles for Active Learning in Image Classification》 (CVPR 2018)
2023.10.27 Marilyn Lionts
(Variational Autoencoders and Active Learning)
《Variational Adversarial Active Learning》 (ICCV 2019)
2023.10.20 Can(Cathy) Cui
(Anomaly Detection and Localization)
《Anomaly Detection via Reverse Distillation from One-Class Embedding》 (CVPR2022)
2023.10.20 Can(Cathy) Cui
(Anomaly Detection and Localization)
《Revisiting Reverse Distillation for Anomaly Detection》 (CVPR2023)
2023.10.20 Can(Cathy) Cui
(Anomaly Detection and Localization)
《ReContrast: Domain-Specific Anomaly Detection via Contrastive Reconstruction》 (NeurIPS)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《Flamingo: a Visual Language Model for Few-Shot Learning》 (DeepMind)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《PaLM: Scaling Language Modeling with Pathways》 (Google)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《PaLM-E: An Embodied Multimodal Language Model》 (Google)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《GPT-4 Technical Report》 (OpenAI)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《LLaMA: Open and Efficient Foundation Language Models》 (Meta)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model)
《LLaVA: Visual Instruction Tuning》 (Microsoft, UWM)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model --- Medical)
《Med-PaLM: Large Language Models Encode Clinical Knowledge》 (Google)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model --- Medical)
《BioMedCLIP: Large-Scale Domain-Specific Pretraining for Biomedical Vision-Language Processing》 (Microsoft)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model --- Medical)
《LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day》 (Microsoft)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model --- Medical)
《Med-Flamingo: A Multimodal Medical Few-shot Learner》 (Stanford)
2023.10.13 Yucheng Tang
(Vision Language Foundation Model --- Medical)
《Towards Generalist Foundation Model for Radiology》 (Shanghai AI Lab)
2023.10.6 Dr. Huo
(Vision language model)
《CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection》 (arxiv)
2023.10.6 Dr. Huo
(Fast data curation)
《Annotating 8,000 Abdominal CT Volumes for Multi-Organ Segmentation in Three Weeks》 (ICCV 2023)
2023.10.6 Dr. Huo
(Transformer backbone)
《UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation》 (MedIA 2023)
2023.9.22 Tianyuan Yao
(Vision language model)
《BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning》 (AAAI 2023)
2023.9.22 Tianyuan Yao
(Vision language model)
《PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents》 (MICCAI 2023)
2023.9.22 Tianyuan Yao
(Representation disentanglement + segmentation)
《Directional Connectivity-based Segmentation of Medical Images》 (CVPR 2023)
2023.9.22 Tianyuan Yao
(Semi-supervised Segmentation)
《Orthogonal Annotation Benefits Barely-supervised Medical Image Segmentation》 (CVPR 2023)
2023.9.15 Ruining Deng
(Prompt-based Segmentation)
《Incrementer: Transformer for Class-Incremental Semantic Segmentation with Knowledge Distillation Focusing on Old Class》 (CVPR 2023)
2023.9.15 Ruining Deng
(Prompt-based Segmentation)
《SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning》 (ICCV)
2023.9.15 Ruining Deng
(Prompt-based Segmentation)
《ProSFDA: Prompt Learning based Source-free Domain Adaptation for Medical Image Segmentation》 (ArXiv)
2023.9.08 Dr. Huo
(Text-to-image Segmentation)
《Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models》 (ArXiv)
2023.9.08 Dr. Huo
(Foundation Models)
《DINOv2 from Meta AI – Finally a Foundational Model in Computer Vision》 (Web Site) (ArXiv)
2023.9.08 Dr. Huo
(Foundation Models)
《SAM-Med2D》 (ArXiv)
2023.8.25 Quan Liu
(Self-supervised Learning)
《EMP-SSL: Towards Self-Supervised Learning in One Training Epoch》 (CVPR 2023)
2023.8.25 Quan Liu
(Vision language model + zero-shot learning)
《Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images》 (CVPR 2023)
2023.8.25 Quan Liu
(Image perturbation)
《Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation》 (CVPR 2023)

Pool of great papers from the team (Senior folks can drop papers here as potential papers to review)

  1. Ye, Shuquan, et al. "Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. [from Yuankai Huo]

  2. Xie, Ronald, et al. "MAESTER: Masked Autoencoder Guided Segmentation at Pixel Resolution for Accurate, Self-Supervised Subcellular Structure Recognition." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023. [from Yuankai Huo]

  3. Huang, Zhi, et al. "A visual–language foundation model for pathology image analysis using medical Twitter." Nature Medicine (2023): 1-10. [from Yuankai Huo]

About

journal club of HRLB lab
