
ai4colonoscopy

Intelligent Colonoscopy -- where artificial intelligence meets intestinal insights


✨ Enter AI4Colonoscopy -- where cutting-edge AI meets life-saving clinical practice

Hey there 👋 We’re not just making colonoscopies "more effective"; we're redefining what early detection can look like, and pushing the next frontier of intelligent healthcare.

With real-time lesion recognition and decision-support powered by deep learning, AI4Colonoscopy has the potential to significantly improve diagnostic accuracy, reduce missed lesions, and ultimately save lives.

Buckle up — we’re diving into a future where intelligent colonoscopy becomes the new gold standard for cancer prevention.

🙋 News

  • [2025/Dec/09] 🔥🔥 Released the Colon-X project, focusing on a critical yet underexplored transition — evolving from multimodal understanding to clinical reasoning. Read our paper: https://arxiv.org/abs/2512.03667.
  • [2024/Oct/30] 🔥 Released the IntelliScope project, pushing colonoscopy research from pure visual analysis to multimodal analysis. Read our paper: https://arxiv.org/abs/2410.17241.
  • [2024/Sep/01] Created the welcome page.

🏥 Why Should You Care About Colonoscopy?


Illustration of a colonoscope inside large intestine (colon). (Image credit: https://www.vecteezy.com)

Let's face it - colorectal cancer is a big deal. It's one of the leading causes of cancer death worldwide; for the latest research and news, see Nature's subject pages on "colorectal cancer" and "colonoscopy". Here's the good news: we can often prevent it if we catch it early. That's where colonoscopy comes in as the best preventive measure -- our best shot at finding and removing those sneaky precancerous polyps before they cause trouble.

🤖 Enter AI: The Game-Changer

We're using cutting-edge artificial intelligence to give colonoscopy a major upgrade. Think of it as handing doctors a pair of AI-powered glasses that help them spot things they might otherwise miss.

That's why we're going to explore the critical role of AI in colonoscopy. Here's what AI brings to the table:

  • 🔍 Improved polyp detection rates
    • AI is like a tireless assistant, constantly scanning for even the tiniest polyps that human eyes might overlook.
  • 🎯 High sensitivity in distinguishing precancerous polyps
    • Not all polyps are created equal. AI can be trained to differentiate between the harmless ones and those that could become cancerous, helping doctors prioritize treatment.
  • 🖼️ Enhanced overall precision of colon evaluation
    • It's not just about spotting polyps. AI provides a comprehensive view of the colon, helping doctors make more accurate assessments.
  • 😀 No added risk to colonoscopy
    • Here's the best part - all these benefits come with zero additional risk to the patient. It's like getting a free upgrade on your health check!

🌏 Welcome to the World of Intelligent Colonoscopy!

Below are some of the research breakthroughs from our team that have shaped the field of intelligent colonoscopy:

🔴 Survey on Intelligent Colonoscopy Techniques

Note

📌 Make our community great again. If we missed your valuable work in the Google Sheet, please add it -- this project is a great platform to promote your work. You can also let us know by email (📮gepengai.ji@gmail.com) or open a pull request on GitHub. We will handle your request as soon as possible. Thank you for your active feedback.

We introduce "ColonSurvey" by investigating 63 colonoscopy datasets and 137 deep learning models for colonoscopic scene perception, all sourced from leading conferences and journals since 2015. The figure below gives a quick overview of our investigation; for a more detailed discussion, please refer to our survey paper: Frontiers in Intelligent Colonoscopy.


Our investigation of 63 colonoscopy datasets and 137 deep learning models in colonoscopy.

To better understand developments in this rapidly changing field and accelerate researchers’ progress, we are building a 📖paper reading list, which includes a number of AI-based scientific studies on colonoscopy imaging from the past 12 years.

[UPDATE ON OCT-14-2024] In detail, our online list covers the following highlights:

🔴 Highlight-A -- Image Analysis in Colonoscopy

🎯 A.1. PraNet -- Reverse Attention for Segmenting Camouflaged Lesions

Springer arXiv GitHub Repo stars


Photo for MICCAI Young Scientist Publication Impact Award 2025

Note

📌 Our Motivation of Reverse Attention -- "Attention Reorienting Mechanism"

The attention reorienting refers to the phenomenon where human visual attention does not remain fixed, but instead shifts periodically and rhythmically across different regions of the visual field.

This process is especially prominent when the brain encounters ambiguous, low-contrast, or uncertain information. Rather than committing to a single viewpoint, the visual system continuously revisits these regions to refine perception, reduce uncertainty, and improve recognition accuracy.

A classic study in visual neuroscience, "Orienting of Attention" (Posner, 1980), identifies several key aspects of this phenomenon:

  • Attention shifts even without eye movements (covert reorientation).
  • These shifts tend to occur periodically -- the brain scans → evaluates → rescans.
  • Ambiguous or important regions receive more frequent revisits.

In short, this mechanism is crucial for tasks such as detecting lesions, identifying boundaries, or resolving camouflaged objects. Our Reverse Attention (RA) mimics this periodic reorientation by refocusing on uncertain areas after processing more obvious regions.
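To make the RA idea concrete, here is a minimal NumPy sketch of the core re-weighting step -- the function name and shapes are illustrative assumptions, not the PraNet implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reverse_attention(features, coarse_logits):
    """Re-weight a feature map so that regions already covered by the coarse
    prediction are suppressed, pushing attention toward uncertain areas.
    Illustrative sketch only (NumPy, single image, no learned parameters)."""
    attn = 1.0 - sigmoid(coarse_logits)   # high weight where prediction is weak
    return features * attn[None, :, :]    # broadcast over the channel axis

# Toy example: a confidently detected blob is down-weighted, so a subsequent
# refinement stage would focus on the remaining, still-ambiguous regions.
feats = np.ones((4, 8, 8))                # (channels, H, W)
coarse = np.full((8, 8), -10.0)           # strongly "no lesion" everywhere...
coarse[2:6, 2:6] = 10.0                   # ...except one confidently found blob
refined = reverse_attention(feats, coarse)
```

In words: wherever the coarse map is confident, the reverse-attention weight is near zero, mimicking the "scan → evaluate → rescan" behavior described above.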

  • 📚 [Title] PraNet: Parallel Reverse Attention Network for Polyp Segmentation (Paper link & Code link)
  • 🏆 [Info] Accepted by MICCAI 2020 (Oral Presentation); cited over 2,100 times according to Google Scholar (as of Dec 2024). Received the Most Influential Application Paper Award at the Jittor Developer Conference 2021 and the MICCAI Young Scientist Publication Impact Award 2025.
  • 🏛️ [Authors] Deng-Ping Fan (🇦🇪 Inception Institute of Artificial Intelligence), Ge-Peng Ji (🇨🇳 Wuhan University), Tao Zhou (🇦🇪 Inception Institute of Artificial Intelligence), Geng Chen (🇦🇪 Inception Institute of Artificial Intelligence), Huazhu Fu (🇦🇪 Inception Institute of Artificial Intelligence), Jianbing Shen (🇦🇪 Inception Institute of Artificial Intelligence), Ling Shao (🇦🇪 Inception Institute of Artificial Intelligence)
  • 🌟 [Research Highlights]
    • The most influential and widely-used baseline for image-level polyp segmentation, shaping subsequent research directions in the field.
    • Inspired by human visual behavior when examining lesions and their surroundings, we introduce the Reverse Attention (RA) mechanism, enabling the model to refine its focus on ambiguous regions.
    • Delivers state-of-the-art segmentation performance across five challenging polyp datasets; PraNet also achieved 1st place in the MediaEval 2020 colonoscopy polyp segmentation challenge.
  • 📈 [Citation]
    @inproceedings{fan2020pranet,
      title={Pranet: Parallel reverse attention network for polyp segmentation},
      author={Fan, Deng-Ping and Ji, Ge-Peng and Zhou, Tao and Chen, Geng and Fu, Huazhu and Shen, Jianbing and Shao, Ling},
      booktitle={International conference on medical image computing and computer-assisted intervention},
      pages={263--273},
      year={2020},
      organization={Springer}
    }
    

🎯 A.2. PraNet v2 -- Adapting Reverse Attention for Multi-class Medical Segmentation

arXiv GitHub Repo stars

(Demo video: vis.mp4)
  • 📚 [Title] PraNet-V2: Dual-Supervised Reverse Attention for Medical Image Segmentation (Paper link & Code link)
  • 🏆 [Info] Accepted by Computational Visual Media 2025
  • 🏛️ [Authors] Bo-Cheng Hu (🇨🇳 Nankai University), Ge-Peng Ji (🇦🇺 Australian National University), Dian Shao (🇨🇳 Northwest Polytechnical University), Deng-Ping Fan* (🇨🇳 Nankai University)
  • 🌟 [Research Highlights]
    • We extend the Reverse Attention (RA) mechanism from binary polyp segmentation (i.e., our PraNet-V1, published at MICCAI 2020) to multi-class medical image segmentation.
    • We introduce a dual-supervised RA learning strategy, which incorporates both primary and auxiliary supervision to enhance feature representation.
    • PraNet-V2 achieves state-of-the-art performance on multiple challenging medical image segmentation datasets, demonstrating its versatility and effectiveness across various tasks.
  • 📈 [Citation]
    @article{hu2025pranet,
      title={PraNet-V2: Dual-Supervised Reverse Attention for Medical Image Segmentation},
      author={Hu, Bo-Cheng and Ji, Ge-Peng and Shao, Dian and Fan, Deng-Ping},
      journal={Computational Visual Media},
      year={2025}
    }
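
The dual-supervision idea can be illustrated with a toy objective: the foreground branch is supervised by the class mask while the reverse-attention branch is supervised by its complement. This is a hedged NumPy sketch with hypothetical names, not the paper's exact loss:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy on probability maps."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def dual_supervised_loss(fg_pred, bg_pred, gt):
    """Primary supervision on the foreground branch plus auxiliary supervision
    on the background (reverse-attention) branch against the inverted mask."""
    return bce(fg_pred, gt) + bce(bg_pred, 1.0 - gt)

# Toy check: a perfect foreground/background pair yields a near-zero loss.
gt = np.array([[1.0, 0.0], [0.0, 1.0]])
perfect = dual_supervised_loss(fg_pred=gt, bg_pred=1.0 - gt, gt=gt)
```

The auxiliary term forces the background branch to model what the foreground branch ignores, which is one way such a strategy can enrich feature representations.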
    

🎯 A.3. DGNet -- The Importance of Boundary Cues in Detecting Camouflaged Lesions

Springer Article arXiv PDF GitHub Repo stars

(Demo video: Deep Gradient Learning for Efficient Camouflaged Object Detection, MIR series)
  • 📚 [Title] Deep Gradient Learning for Efficient Camouflaged Object Detection (Paper link & Code link)
  • 🏆 [Info] Accepted by MIR 2023
  • 🏛️ [Authors] Ge-Peng Ji (🇦🇪 Inception Institute of Artificial Intelligence), Deng-Ping Fan (🇨🇭 ETH Zürich), Yu-Cheng Chou (🇨🇳 Wuhan University), Dengxin Dai (🇨🇭 ETH Zürich), Alexander Liniger (🇨🇭 ETH Zürich), Luc Van Gool (🇨🇭 ETH Zürich)
  • 🌟 [Research Highlights]
    • We propose DGNet, a novel framework that decouples context and texture learning via a Gradient-Induced Transition (GIT) module to exploit object gradient supervision.
    • We develop DGNet-S, a highly efficient model running at 80 fps with only 6.82% of the parameters found in cutting-edge competitors like JCSOD.
    • Achieves state-of-the-art performance on three COD benchmarks and demonstrates strong generalization to downstream tasks such as polyp segmentation and defect detection.
  • 📝 [Downstream Applications] (Ongoing developments, and scheduled release time by March 2026, stay tuned!)
    • Polyp Segmentation
    • Teeth Segmentation
  • 📈 [Citation]
    @article{ji2023gradient,
      title={Deep Gradient Learning for Efficient Camouflaged Object Detection},
      author={Ji, Ge-Peng and Fan, Deng-Ping and Chou, Yu-Cheng and Dai, Dengxin and Liniger, Alexander and Van Gool, Luc},
      journal={Machine Intelligence Research},
      pages={92-108},
      volume={20},
      issue={1},
      year={2023}
    } 
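
As a rough illustration of gradient supervision, an object-gradient target can be derived directly from the ground-truth mask; the helper below is a hypothetical sketch, not DGNet's actual supervision pipeline:

```python
import numpy as np

def object_gradient_map(mask):
    """Turn a binary object mask into an edge-intensity map by taking the
    magnitude of its spatial gradient -- the kind of boundary-aware auxiliary
    target a gradient-supervised detector can learn from (illustrative only)."""
    gy, gx = np.gradient(mask.astype(float))  # gradients along rows, then columns
    return np.hypot(gx, gy)                   # per-pixel gradient magnitude

# Toy example: the map is zero inside and outside the object,
# and non-zero only along the object boundary.
mask = np.zeros((8, 8))
mask[1:5, 1:5] = 1.0                  # a square "object"
grad = object_gradient_map(mask)
```

Supervising on such a map emphasizes boundary cues, which is exactly where camouflaged lesions are hardest to separate from their surroundings.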
    

🔴 Highlight-B -- Video Analysis in Colonoscopy

🎯 B.1. PNS-Net -- A Super-efficient Model for Video Polyp Segmentation

Springer Chapter arXiv GitHub Repo stars


Visualization -- 1st row: Input colonoscopy video frames; 2nd row: Ground-truth masks; 3rd row: Predicted masks by our PNS-Net.

  • 📚 [Title] Progressively Normalized Self-Attention Network for Video Polyp Segmentation (Paper link & Code link)
  • 🏆 [Info] Accepted by MICCAI 2021 and received MICCAI Student Travel Award
  • 🏛️ [Authors] Ge-Peng Ji (🇦🇪 Inception Institute of Artificial Intelligence), Yu-Cheng Chou (🇨🇳 Wuhan University), Deng-Ping Fan (🇦🇪 Inception Institute of Artificial Intelligence), Geng Chen (🇦🇪 Inception Institute of Artificial Intelligence), Huazhu Fu (🇦🇪 Inception Institute of Artificial Intelligence), Debesh Jha (🇳🇴 SimulaMet), Ling Shao (🇦🇪 Inception Institute of Artificial Intelligence)
  • 🌟 [Research Highlights]
    • Propose the progressively normalized self-attention (PNS) module to capture short- and long-term dependencies across colonoscopy frames.
    • Develop PNS-Net, a super-efficient VPS model that runs at ~140 fps on a single RTX 2080Ti GPU, making it highly practical for real-world endoscopy systems.
  • 📈 [Citation]
    @inproceedings{ji2021progressively,
      title={Progressively normalized self-attention network for video polyp segmentation},
      author={Ji, Ge-Peng and Chou, Yu-Cheng and Fan, Deng-Ping and Chen, Geng and Fu, Huazhu and Jha, Debesh and Shao, Ling},
      booktitle={International conference on medical image computing and computer-assisted intervention},
      pages={142--152},
      year={2021},
      organization={Springer}
    }
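
As background for how self-attention links frames, here is plain scaled dot-product self-attention over per-frame embeddings; the progressive normalization and channel grouping that make PNS efficient are deliberately omitted, and all names and shapes are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(frames):
    """Scaled dot-product self-attention across video-frame embeddings,
    letting every frame aggregate evidence from every other frame."""
    d = frames.shape[-1]
    scores = frames @ frames.T / np.sqrt(d)   # (T, T) frame-to-frame affinities
    return softmax(scores, axis=-1) @ frames  # each row: weighted frame mixture

rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 8))              # 5 frame embeddings of dimension 8
fused = temporal_self_attention(frames)       # same shape, temporally fused
```

In a video polyp segmenter, this kind of cross-frame mixing is what lets evidence from clear frames stabilize predictions on blurry or occluded ones.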
    

🎯 B.2. SUN-SEG -- A Large-scale Benchmark for Video Polyp Segmentation

Springer Article arXiv GitHub Repo stars


Sample gallery from our SUN-SEG dataset.

  • 📚 [Title] Video Polyp Segmentation: A Deep Learning Perspective (Paper link & Code link)
  • 🏆 [Info] Published in Machine Intelligence Research 2022
  • 🏛️ [Authors] Ge-Peng Ji (🇦🇺 Australian National University), Guobao Xiao (🇨🇳 Minjiang University), Yu-Cheng Chou (🇺🇸 Johns Hopkins University), Deng-Ping Fan (🇨🇭 ETH Zürich), Kai Zhao (🇺🇸 University of California, Los Angeles), Geng Chen (🇦🇪 Inception Institute of Artificial Intelligence), Luc Van Gool (🇨🇭 ETH Zürich)
  • 🌟 [Research Highlights]
    • We construct SUN-SEG, the largest high-quality, per-frame annotated VPS dataset to date, comprising 158,690 video frames with diverse expert labels, including object masks, boundaries, scribbles, polygons, and visual attributes.
    • This benchmark paves the way for future research in colonoscopy video analysis.
  • 📈 [Citation]
    @article{ji2022video,
      title={Video polyp segmentation: A deep learning perspective},
      author={Ji, Ge-Peng and Xiao, Guobao and Chou, Yu-Cheng and Fan, Deng-Ping and Zhao, Kai and Chen, Geng and Van Gool, Luc},
      journal={Machine Intelligence Research},
      volume={19},
      number={6},
      pages={531--549},
      year={2022},
      publisher={Springer}
    }
    

🔴 Highlight-C -- Multimodal Analysis in Colonoscopy

🎯 C.1. ColonINST & ColonGPT -- Pioneering Multimodal Intelligence in Colonoscopy

arXiv GitHub Repo stars


Details of our multimodal instruction tuning dataset, ColonINST. (a) Three sequential steps to create the instruction tuning dataset for multimodal research. (b) Numbers of colonoscopy images designated for training, validation, and testing purposes. (c) Data taxonomy of three-level categories. (d) A word cloud of the category distribution by name size. (e) Caption generation pipeline using the VL prompting mode of GPT-4V. (f) Numbers of human-machine dialogues created for four downstream tasks.

  • 📚 [Title] Frontiers in Intelligent Colonoscopy (Paper link & Code link)
  • 🏆 [Info] Accepted by Machine Intelligence Research 2026
  • 🏛️ [Authors] Ge-Peng Ji (🇦🇺 Australian National University), Jingyi Liu (🇯🇵 Keio University), Peng Xu (🇨🇳 Tsinghua University), Nick Barnes (🇦🇺 Australian National University), Fahad Shahbaz Khan (🇦🇪 MBZUAI), Salman Khan (🇦🇪 MBZUAI), Deng-Ping Fan* (🇨🇳 Nankai University)
  • 🌟 [Research Highlights] This year, we're taking intelligent colonoscopy to the next level -- a multimodal world -- with three groundbreaking initiatives:
    • 💥 Collecting a large-scale multimodal instruction tuning dataset ColonINST, featuring 300K+ colonoscopy images, 62 categories, 128K+ GPT-4V-generated medical captions, and 450K+ human-machine dialogues.
    • 💥 Developing the first multimodal language model ColonGPT that can handle conversational tasks based on user preferences.
    • Launching a multimodal benchmark to enable fair and rapid comparisons going forward.
  • 📈 [Citation]
    @article{ji2026frontiers,
      title={Frontiers in intelligent colonoscopy},
      author={Ji, Ge-Peng and Liu, Jingyi and Xu, Peng and Barnes, Nick and Khan, Fahad Shahbaz and Khan, Salman and Fan, Deng-Ping},
      journal={Machine Intelligence Research},
      year={2026}
    }
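
For a sense of what a multimodal instruction-tuning sample might look like, here is a hypothetical record in the general conversational style such datasets use -- the field names and values are illustrative assumptions, not the actual ColonINST schema:

```python
import json

# Hypothetical example record (illustrative field names, not ColonINST's schema).
record = {
    "image": "images/polyp_000123.jpg",
    "task": "referring_expression_generation",
    "conversations": [
        {"from": "human",
         "value": "<image>\nWhat abnormality is visible in this frame?"},
        {"from": "gpt",
         "value": "A sessile polyp is visible in the lower-left quadrant."},
    ],
}
print(json.dumps(record, indent=2))
```

Hundreds of thousands of such image-grounded dialogues are what allow a multimodal language model to answer free-form questions about colonoscopy frames.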
    

🎯 C.2. Colon-X -- Advancing colonoscopy from Multimodal Understanding to Clinical Reasoning

arXiv GitHub Repo stars


Research roadmap of our Colon-X project. Building upon the most comprehensive multimodal colonoscopy dataset (ColonVQA), we propel a pivotal transition in intelligent colonoscopy, evolving from multimodal understanding (ColonEval and ColonPert) to clinical reasoning (ColonReason and ColonR1). These efforts collectively illuminate the path to next-generation advances in clinical colonoscopy and broader medical applications.

  • 📚 [Title] Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning (arXiv Paper & Project page)
  • 🏆 [Info] Under review
  • 🏛️ [Authors] Ge-Peng Ji (🇦🇺 Australian National University), Jingyi Liu (🇨🇳 Nankai University), Deng-Ping Fan* (🇨🇳 Nankai University), Nick Barnes (🇦🇺 Australian National University)
  • 🌟 [Research Highlights] In this project, we are pushing the boundaries of intelligent colonoscopy by transitioning from multimodal understanding to clinical reasoning. Our key contributions are three-fold:
    • We introduce ColonVQA, the most extensive (1.1+ million VQA entries), category-rich (212,742 images across 76 clinically meaningful findings), and task-diverse (18 multimodal tasks organized within a five-level taxonomy) dataset ever built for multimodal colonoscopy analysis.
    • We characterize two multimodal understanding behaviors -- generalizability (ColonEval) and reliability (ColonPert) -- in colonoscopy tasks, and reveal that clinical outputs from leading MLLMs remain far from robust and trustworthy.
    • 💥 We propose a reasoning-focused solution. It includes ColonReason, a reasoning dataset annotated by a multi-expert debating pipeline, and ColonR1, an R1-styled model enhanced with task-adaptive rewards and gradient-stable optimization, setting a new cutting-edge baseline for colonoscopy analysis.
  • 📈 [Citation]
    @article{ji2025colon,
      title={Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning},
      author={Ji, Ge-Peng and Liu, Jingyi and Fan, Deng-Ping and Barnes, Nick},
      journal={arXiv preprint arXiv:2512.03667},
      year={2025}
    }
    

🧩 Collaborating towards the neXt frontier

We are actively looking for potential collaborators to help push this community forward -- especially hospitals and medical institutions that can provide diverse, real-world clinical colonoscopy data (e.g., data across different devices, modalities, patient populations, and clinical workflows). If you're interested in contributing or partnering with us, we'd be very happy to connect.

We’re still on the journey toward building truly intelligent colonoscopy systems, and this project is very much under active development. We warmly welcome any feedback, ideas, or suggestions that can help shape its future.

For any inquiries or thoughts you’d like to share, feel free to reach out to us at 📮gepengai.ji@gmail.com

💬 Discussion Forum

This is just the start of building our Roman Empire 🔱. We're on a mission to make colonoscopies smarter, more accurate, and ultimately able to save more lives. Want to join us on this exciting journey? Head over to our AI4Colonoscopy Discussion Forum.
