Fully Automated Segmentation of Head CT Neuroanatomy Using Deep Learning
Jason Cai, Kenneth Philbrick, Zeynettin Akkus, Bradley Erickson
Radiology Informatics Lab, Mayo Clinic, Rochester MN
Abstract
Semantic segmentation of the brain on CT can assist in diagnosis (1-7) and treatment planning (8,9). We present a 2D U-Net that simultaneously segments 16 intracranial structures from head CT. Our model generalized to external scans from the RSNA Hemorrhage Detection Challenge (10), as well as to scans demonstrating idiopathic normal pressure hydrocephalus (iNPH). Overall Dice coefficients were comparable to those of expert annotations and higher than those of existing segmentation methods. Although the training dataset consisted of noncontrast studies, our model handled contrast-enhanced studies equally well upon visual inspection. Developers can leverage transfer learning and fine-tuning to further optimize the model for their specific needs.
Dataset
Primary Dataset (Training, Validation, and Testing)
· 62 normal non-contrast head CTs. Mean patient age: 73.4 years (range: 27-95).
· Training: 40 volumes; validation: 10 volumes; testing: 12 volumes.
Secondary Datasets (Testing Only)
· 12 non-contrast head CTs demonstrating iNPH. Mean patient age: 74.3 years (range: 60-84).
· 30 normal non-contrast head CTs from the RSNA Hemorrhage Detection Challenge (10).
Dataset Annotation
All slices were annotated using RIL-Contour (11) by a team of trained analysts under the supervision of a neuroradiologist.
Ground Truth Masks: From each test volume, 3 observers independently segmented the same 5 slices (capturing all 16 structures). From these annotations, a set of multi-rater consensus labels was constructed using STAPLE (12).
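To illustrate what a multi-rater consensus label is, the sketch below fuses three observers' masks by per-pixel majority vote. This is a deliberately simpler stand-in and not STAPLE: STAPLE uses expectation-maximization to estimate each rater's performance and weights their votes accordingly, whereas a majority vote treats all raters equally. The function name, the toy masks, and the tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter

def majority_vote(annotations):
    """Per-pixel majority vote over several raters' label masks.

    annotations: list of 2D masks (lists of rows), one per rater, all of the
    same shape; each entry is an integer structure label.
    Ties go to the smallest label (an arbitrary, hypothetical rule).
    """
    consensus = []
    for rows in zip(*annotations):        # the same row from every rater
        out_row = []
        for votes in zip(*rows):          # the same pixel from every rater
            counts = Counter(votes)
            best = max(counts.values())
            out_row.append(min(lab for lab, c in counts.items() if c == best))
        consensus.append(out_row)
    return consensus

# Toy 2x2 masks from three hypothetical observers.
rater_a = [[0, 1], [2, 2]]
rater_b = [[0, 1], [2, 0]]
rater_c = [[0, 3], [1, 0]]
consensus = majority_vote([rater_a, rater_b, rater_c])
print(consensus)
```

The example uses nested lists to stay dependency-free; real masks would more likely be held in array types, and a true STAPLE implementation would come from an image-analysis toolkit.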
To protect patient privacy, the datasets are not available for download.
| DICOM information | Primary Dataset: Training & Validation | Primary Dataset: Test (12 volumes) | Secondary Dataset: Test (12 volumes) |
|---|---|---|---|
| Manufacturer and model | | | |
| · GE Discovery CT750 HD | 11 | 5 | - |
| · GE Optima CT660 | 2 | 1 | - |
| · Siemens Sensation 64 | 4 | - | 3 |
| · Siemens SOMATOM Definition AS | 1 | - | - |
| · Siemens SOMATOM Definition Edge | 2 | - | 5 |
| · Siemens SOMATOM Definition Flash | 27 | 6 | 3 |
| · Siemens SOMATOM Force | 1 | - | - |
| · Toshiba Aquilion | 1 | - | - |
| · Toshiba Aquilion Prime SP | 1 | - | - |
| · Toshiba Aquilion ONE | - | - | 1 |
| Slice Thickness (mm) | | | |
| · 1.5 | 1 | - | - |
| · 3 | 1 | - | 1 |
| · 3.75 | 12 | 6 | - |
| · 4 | 20 | 6 | - |
| · 5 | 16 | - | 11 |
| Tube Voltage (kVp) | 120 (all scans) | 120 (all scans) | 120 (all scans) |
| Tube Current (mA) | Mean: 334 | Mean: 400 | Mean: 213 |
| Pixel Spacing (mm) | Mean: 0.46 | Mean: 0.43 | Mean: 0.48 |
| Image Dimensions | 512x512 (all scans) | 512x512 (all scans) | 512x512 (all scans) |
Model (13)
Sample Images
Sample images from the primary test dataset.
Sample images from the iNPH dataset. In our paper, the iNPH dataset was used solely for testing. We additionally trained our model on these examinations and made its weights available separately on our GitHub page.
Test dataset workflow: DICE - Dice coefficients, reported below; VOL - differences in structure volume, reported in our paper.
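As a rough illustration of the VOL measurement, a structure's volume can be estimated from a label mask by counting its voxels and multiplying by the physical voxel size. The sketch below assumes contiguous slices whose spacing equals the slice thickness, and it is not necessarily how the volumes in the paper were computed. The function name, the label index 5, and the toy volume are hypothetical; the 0.46 mm spacing and 4 mm thickness are only example values of the kind listed in the DICOM table.

```python
def structure_volume_mm3(volume_mask, label, pixel_spacing_mm, slice_thickness_mm):
    """Estimate the volume of one labeled structure in cubic millimetres.

    volume_mask: list of 2D slices, each a list of rows of integer labels.
    pixel_spacing_mm: (row_spacing, column_spacing) in mm.
    Assumes contiguous slices, so each voxel is spacing x spacing x thickness.
    """
    voxel_count = sum(
        pixel == label
        for slice_mask in volume_mask
        for row in slice_mask
        for pixel in row
    )
    voxel_volume = pixel_spacing_mm[0] * pixel_spacing_mm[1] * slice_thickness_mm
    return voxel_count * voxel_volume

# Toy two-slice volume; label 5 is a hypothetical structure index.
toy_volume = [
    [[0, 5], [5, 0]],
    [[0, 5], [0, 0]],
]
vol = structure_volume_mm3(toy_volume, 5, (0.46, 0.46), 4.0)
print(f"{vol:.4f} mm^3")
```

A volume difference between the model and a reference would then be the difference of two such estimates computed on the predicted and reference masks of the same scan.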
Results
Box and whisker plot comparing the model's predictions with the observers' annotations, using Ground Truth Masks as a reference. Ground Truth Masks: From each test volume, 3 observers independently segmented the same 5 slices (capturing all 16 structures, n=59 total slices). From these annotations, a set of multi-rater consensus labels was constructed using STAPLE (12).
Box and whisker plot comparing the model's performance between the primary and secondary test datasets. Red Boxes: Dice coefficients of the model on the primary test dataset vs. Dice coefficients of the model on the iNPH dataset. Results from these datasets were analyzed together because both contained fully annotated volumes.
Note: The model could not consistently identify the central sulcus in iNPH patients because ventricular enlargement severely distorted its appearance. Two volumes were excluded because the central sulcus could not be identified manually either.
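For readers unfamiliar with the metric used in these plots, the Dice coefficient for one structure is twice the overlap between the predicted and reference masks divided by the sum of their sizes. The following is a minimal sketch of that standard definition applied per label to a multi-class mask; the function name, the label numbers, and the toy masks are illustrative assumptions, and returning None when a structure is absent from both masks is just one possible convention.

```python
def dice_coefficient(pred, truth, label):
    """Dice = 2 * |P and T| / (|P| + |T|) for a single structure label.

    pred, truth: 2D masks (lists of rows) of integer labels, same shape.
    Returns None when the label is absent from both masks (Dice undefined).
    """
    p_count = t_count = overlap = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            in_p = p == label
            in_t = t == label
            p_count += in_p
            t_count += in_t
            overlap += in_p and in_t
    if p_count + t_count == 0:
        return None
    return 2 * overlap / (p_count + t_count)

# Toy prediction and reference masks with two hypothetical structures (1 and 2).
pred = [[1, 1, 0],
        [2, 2, 0]]
truth = [[1, 0, 0],
         [2, 2, 2]]
scores = {label: dice_coefficient(pred, truth, label) for label in (1, 2, 3)}
print(scores)
```

Computing the score per structure, rather than over the whole mask, keeps large structures from masking poor performance on small ones.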
Citation
JC Cai, Z Akkus, KA Philbrick, A Boonrod, S Hoodeshenas, AD Weston, P Rouzrokh, GM Conte, A Zeinoddini, DC Vogelsang, Q Huang, BJ Erickson
References
1. Frisoni GB, Geroldi C, Beltramello A, Bianchetti A, Binetti G, Bordiga G, et al. Radial Width of the Temporal Horn: A Sensitive Measure in Alzheimer Disease. Am J Neuroradiol [Internet]. 2002;23(1):35. Available from: http://www.ajnr.org/content/23/1/35.abstract