This repository contains the official code for **Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections**.
- Introduction
- Results
- Visualization
- Getting Started
- Installation
- Datasets
- Inference
- Training
- Custom Dataset
## Introduction

Anomaly detection involves identifying deviations from normal data distributions and is critical in fields such as medical diagnostics and industrial defect detection. Traditional AD methods typically require the availability of normal training samples; however, this assumption is not always feasible. Recently, the rich pretraining knowledge of CLIP has shown promising zero-shot generalization in detecting anomalies without the need for training samples from target domains. However, CLIP's coarse-grained image-text alignment limits localization and detection performance for fine-grained anomalies due to: (1) spatial misalignment, and (2) the limited sensitivity of global features to local anomalous patterns. In this paper, we propose Crane, which addresses these limitations through context-guided prompt learning and attention refinement.
## Visualization

Samples of zero-shot anomaly localization by $Crane^+$ for both the main setting and the medical setting (discussed in Appendix E). The complete set of visualizations can be found in the appendix of the paper.
## Getting Started

To reproduce the results, follow the instructions below to run inference and training.
## Installation

All required libraries, including the correct PyTorch version, are specified in `environment.yaml`. Running `setup.sh` will automatically create the environment and install all dependencies:
```bash
git clone https://github.com/AlirezaSalehy/Crane.git && cd Crane
bash setup.sh
conda activate crane_env
```
The required CLIP and DINO checkpoints are downloaded automatically by the code and stored in `~/.cache`. However, the ViT-B SAM checkpoint must be downloaded manually: please download `sam_vit_b_01ec64.pth` from the official Segment Anything repository and place it at `~/.cache/sam/sam_vit_b_01ec64.pth`.
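For example (a minimal sketch; the URL below is the standard download link published by the Segment Anything project, so verify it against the official repository if the download fails):

```bash
# Create the expected cache directory and fetch the SAM ViT-B checkpoint.
# URL assumed from the official Segment Anything release.
mkdir -p ~/.cache/sam
wget -O ~/.cache/sam/sam_vit_b_01ec64.pth \
    https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```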
## Datasets

You can download the datasets from their official sources and use the utilities in `datasets/generate_dataset_json/` to generate a compatible `meta.json`. Alternatively, the AdaCLIP repository provides the datasets in an already compatible format. Place all datasets under `DATASETS_ROOT`, which is defined in `./__init__.py`.
## Inference

The checkpoints for our trained "default" model are available in the `checkpoints` directory. After installing the required libraries, reproduce the results by running:
```bash
bash test.sh "0"
```
Here, "0"
specifies the CUDA device ID(s).
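To use multiple GPUs, more IDs can presumably be passed the same way (a sketch; the comma-separated format is an assumption based on the `--devices` option described further below):

```bash
# Assumed multi-GPU form; the comma-separated ID format is an assumption.
bash test.sh "0,1"
```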
## Training

To train new checkpoints and test on the medical and industrial datasets using the default setting, simply run:
```bash
bash reproduce.sh new_model 0
```
where `new_model` and `0` specify the checkpoint name and the available CUDA device ID, respectively.
## Custom Dataset

You can easily use your custom dataset with our model by following the instructions below.
Your dataset must either include a `meta.json` file at the root directory, or be organized so that one can be generated automatically. The `meta.json` should follow this format:
- A dictionary with `"train"` and `"test"` at the highest level
- Each section maps class names to a list of samples
- Each sample includes:
  - `img_path`: path to the image, relative to the root directory
  - `mask_path`: path to the mask, relative to the root directory (empty for normal samples)
  - `cls_name`: class name
  - `specie_name`: subclass or condition (e.g., `"good"`, `"fault1"`)
  - `anomaly`: anomaly label; 0 (normal) or 1 (anomalous)
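A minimal illustrative `meta.json` following this format (the class and file names are hypothetical):

```json
{
  "train": {
    "c1": [
      {"img_path": "train/c1/good/000.png", "mask_path": "", "cls_name": "c1", "specie_name": "good", "anomaly": 0}
    ]
  },
  "test": {
    "c1": [
      {"img_path": "test/c1/good/000.png", "mask_path": "", "cls_name": "c1", "specie_name": "good", "anomaly": 0},
      {"img_path": "test/c1/fault1/001.png", "mask_path": "test/c1/masks/001.png", "cls_name": "c1", "specie_name": "fault1", "anomaly": 1}
    ]
  }
}
```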
If your dataset does not include the required `meta.json`, you can generate it automatically by organizing your data as shown below and running `datasets/generate_dataset_json/custom_dataset.py`:
```
datasets/your_dataset/
├── train/
│   ├── c1/
│   │   └── good/
│   │       └── <NAME>.png
│   └── c2/
│       └── good/
│           └── <NAME>.png
├── test/
│   ├── c1/
│   │   ├── good/
│   │   │   └── <NAME>.png
│   │   ├── fault1/
│   │   │   └── <NAME>.png
│   │   ├── fault2/
│   │   │   └── <NAME>.png
│   │   └── masks/
│   │       └── <NAME>.png
│   └── c2/
│       ├── good/
...     ...
```
Once organized, run the script to generate a `meta.json` automatically at the dataset root.
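For example (a sketch; the script may expect arguments such as the dataset name or path, so check its argument parsing before running):

```bash
# Assumed invocation; inspect datasets/generate_dataset_json/custom_dataset.py
# for any required arguments (e.g., the dataset name or root path).
python datasets/generate_dataset_json/custom_dataset.py
```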
Then place your dataset under `DATASETS_ROOT`, specified in `datasets/generate_dataset_json/__init__.py`, and run the inference:
```bash
python test.py --dataset YOUR_DATASET --model_name default --epoch 5
```
- For a fair inference-throughput comparison with other methods, the default setting uses a single GPU and the original AUPRO implementation. The enhancements below can optionally be enabled.
- Because the original AUPRO implementation is unusually slow and no good alternative was available, I made a few optimizations and tested them against the original; the results are available in FasterAUPRO. The optimized version computes AUPRO 3× to 38× faster, saving you hours in performance evaluation.
- The `test.py` implementation supports multi-GPU execution; by specifying more CUDA IDs with `--devices`, you can benefit from a further speedup, as in the sketch below.
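For example (a sketch; the dataset name is a placeholder, and the comma-separated ID format is an assumption, so check `test.py`'s argument parsing):

```bash
# Hypothetical multi-GPU run; --devices is the flag named above, but the
# exact ID format (e.g., comma-separated) is an assumption.
python test.py --dataset YOUR_DATASET --model_name default --epoch 5 --devices 0,1
```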
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this project helpful for your research, please consider citing the following BibTeX entry.
BibTeX:
```bibtex
@article{salehi2025crane,
  title={Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections},
  author={Salehi, Alireza and Salehi, Mohammadreza and Hosseini, Reshad and Snoek, Cees GM and Yamada, Makoto and Sabokrou, Mohammad},
  journal={arXiv preprint arXiv:2504.11055},
  year={2025}
}
```
This project builds upon several open-source works; we greatly appreciate the authors for their contributions and open-source support.
For questions or collaborations, please contact alireza99salehy@gmail.com.