The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia. The dataset originally contained only classification labels, so we collected point-level and segmentation labels to provide a more comprehensive fish analysis benchmark.
Videos for DeepFish were collected from 20 habitats in remote coastal marine environments of tropical Australia. They were acquired using cameras mounted on metal frames and deployed over the side of a vessel. The cameras were lowered to the seabed and left to record the natural fish community while the vessel maintained a distance of 100 m. The depth and map coordinates of the cameras were recorded using an acoustic depth sounder and a GPS, respectively. Recording was carried out during daylight hours and in periods of relatively low turbidity. The video clips were captured in full HD resolution (1920 × 1080 pixels) with a digital camera. In total, 39,766 video frames were taken.
python setup.py install OR pip install -e .
pip install -r requirements.txt
pip install git+https://github.com/ElementAI/LCFCN
- Download the DeepFish dataset from here
python scripts/train_single_image.py -e loc -d ${PATH_TO_DATASET}
This outputs the following image
python scripts/train_single_image.py -e seg -d ${PATH_TO_DATASET}
This outputs the following image
Run the following command to reproduce the experiments in the paper:
python trainval.py -e ${TASK} -sb ${SAVEDIR_BASE} -d ${DATADIR} -r 1
The variables (${...}) can be substituted with the following values:
- TASK: loc, seg, clf, reg
- SAVEDIR_BASE: Absolute path to where results will be saved
- DATADIR: Absolute path containing the downloaded datasets
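As a rough illustration of the substitution described above, the following sketch builds the trainval.py command for each task. The helper name build_trainval_command and the example paths are hypothetical and not part of the repository; the flags, including -r 1, are copied verbatim from the command shown above.

```python
import shlex

# Task names listed in the README for the TASK variable.
VALID_TASKS = ("loc", "seg", "clf", "reg")


def build_trainval_command(task, savedir_base, datadir):
    """Return the trainval.py invocation as an argument list.

    Hypothetical helper: it only mirrors the README command line and
    does not inspect what trainval.py actually accepts.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {VALID_TASKS}")
    return [
        "python", "trainval.py",
        "-e", task,
        "-sb", savedir_base,
        "-d", datadir,
        "-r", "1",
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with real absolute paths.
    for task in VALID_TASKS:
        cmd = build_trainval_command(task, "/abs/path/to/results", "/abs/path/to/DeepFish")
        print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run to launch an experiment; printing it first is a cheap way to check the paths before starting a long run.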
Experiment hyperparameters are defined in exp_configs.py
To run a trained model on your own data (arbitrary images or a video you
recorded) without the DeepFish dataset structure (no CSV files or masks needed),
use scripts/predict_simple.py. It runs on GPU or CPU automatically.
python scripts/predict_simple.py -i ${PATH_TO_IMAGES} -m ${PATH_TO_MODEL}.pth -t loc -o output/
For video input, frames are extracted automatically. Use --frame_stride to sample every Nth
frame (e.g. one frame every half second for a 30 fps clip with --frame_stride 15):
python scripts/predict_simple.py --video ${PATH_TO_VIDEO}.mp4 -m ${PATH_TO_MODEL}.pth -t loc -o output/ --frame_stride 15
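The arithmetic behind --frame_stride can be sketched as follows. Both function names are hypothetical, and the sketch assumes that sampling starts at frame 0 and keeps every Nth frame after it; the actual behaviour of predict_simple.py may differ in detail.

```python
def stride_for_interval(fps, seconds_between_samples):
    """Return the frame stride that samples roughly one frame per interval.

    The result is rounded to the nearest whole frame and is never below 1.
    """
    return max(1, round(fps * seconds_between_samples))


def sampled_frame_indices(total_frames, frame_stride):
    """Return the indices kept when taking every Nth frame from frame 0.

    Assumption: sampling starts at the first frame of the clip.
    """
    if frame_stride < 1:
        raise ValueError("frame_stride must be at least 1")
    return list(range(0, total_frames, frame_stride))


if __name__ == "__main__":
    stride = stride_for_interval(fps=30, seconds_between_samples=0.5)
    print(stride)                             # 15
    print(sampled_frame_indices(60, stride))  # [0, 15, 30, 45]
```

For the 30 fps example above, a stride of 15 keeps two frames per second of footage, so a two-second clip of 60 frames yields four sampled frames under this assumption.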
-t/--task can be loc (localization/counting via points), seg (segmentation),
clf (fish / no-fish classification), or reg (count regression) — it must match
the task the model was trained for.
Results are written to the output directory:
output/
predictions.json # Per-image predictions (counts, points, etc.)
frames/ # Extracted video frames (only when --video is used)
visualizations/ # Visualization images (for loc and seg tasks)
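To inspect the results programmatically, something along these lines may be enough. The schema of predictions.json is not documented here, so this sketch assumes nothing about its fields and only reports the top-level structure; load_predictions and describe are hypothetical helper names.

```python
import json
from pathlib import Path


def load_predictions(output_dir):
    """Parse predictions.json from the given output directory."""
    path = Path(output_dir) / "predictions.json"
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


def describe(predictions):
    """Summarize the top-level JSON structure without assuming a schema."""
    if isinstance(predictions, dict):
        return f"dict with {len(predictions)} keys"
    if isinstance(predictions, list):
        return f"list with {len(predictions)} entries"
    return type(predictions).__name__


if __name__ == "__main__":
    out = Path("output")  # same directory passed to -o above
    if (out / "predictions.json").exists():
        print(describe(load_predictions(out)))
```

Once the actual field names are known from a real run, per-image counts or points can be read from the parsed object directly.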
scripts/predict.py is a separate tool that visualizes predictions against the DeepFish dataset validation split (it requires the dataset structure and its ground-truth labels). For new/unlabelled data, use predict_simple.py above.
If you use the DeepFish dataset in your work, please cite it as:
@article{saleh2020realistic,
title={A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis},
author={Saleh, Alzayat and Laradji, Issam H and Konovalov, Dmitry A and Bradley, Michael and Vazquez, David and Sheaves, Marcus},
journal={Scientific Reports},
volume={10},
number={1},
pages={14671},
year={2020},
publisher={Nature Publishing Group UK London},
doi={10.1038/s41598-020-71639-x}
}