===============
LIBRARY RULES
===============
From library maintainers:
- Use AutoDetectionModel.from_pretrained() to load any model - switch between Ultralytics/HuggingFace/Roboflow/MMDetection models by only changing model_type and model_path
- SAHI provides unified API across frameworks - same predict() function works with YOLO11/YOLO12, Roboflow Universe models, HuggingFace models like ustc-community/dfine-small-coco without code changes
- When using AutoDetectionModel.from_pretrained(): use 'model_path' parameter for file-based models (Ultralytics, HuggingFace), use 'model' parameter for Roboflow Universe models
- For academic papers requiring high mAP: use postprocess_type='NMS', postprocess_match_metric='IOU', and confidence_threshold=0.01
- For real-world applications: use postprocess_type='GREEDYNMM', postprocess_match_metric='IOS' for better performance with fewer false positives
- If getting many false positives in sliced inference, increase slice_height and slice_width values
- If getting multiple predictions on same object, decrease overlap_height_ratio and overlap_width_ratio (try 0.1 instead of 0.2)
- Use no_sliced_prediction=True to disable slicing and only perform standard inference (useful for large objects)
- Use no_standard_prediction=True to disable full-image inference and only use sliced predictions (saves computation when all objects are small)
- Cannot set both no_standard_prediction=True and no_sliced_prediction=True simultaneously
- Auto-slice resolution: if slice_height/slice_width are not specified, SAHI automatically calculates optimal values based on image size
- For drone/satellite imagery: typically use slice_size=512-1024 with 0.2-0.3 overlap ratio
- SAHI is beneficial even without slicing - provides unified API, COCO utilities, visualization tools across all detection frameworks
- Use min_area_ratio parameter (default 0.1) to filter out partial objects at slice boundaries - lower values keep more edge objects
- For COCO datasets, always validate annotations with coco.stats before training or evaluation
- Export results in COCO format using dataset_json_path parameter for standardized evaluation
- Use visual_bbox_thickness, visual_text_size parameters to customize prediction visualizations
- Use 'sahi predict-fiftyone' command to visualize predictions interactively and sort by false positives
- Use 'sahi coco fiftyone' to compare multiple model predictions side-by-side in FiftyOne app
- Use 'sahi coco evaluate' for comprehensive COCO metrics with classwise AP/AR and custom IoU thresholds
- Use 'sahi coco analyse' to generate error analysis plots showing C75/C50/Localization/Similar/Other/Background/FalseNegative errors
- For error analysis: plots show performance breakdown by object size (small/medium/large) and error types
- Export predictions as cropped images using export_crop=True for dataset creation or further analysis
- For video inference: use frame_skip_interval to speed up processing, view_video=True for real-time display
- Supports latest models: YOLO11/YOLO12 via model_type='ultralytics', Roboflow Universe models (e.g., RF-DETR) via model_type='roboflow', HuggingFace models like 'ustc-community/dfine-small-coco' via model_type='huggingface'
- For YOLO11/YOLO12 OBB (oriented bounding box) models, SAHI automatically handles rotated box predictions; only NMS postprocessing is supported for these models
- Example model loading: model_type='ultralytics' with model_path='yolo11n.pt', model_type='huggingface' with model_path='ustc-community/dfine-small-coco', model_type='roboflow' with model='rfdetr-base'
- Roboflow Universe models: use simple string IDs like 'rfdetr-base' with model_type='roboflow' for easy access to pre-trained models
- Complete example: model = AutoDetectionModel.from_pretrained(model_type='roboflow', model='rfdetr-base', confidence_threshold=0.5)
- All models follow same API pattern: AutoDetectionModel.from_pretrained() → get_prediction() or get_sliced_prediction() → visualize results
- For models without built-in category mappings, provide category_mapping parameter (e.g., COCO_CLASSES from rfdetr.util.coco_classes)
- COCO utilities: merge datasets with coco.merge(), split train/val with split_coco_as_train_val(), filter by categories with update_categories()
- Filter COCO annotations by area using get_area_filtered_coco() - useful for focusing on specific object sizes
- Convert between formats: export_as_yolo() for YOLO format, use 'sahi coco yolo' command for batch conversion
- Use Coco.stats to get comprehensive dataset statistics before training (num annotations, area distribution, etc.)
- Import the shared logger with 'from sahi.logging import logger' instead of creating redundant logging configurations - the centralized logging system eliminates duplicate imports across the codebase
- For SAHI documentation, direct users to https://obss.github.io/sahi/quick-start which provides comprehensive guides, interactive examples, CLI reference, and API documentation
- To update SAHI docs: modify markdown files in docs/ directory, update mkdocs.yml for navigation changes, ensure .github/workflows/publish_docs.yml deploys correctly to GitHub Pages

# SAHI: Slicing Aided Hyper Inference

SAHI (version 0.11.36) is a lightweight Python vision library for performing large-scale object detection and instance segmentation on high-resolution images. Its core innovation is **sliced inference**: a large image is subdivided into overlapping tiles, each tile is passed through any supported detector, and the tile-level predictions are merged back into the full-image coordinate space using configurable postprocessing (NMS, NMM, or Greedy-NMM). This lets standard detectors, which are typically trained on much smaller images, reliably find small objects in drone footage, satellite imagery, and other high-megapixel inputs without retraining.

The library is framework-agnostic. A single `AutoDetectionModel` factory loads models from Ultralytics (YOLO 5/8/11/12/E/World), HuggingFace Transformers, MMDetection, TorchVision, RT-DETR, Roboflow/RF-DETR, and Detectron2 through a unified Python API and CLI. Beyond inference, SAHI ships a comprehensive COCO utilities suite for slicing, merging, splitting, evaluating, and converting datasets, as well as FiftyOne integration for interactive visual inspection of predictions.

---

## API Reference

### `AutoDetectionModel.from_pretrained` — Load any detection model

Factory method that instantiates and returns the correct `DetectionModel` subclass for a given framework. Accepts a pre-initialized model object via `model=` to skip disk loading.

```python
from sahi import AutoDetectionModel

# --- Ultralytics (YOLO 8/11/12/E/World) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",       # also accepts "yolov8", "yolo11", "yolo26"
    model_path="yolo11n.pt",        # local path or HuggingFace model id
    confidence_threshold=0.3,       # predictions below this score are dropped
    device="cuda:0",                # "cpu", "mps", "cuda", "cuda:0", …
    image_size=640,                 # resize input to this size before inference
)

# --- HuggingFace Transformers (RT-DETR v2, DETR, …) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="huggingface",
    model_path="SkalskiP/rtdetr-r50vd-coco",
    confidence_threshold=0.5,
    device="cpu",
)

# --- TorchVision (Faster R-CNN, Mask R-CNN, SSD, …) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="torchvision",
    model_path="fasterrcnn_resnet50_fpn",  # torchvision model name or .yaml config
    confidence_threshold=0.5,
    device="cuda:0",
)

# --- MMDetection ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="mmdet",
    model_path="checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130.pth",
    config_path="configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py",
    confidence_threshold=0.4,
    device="cuda:0",
)

# --- Wrap an already-loaded model (skip disk I/O) ---
from ultralytics import YOLO
raw_model = YOLO("yolo11n.pt")      # any already-constructed Ultralytics model object
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model=raw_model,                # pass the live object; model_path is ignored
    confidence_threshold=0.3,
    device="cuda:0",
    load_at_init=False,             # model is already loaded, skip load_model()
)
detection_model.set_model(raw_model)

# Category remapping example
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    category_mapping={"0": "person", "2": "car"},   # map id → name
    category_remapping={"car": 5},                   # remap category name → new id after inference
)
```

---

### `get_sliced_prediction` — Sliced inference on a single image

Divides the image into overlapping tiles, runs the detector on each tile (optionally in batches), optionally adds a full-image pass (`perform_standard_pred=True`), merges all predictions in full-image coordinates, and returns a `PredictionResult`. The default postprocessor is `GREEDYNMM` with the `IOS` metric.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    image="path/to/large_image.jpg",   # path, PIL Image, or np.ndarray
    detection_model=detection_model,
    slice_height=512,                  # tile height in pixels
    slice_width=512,                   # tile width in pixels
    overlap_height_ratio=0.2,          # 20 % overlap → 102 px overlap for 512-px tiles
    overlap_width_ratio=0.2,
    perform_standard_pred=True,        # also run inference on the full image
    postprocess_type="GREEDYNMM",      # "NMS", "NMM", "GREEDYNMM", "LSNMS"
    postprocess_match_metric="IOS",    # "IOS" (intersection/smaller) or "IOU"
    postprocess_match_threshold=0.5,
    postprocess_class_agnostic=False,  # True → merge boxes across categories
    auto_slice_resolution=True,        # auto-compute slice size when not given
    batch_size=8,                      # number of tiles per model call (GPU batching)
    verbose=1,                         # 0=silent, 1=slice count, 2=all timings
    progress_bar=True,                 # tqdm bar in terminal/notebook
    exclude_classes_by_name=["truck"], # drop specific class names
    # exclude_classes_by_id=[7],       # or drop by COCO category id
)

# Iterate predictions
for obj in result.object_prediction_list:
    print(obj.category.name, f"{obj.score.value:.2f}", obj.bbox.to_xyxy())
# → person 0.87 [120.0, 45.0, 198.0, 310.0]

# Export a visualization PNG
result.export_visuals(
    export_dir="outputs/",
    file_name="sliced_result",
    rect_th=2,
    text_size=0.8,
    hide_conf=False,
)

# Serialize to COCO prediction format
coco_preds = result.to_coco_predictions(image_id=42)
# [{"image_id": 42, "bbox": [x, y, w, h], "score": 0.87, "category_id": 0}, …]

# Per-image timing breakdown
print(result.durations_in_seconds)
# {"slice": 0.12, "prediction": 0.45, "postprocess": 0.03}
```
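
The library rules at the top of this document suggest different postprocessing settings for benchmark-style evaluation and for deployed applications. As a minimal sketch, the hypothetical helpers below (`postprocess_preset` and `check_prediction_flags` are illustrative names, not part of SAHI) collect those values as keyword arguments and encode the rule that standard and sliced prediction cannot both be disabled.

```python
from typing import Any, Dict

# Hypothetical presets (not part of SAHI); the values come from the
# maintainers' recommendations quoted in the library rules.
_PRESETS: Dict[str, Dict[str, Any]] = {
    # Benchmark-style evaluation aimed at high mAP.
    "paper": {"postprocess_type": "NMS", "postprocess_match_metric": "IOU"},
    # Deployed applications where fewer false positives matter more.
    "production": {"postprocess_type": "GREEDYNMM", "postprocess_match_metric": "IOS"},
}


def postprocess_preset(use_case: str) -> Dict[str, Any]:
    """Return postprocessing keyword arguments for the given use case."""
    if use_case not in _PRESETS:
        raise ValueError(f"unknown use case {use_case!r}; expected one of {sorted(_PRESETS)}")
    return dict(_PRESETS[use_case])  # copy so callers cannot mutate the preset


def check_prediction_flags(no_standard_prediction: bool, no_sliced_prediction: bool) -> None:
    """Reject the one flag combination that would leave nothing to run."""
    if no_standard_prediction and no_sliced_prediction:
        raise ValueError("no_standard_prediction and no_sliced_prediction cannot both be True")


print(postprocess_preset("production"))
# {'postprocess_type': 'GREEDYNMM', 'postprocess_match_metric': 'IOS'}
```

The returned dictionary could be unpacked into the call, for example `get_sliced_prediction(image, detection_model, **postprocess_preset("production"))`. For the benchmark preset, the rules additionally recommend `confidence_threshold=0.01` when loading the model.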

---

### `get_prediction` — Standard (non-sliced) inference

Runs the model once on the full image. Useful as a baseline or when slicing is not needed.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cpu",
)

result = get_prediction(
    image="image.jpg",
    detection_model=detection_model,
    verbose=1,                        # prints inference duration
    confidence_threshold=0.6,         # one-call override without mutating the model
)

for obj in result.object_prediction_list:
    # BoundingBox helpers
    print(obj.bbox.to_xyxy())         # [x1, y1, x2, y2]
    print(obj.bbox.to_xywh())         # [x, y, w, h]
    print(obj.bbox.area)              # float

# Export as COCO annotations (no confidence score is stored; treated as score=1)
annotations = result.to_coco_annotations()
```

---

### `predict` — Batch prediction over a folder, file, or video

High-level function that iterates over all images (or video frames) in `source`, runs sliced or standard inference, and exports visuals, pickles, crops, and/or a COCO result JSON to `project/name`.

```python
from sahi.predict import predict

result = predict(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    model_confidence_threshold=0.25,
    model_device="cuda:0",
    source="datasets/val/images/",    # folder, single image, or video file
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    no_standard_prediction=False,
    no_sliced_prediction=False,
    postprocess_type="GREEDYNMM",
    postprocess_match_metric="IOS",
    postprocess_match_threshold=0.5,
    export_pickle=False,
    export_crop=False,
    dataset_json_path="datasets/val/annotations.json",  # also save COCO result json
    project="runs/predict",
    name="exp",
    visual_export_format="png",       # "png" or "jpg"
    visual_hide_labels=False,
    visual_hide_conf=False,
    batch_size=4,
    progress_bar=True,
    verbose=2,
    return_dict=True,
)
# result == {"export_dir": PosixPath("runs/predict/exp")}
```

---

### `DetectionModel` — Base class / custom model integration

Abstract base class for all framework adapters. Extend it to add a new detection backend.

```python
import numpy as np
from sahi.models.base import DetectionModel
from sahi.prediction import ObjectPrediction

class MyCustomDetectionModel(DetectionModel):
    required_packages = ["my_framework"]   # auto-checked at init

    def load_model(self):
        import my_framework
        self.model = my_framework.load(self.model_path)
        self.model.to(self.device)

    def set_model(self, model, **kwargs):
        self.model = model
        self.set_category_mapping()

    def perform_inference(self, image: np.ndarray):
        # image is (H, W, C) uint8 RGB numpy array
        self._original_predictions = self.model.predict(image)

    def _create_object_prediction_list_from_original_predictions(
        self, shift_amount_list=[[0, 0]], full_shape_list=None
    ):
        predictions = []
        for raw in self._original_predictions:
            if raw["score"] >= self.confidence_threshold:
                predictions.append(
                    ObjectPrediction(
                        bbox=raw["bbox_xyxy"],
                        category_id=int(raw["class_id"]),
                        category_name=self.category_mapping[str(raw["class_id"])],
                        score=raw["score"],
                        shift_amount=shift_amount_list[0],
                        full_shape=full_shape_list[0] if full_shape_list else None,
                    )
                )
        self._object_prediction_list_per_image = [predictions]

# Usage
model = MyCustomDetectionModel(
    model_path="my_weights.bin",
    confidence_threshold=0.4,
    device="cuda:0",
    category_mapping={"0": "cat", "1": "dog"},
)
```

---

### `DetectionModel.perform_batch_inference` — Native batch inference

Runs inference on a list of images in one call. Ultralytics and HuggingFace models use true GPU batching; all others fall back to sequential inference with identical API.

```python
import cv2
from sahi import AutoDetectionModel

model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cuda:0",
)

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
images = [cv2.cvtColor(cv2.imread(p), cv2.COLOR_BGR2RGB) for p in image_paths]

# Run all images in one GPU call
model.perform_batch_inference(images)

shift_amount_list = [[0, 0]] * len(images)
full_shape_list = [[img.shape[0], img.shape[1]] for img in images]
model.convert_original_predictions(
    shift_amount=shift_amount_list,
    full_shape=full_shape_list,
)

for i, preds in enumerate(model.object_prediction_list_per_image):
    print(f"Image {i}: {len(preds)} detections")
    for pred in preds:
        print(f"  {pred.category.name} {pred.score.value:.2f} {pred.bbox.to_xyxy()}")
```
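
When the image list is long, one way to bound GPU memory is to feed it to `perform_batch_inference` in fixed-size chunks. A minimal sketch of such chunking, assuming a hypothetical helper `iter_batches` (not part of SAHI):

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Stand-in file names; in practice these would be decoded RGB arrays.
image_names = [f"img{i}.jpg" for i in range(10)]
for batch in iter_batches(image_names, batch_size=4):
    # Each `batch` could be passed to model.perform_batch_inference(...)
    print(len(batch), batch[0])
# 4 img0.jpg
# 4 img4.jpg
# 2 img8.jpg
```

Because order is preserved, the per-image results of each chunk can simply be concatenated to line up with the original list.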

---

### `slice_image` — Slice a single image into tiles

Low-level utility that returns a `SliceImageResult` containing numpy arrays, COCO metadata, and starting pixel coordinates for each tile. Optionally exports tile images to disk using a thread pool.

```python
from sahi.slicing import slice_image

result = slice_image(
    image="large_satellite.tif",    # path, PIL.Image, or np.ndarray
    output_file_name="sat_tile",    # tile files are named sat_tile_x1_y1_x2_y2.png
    output_dir="tiles/",            # if None, tiles are kept in memory only
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,             # drop sliced annotations keeping < 10 % of their original area
    auto_slice_resolution=False,    # True → auto-pick slice size from image resolution
    verbose=True,
)

print(f"{len(result)} tiles, original size: {result.original_image_height}×{result.original_image_width}")

# Access tile data
tile = result[0]            # dict: {"image", "coco_image", "starting_pixel", "filename"}
tiles = result[0:4]         # list of dicts
all_arrays = result.images  # list[np.ndarray]
all_pixels = result.starting_pixels  # list[[x, y]]
```

---

### `slice_coco` — Slice a COCO-format dataset

Slices all images in a COCO dataset and rewrites annotations to match the tile coordinates. Optionally filters out tiles with no annotations.

```python
from sahi.slicing import slice_coco

coco_dict, save_path = slice_coco(
    coco_annotation_file_path="annotations/train.json",
    image_dir="images/train/",
    output_coco_annotation_file_name="train_sliced",
    output_dir="sliced_dataset/",
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,
    ignore_negative_samples=True,  # skip tiles without any annotation
    verbose=True,
)

print(f"Saved sliced dataset to: {save_path}")
print(f"Total sliced annotations: {len(coco_dict['annotations'])}")
```
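
The `min_area_ratio` check can be pictured as follows: an annotation is clipped to the tile, and the clipped box is kept only if it retains at least that fraction of the original area. This is a simplified, boxes-only sketch with hypothetical helper names; SAHI's own implementation may differ in detail (for example, in how segmentation polygons are handled).

```python
from typing import List, Optional

Box = List[float]  # [x_min, y_min, x_max, y_max]


def box_area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def annotation_in_tile(box: Box, tile: Box, min_area_ratio: float = 0.1) -> Optional[Box]:
    """Clip `box` to `tile` and return it in tile-local coordinates, or None if dropped."""
    clipped = [max(box[0], tile[0]), max(box[1], tile[1]),
               min(box[2], tile[2]), min(box[3], tile[3])]
    clipped_area = box_area(clipped)
    if clipped_area == 0 or clipped_area / box_area(box) < min_area_ratio:
        return None  # no overlap, or too little of the object is visible
    # Shift by the tile's top-left corner to get tile-local coordinates.
    return [clipped[0] - tile[0], clipped[1] - tile[1],
            clipped[2] - tile[0], clipped[3] - tile[1]]


tile = [410, 0, 922, 512]
print(annotation_in_tile([900, 100, 1000, 200], tile))  # 22 % visible → [490, 100, 512, 200]
print(annotation_in_tile([915, 100, 1015, 200], tile))  # 7 % visible → None
```

Lowering `min_area_ratio` keeps more of these edge fragments, which matches the guidance in the library rules.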

---

### `get_slice_bboxes` — Compute tile coordinates without slicing

Returns a list of `[x_min, y_min, x_max, y_max]` bounding boxes defining every tile position. Useful for custom slicing pipelines.

```python
from sahi.slicing import get_slice_bboxes

bboxes = get_slice_bboxes(
    image_height=4096,
    image_width=4096,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    auto_slice_resolution=False,
)
print(f"{len(bboxes)} tiles")
print(bboxes[0])   # [0, 0, 512, 512]
print(bboxes[-1])  # [3584, 3584, 4096, 4096]
```
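
For intuition, the grid above can be reproduced in a few lines of plain Python. This is a simplified re-implementation under the assumption that the overlap in pixels is `int(ratio * slice_size)` and that tiles crossing the image border are shifted back inside the image; it is not SAHI's source code and ignores `auto_slice_resolution`. The name `tile_grid` is illustrative.

```python
from typing import List


def tile_grid(
    image_height: int,
    image_width: int,
    slice_height: int,
    slice_width: int,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
) -> List[List[int]]:
    """Return [x_min, y_min, x_max, y_max] for every tile, row by row."""
    y_overlap = int(overlap_height_ratio * slice_height)
    x_overlap = int(overlap_width_ratio * slice_width)
    boxes: List[List[int]] = []
    y_min = y_max = 0
    while y_max < image_height:
        x_min = x_max = 0
        y_max = y_min + slice_height
        while x_max < image_width:
            x_max = x_min + slice_width
            if y_max > image_height or x_max > image_width:
                # Border tile: clamp to the image and shift back to keep full size.
                right = min(image_width, x_max)
                bottom = min(image_height, y_max)
                boxes.append([max(0, right - slice_width), max(0, bottom - slice_height), right, bottom])
            else:
                boxes.append([x_min, y_min, x_max, y_max])
            x_min = x_max - x_overlap  # step = slice_width - overlap
        y_min = y_max - y_overlap
    return boxes


boxes = tile_grid(4096, 4096, 512, 512)
print(len(boxes))   # 100
print(boxes[0])     # [0, 0, 512, 512]
print(boxes[-1])    # [3584, 3584, 4096, 4096]
```

With 512-pixel tiles and a 102-pixel overlap the stride is 410 pixels, so ten tiles are needed per axis to cover 4096 pixels.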

---

### `BoundingBox` — Bounding box coordinate utility

Immutable dataclass for bounding box coordinates in `[minx, miny, maxx, maxy]` format with conversion helpers and shift support.

```python
from sahi.annotation import BoundingBox

bbox = BoundingBox(box=[120.0, 45.0, 198.0, 310.0])

print(bbox.to_xyxy())      # [120.0, 45.0, 198.0, 310.0]
print(bbox.to_xywh())      # [120.0, 45.0, 78.0, 265.0]
print(bbox.to_coco_bbox()) # alias of to_xywh()
print(bbox.area)           # 78.0 * 265.0 = 20670.0

# Expand by 10 % with an image boundary clamp
expanded = bbox.get_expanded_box(ratio=0.1, max_x=1920, max_y=1080)

# Shift back to full-image coordinates after sliced inference
sliced_bbox = BoundingBox(box=[10.0, 5.0, 88.0, 270.0], shift_amount=(110, 40))
full_bbox = sliced_bbox.get_shifted_box()
print(full_bbox.to_xyxy())  # [120.0, 45.0, 198.0, 310.0]
```

---

### `ObjectPrediction` — Single detection result container

Holds a bounding box, optional segmentation mask, category, and confidence score. Provides conversion to COCO, FiftyOne, and imantics formats.

```python
from sahi.prediction import ObjectPrediction

pred = ObjectPrediction(
    bbox=[120.0, 45.0, 198.0, 310.0],   # [x1, y1, x2, y2]
    category_id=0,
    category_name="person",
    score=0.87,
    # segmentation=[[x1,y1,x2,y2,...]]  # optional polygon mask
)

print(pred.category.name)      # "person"
print(pred.score.value)        # 0.87
print(pred.score > 0.5)        # True
print(pred.bbox.to_xyxy())     # [120.0, 45.0, 198.0, 310.0]

# Convert to COCO prediction dict
coco_pred = pred.to_coco_prediction(image_id=1).json
# {"image_id": 1, "bbox": [120.0, 45.0, 78.0, 265.0], "score": 0.87, "category_id": 0, …}

# Convert to FiftyOne detection (normalized coordinates)
fo_det = pred.to_fiftyone_detection(image_height=720, image_width=1280)
```
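
FiftyOne stores boxes as `[x, y, width, height]` relative to the image size, which is why `to_fiftyone_detection` needs the image dimensions, while COCO uses absolute `[x, y, width, height]` in pixels. The conversions behind the two export formats can be sketched as follows; the helper names are illustrative, not SAHI functions.

```python
from typing import List


def xyxy_to_coco_xywh(bbox_xyxy: List[float]) -> List[float]:
    """Absolute [x1, y1, x2, y2] → absolute [x, y, width, height]."""
    x1, y1, x2, y2 = bbox_xyxy
    return [x1, y1, x2 - x1, y2 - y1]


def xyxy_to_relative_xywh(bbox_xyxy: List[float], image_height: int, image_width: int) -> List[float]:
    """Absolute [x1, y1, x2, y2] → [x, y, width, height] as fractions of the image size."""
    x, y, w, h = xyxy_to_coco_xywh(bbox_xyxy)
    return [x / image_width, y / image_height, w / image_width, h / image_height]


box = [120.0, 45.0, 198.0, 310.0]
print(xyxy_to_coco_xywh(box))  # [120.0, 45.0, 78.0, 265.0]
print(xyxy_to_relative_xywh(box, image_height=720, image_width=1280))
```

Every value in the relative form lies in the range 0 to 1 as long as the box is inside the image.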

---

### `PredictionResult` — Multi-detection result container

Wraps a list of `ObjectPrediction` instances together with the source image and profiling data. Provides batch export helpers.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

model = AutoDetectionModel.from_pretrained("ultralytics", "yolo11n.pt", confidence_threshold=0.3)
result = get_sliced_prediction("image.jpg", model, slice_height=512, slice_width=512)

# Visualize with custom styling
result.export_visuals(
    export_dir="out/",
    file_name="result",
    rect_th=3,
    text_size=1.0,
    hide_labels=False,
    hide_conf=False,
)

# COCO annotation format (score fixed at 1)
annotations = result.to_coco_annotations()
# [{"image_id": None, "bbox": [x,y,w,h], "category_id": 0, "area": float, …}, …]

# COCO prediction format (includes score)
preds = result.to_coco_predictions(image_id=42)

# FiftyOne format
fo_detections = result.to_fiftyone_detections()

# imantics format
imantics_anns = result.to_imantics_annotations()

# Timing profile
print(result.durations_in_seconds["prediction"])  # seconds spent on model forward pass
```

---

### Postprocessing classes — NMS, NMM, GreedyNMM, LSNMS

Four callable postprocessors remove duplicate predictions from the merged slice outputs. The default `GreedyNMMPostprocess` merges overlapping boxes; `NMSPostprocess` suppresses them.

```python
from sahi.postprocess.combine import (
    NMSPostprocess,
    NMMPostprocess,
    GreedyNMMPostprocess,
    LSNMSPostprocess,
)

# NMS: keep highest-score box, discard overlapping ones
nms = NMSPostprocess(
    match_threshold=0.5,
    match_metric="IOU",   # "IOU" or "IOS"
    class_agnostic=True,  # False → apply per-category
)
filtered = nms(object_prediction_list)

# GreedyNMM (default in SAHI): merge overlapping boxes into the highest-score one
gnmm = GreedyNMMPostprocess(match_threshold=0.5, match_metric="IOS", class_agnostic=False)
merged = gnmm(object_prediction_list)

# NMM: transitive merging (A∩B and B∩C → all three merged, even if A∩C = ∅)
nmm = NMMPostprocess(match_threshold=0.5, match_metric="IOS")
merged = nmm(object_prediction_list)

# LSNMS: locality-sensitive NMS via the `lsnms` package (experimental, IoU only)
# pip install "lsnms>0.3.1"
lsnms = LSNMSPostprocess(match_threshold=0.5, match_metric="IOU")
filtered = lsnms(object_prediction_list)
```
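
To make the difference between the metrics and strategies concrete, here is a minimal pure-Python sketch, not SAHI's implementation. It assumes boxes in `[x1, y1, x2, y2]` format and defines IoU as intersection over union and IoS as intersection over the smaller box's area. The example pair mimics an object detected fully in one tile and only partially in the neighbouring one.

```python
from typing import Callable, List, Tuple

Box = List[float]
Pred = Tuple[Box, float]  # (box, score)
Metric = Callable[[Box, Box], float]


def area(box: Box) -> float:
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection(a: Box, b: Box) -> float:
    return area([max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])])


def iou(a: Box, b: Box) -> float:
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def ios(a: Box, b: Box) -> float:
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0


def nms(preds: List[Pred], threshold: float, metric: Metric) -> List[Pred]:
    """Keep the highest-scoring box and discard lower-scoring boxes that match it."""
    kept: List[Pred] = []
    for box, score in sorted(preds, key=lambda p: p[1], reverse=True):
        if all(metric(box, kept_box) < threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


def greedy_merge(preds: List[Pred], threshold: float, metric: Metric) -> List[Pred]:
    """Merge boxes that match the current highest-scoring box into one enclosing box."""
    remaining = sorted(preds, key=lambda p: p[1], reverse=True)
    merged: List[Pred] = []
    while remaining:
        (box, score), rest = remaining[0], remaining[1:]
        enclosing, unmatched = list(box), []
        for other, other_score in rest:
            if metric(box, other) >= threshold:  # compared with the seed box only
                enclosing = [min(enclosing[0], other[0]), min(enclosing[1], other[1]),
                             max(enclosing[2], other[2]), max(enclosing[3], other[3])]
            else:
                unmatched.append((other, other_score))
        merged.append((enclosing, score))
        remaining = unmatched
    return merged


full, partial = [100, 100, 300, 200], [250, 100, 320, 200]
preds = [(full, 0.9), (partial, 0.6)]
print(round(iou(full, partial), 3), round(ios(full, partial), 3))  # 0.227 0.714
print(len(nms(preds, 0.5, iou)))       # 2  (the duplicate survives)
print(greedy_merge(preds, 0.5, ios))   # [([100, 100, 320, 200], 0.9)]
```

At a threshold of 0.5 the partial box does not match under IoU but does under IoS, which illustrates why the IoS-based merge tends to leave fewer duplicates along tile borders.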

---

### Postprocessing backend selection

SAHI automatically picks the fastest available NMS/NMM backend. Override it globally when needed.

```python
from sahi.postprocess.backends import set_postprocess_backend, get_postprocess_backend

# Options: "auto" (default), "numpy", "numba", "torchvision"
set_postprocess_backend("torchvision")  # GPU-accelerated via torchvision.ops
print(get_postprocess_backend())        # "torchvision"

set_postprocess_backend("numba")        # JIT-compiled CPU
set_postprocess_backend("numpy")        # pure numpy fallback
set_postprocess_backend("auto")         # restore automatic selection
```

---

### CLI: `sahi predict` — Batch inference from the terminal

Run sliced inference on images or video without writing Python code. Results are written to `runs/predict/exp`.

```bash
# Basic sliced inference with Ultralytics YOLO
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source images/ \
  --slice_height 512 --slice_width 512 \
  --overlap_height_ratio 0.2 --overlap_width_ratio 0.2 \
  --model_confidence_threshold 0.25

# Video inference with live preview
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source video.mp4 \
  --view_video \
  --frame_skip_interval 5

# HuggingFace model, export crops and pickle, no visuals
sahi predict \
  --model_type huggingface \
  --model_path SkalskiP/rtdetr-r50vd-coco \
  --source images/ \
  --novisual \
  --export_crop \
  --export_pickle

# Export a COCO result JSON for later evaluation against a ground-truth file
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source images/ \
  --dataset_json_path annotations.json \
  --postprocess_type NMS \
  --postprocess_match_metric IOU \
  --postprocess_match_threshold 0.45 \
  --progress_bar
# Generates runs/predict/exp/result.json for use with sahi coco evaluate
```

---

### CLI: `sahi coco` — COCO dataset utilities

Suite of sub-commands for slicing, evaluating, analysing, visualising, and converting COCO datasets.

```bash
# Slice a COCO dataset into 512×512 tiles
sahi coco slice \
  --image_dir images/train/ \
  --dataset_json_path annotations/train.json \
  --slice_size 512 \
  --overlap_ratio 0.2 \
  --out_dir sliced/

# Evaluate mAP / mAR (bbox and classwise)
sahi coco evaluate \
  --dataset_json_path annotations/val.json \
  --result_json_path runs/predict/exp/result.json \
  --type bbox \
  --classwise

# Generate error-analysis plots (false positives, missed detections, …)
sahi coco analyse \
  --dataset_json_path annotations/val.json \
  --result_json_path runs/predict/exp/result.json \
  --out_dir analysis/ \
  --type bbox \
  --extraplots

# Convert COCO → YOLO format for Ultralytics training
sahi coco yolo \
  --image_dir images/ \
  --dataset_json_path annotations/train.json \
  --train_split 0.9 \
  --out_dir yolo_dataset/

# Visualise multiple prediction sets in FiftyOne
sahi coco fiftyone \
  --image_dir images/ \
  --dataset_json_path annotations/val.json \
  result_model_a.json result_model_b.json
```

---

### CLI: `sahi predict-fiftyone` — Interactive prediction exploration

Runs sliced inference and launches the FiftyOne App for interactive browsing, sorting, and evaluation.

```bash
sahi predict-fiftyone \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --image_dir images/val/ \
  --dataset_json_path annotations/val.json \
  --slice_height 512 --slice_width 512 \
  --overlap_height_ratio 0.2 --overlap_width_ratio 0.2 \
  --model_confidence_threshold 0.25
# Opens FiftyOne App at http://localhost:5151
# Samples sorted by false-positive count
```

---

## Summary

SAHI's primary use case is improving small-object detection accuracy in high-resolution images from domains such as drone surveillance, satellite remote sensing, autonomous driving, and aerial inspection. The sliced inference pipeline is framework-agnostic: the same `get_sliced_prediction` call works whether the underlying model is a YOLO checkpoint, a HuggingFace transformer, a TorchVision model, or a custom `DetectionModel` subclass. The library is designed for both rapid prototyping via the CLI (`sahi predict`) and production integration via the Python API, where `predict()` handles entire image folders or video files end-to-end with one function call.

SAHI integrates naturally into ML pipelines through its COCO-first data model. All prediction outputs expose `to_coco_predictions()` and `to_coco_annotations()` helpers, making it straightforward to feed results into standard evaluation tooling (`sahi coco evaluate`), store detections for active-learning workflows, or visualize them interactively with FiftyOne. The postprocessing layer is independently configurable (NMS vs. NMM vs. Greedy-NMM, IoU vs. IoS metrics, class-agnostic vs. per-class, numpy vs. numba vs. torchvision backend), allowing fine-grained trade-offs between speed and merge quality without touching the model code.