=============== LIBRARY RULES ===============
From library maintainers:
- Use AutoDetectionModel.from_pretrained() to load any model - switch between Ultralytics/HuggingFace/Roboflow/MMDetection models by only changing model_type and model_path
- SAHI provides unified API across frameworks - same predict() function works with YOLO11/YOLO12, Roboflow Universe models, HuggingFace models like ustc-community/dfine-small-coco without code changes
- When using AutoDetectionModel.from_pretrained(): use 'model_path' parameter for file-based models (Ultralytics, HuggingFace), use 'model' parameter for Roboflow Universe models
- For academic papers requiring high mAP: use postprocess_type='NMS', postprocess_match_metric='IOU', and confidence_threshold=0.01
- For real-world applications: use postprocess_type='GREEDYNMM', postprocess_match_metric='IOS' for better performance with fewer false positives
- If getting many false positives in sliced inference, increase slice_height and slice_width values
- If getting multiple predictions on same object, decrease overlap_height_ratio and overlap_width_ratio (try 0.1 instead of 0.2)
- Use no_sliced_prediction=True to disable slicing and only perform standard inference (useful for large objects)
- Use no_standard_prediction=True to disable full-image inference and only use sliced predictions (saves computation when all objects are small)
- Cannot set both no_standard_prediction=True and no_sliced_prediction=True simultaneously
- Auto-slice resolution: if slice_height/slice_width not specified, SAHI automatically calculates optimal values based on image size
- For drone/satellite imagery: typically use slice_size=512-1024 with 0.2-0.3 overlap ratio
- SAHI is beneficial even without slicing - provides unified API, COCO utilities, visualization tools across all detection frameworks
- Use min_area_ratio parameter (default 0.1) to filter out partial objects at slice boundaries - lower values keep more edge objects
- For COCO datasets, always validate annotations with coco.stats before training or evaluation
- Export results in COCO format using dataset_json_path parameter for standardized evaluation
- Use visual_bbox_thickness, visual_text_size parameters to customize prediction visualizations
- Use 'sahi predict-fiftyone' command to visualize predictions interactively and sort by false positives
- Use 'sahi coco fiftyone' to compare multiple model predictions side-by-side in FiftyOne app
- Use 'sahi coco evaluate' for comprehensive COCO metrics with classwise AP/AR and custom IoU thresholds
- Use 'sahi coco analyse' to generate error analysis plots showing C75/C50/Localization/Similar/Other/Background/FalseNegative errors
- For error analysis: plots show performance breakdown by object size (small/medium/large) and error types
- Export predictions as cropped images using export_crop=True for dataset creation or further analysis
- For video inference: use frame_skip_interval to speed up processing, view_video=True for real-time display
- Supports latest models: YOLO11/YOLO12 via model_type='ultralytics', Roboflow Universe models (e.g., RF-DETR) via model_type='roboflow', HuggingFace models like 'ustc-community/dfine-small-coco' via model_type='huggingface'
- For YOLO11/YOLO12 OBB (oriented bounding box) models, SAHI automatically handles rotated box predictions and only supports NMS postprocessing
- Example model loading: model_type='ultralytics' with model_path='yolo11n.pt', model_type='huggingface' with model_path='ustc-community/dfine-small-coco', model_type='roboflow' with model='rfdetr-base'
- Roboflow Universe models: use simple string IDs like 'rfdetr-base' with model_type='roboflow' for easy access to pre-trained models
- Complete example: model = AutoDetectionModel.from_pretrained(model_type='roboflow', model='rfdetr-base', confidence_threshold=0.5)
- All models follow same API pattern: AutoDetectionModel.from_pretrained() → get_prediction() or get_sliced_prediction() → visualize results
- For models without built-in category mappings, provide category_mapping parameter (e.g., COCO_CLASSES from rfdetr.util.coco_classes)
- COCO utilities: merge datasets with coco.merge(), split train/val with split_coco_as_train_val(), filter by categories with update_categories()
- Filter COCO annotations by area using get_area_filtered_coco() - useful for focusing on specific object sizes
- Convert between formats: export_as_yolo() for YOLO format, use 'sahi coco yolo' command for batch conversion
- Use Coco.stats to get comprehensive dataset statistics before training (num annotations, area distribution, etc.)
- Import logger from 'from sahi.logging import logger' instead of creating redundant logging configurations - centralized logging system eliminates duplicate imports across codebase
- For SAHI documentation, direct users to https://obss.github.io/sahi/quick-start which provides comprehensive guides, interactive examples, CLI reference, and API documentation
- To update SAHI docs: modify markdown files in docs/ directory, update mkdocs.yml for navigation changes, ensure .github/workflows/publish_docs.yml deploys correctly to GitHub Pages

# SAHI: Slicing Aided Hyper Inference

SAHI (version 0.11.36) is a lightweight Python vision library for performing large-scale object detection and instance segmentation on high-resolution images. Its core innovation is **sliced inference**: a large image is subdivided into overlapping tiles, each tile is passed through any supported detector, and the tile-level predictions are merged back into the full-image coordinate space using configurable postprocessing (NMS, NMM, or Greedy-NMM). This enables standard detectors, trained on small images, to reliably find small objects in drone footage, satellite imagery, and other high-megapixel inputs without retraining.

The library is framework-agnostic.
A single `AutoDetectionModel` factory loads models from Ultralytics (YOLO 5/8/11/12/E/World), HuggingFace Transformers, MMDetection, TorchVision, RT-DETR, Roboflow/RF-DETR, and Detectron2 through a unified Python API and CLI. Beyond inference, SAHI ships a comprehensive COCO utilities suite for slicing, merging, splitting, evaluating, and converting datasets, as well as FiftyOne integration for interactive visual inspection of predictions.

---

## API Reference

### `AutoDetectionModel.from_pretrained`: Load any detection model

Factory method that instantiates and returns the correct `DetectionModel` subclass for a given framework. Accepts a pre-initialized model object via `model=` to skip disk loading.

```python
from sahi import AutoDetectionModel

# --- Ultralytics (YOLO 8/11/12/E/World) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",   # also accepts "yolov8", "yolo11", "yolo26"
    model_path="yolo11n.pt",    # local path or HuggingFace model id
    confidence_threshold=0.3,   # predictions below this score are dropped
    device="cuda:0",            # "cpu", "mps", "cuda", "cuda:0", …
    image_size=640,             # resize input to this size before inference
)

# --- HuggingFace Transformers (RT-DETR v2, DETR, …) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="huggingface",
    model_path="SkalskiP/rtdetr-r50vd-coco",
    confidence_threshold=0.5,
    device="cpu",
)

# --- TorchVision (Faster R-CNN, Mask R-CNN, SSD, …) ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="torchvision",
    model_path="fasterrcnn_resnet50_fpn",  # torchvision model name or .yaml config
    confidence_threshold=0.5,
    device="cuda:0",
)

# --- MMDetection ---
detection_model = AutoDetectionModel.from_pretrained(
    model_type="mmdet",
    model_path="checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130.pth",
    config_path="configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py",
    confidence_threshold=0.4,
    device="cuda:0",
)

# --- Wrap an already-loaded model (skip disk I/O) ---
from ultralytics import YOLO

raw_model = YOLO("yolo11n.pt")  # any live Ultralytics model object
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model=raw_model,            # pass the live object; model_path is ignored
    confidence_threshold=0.3,
    device="cuda:0",
    load_at_init=False,         # defer loading; the model is attached below
)
detection_model.set_model(raw_model)

# Category remapping example
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    category_mapping={"0": "person", "2": "car"},  # map id → name
    category_remapping={"car": 5},                 # map category name → new id after inference
)
```

---

### `get_sliced_prediction`: Sliced inference on a single image

Divides the image into overlapping tiles, runs the detector on each tile (optionally in batches), combines tile predictions with a full-image pass, and returns a `PredictionResult`. The default postprocessor is `GREEDYNMM` with `IOS` metric.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    image="path/to/large_image.jpg",    # path, PIL Image, or np.ndarray
    detection_model=detection_model,
    slice_height=512,                   # tile height in pixels
    slice_width=512,                    # tile width in pixels
    overlap_height_ratio=0.2,           # 20 % overlap → 102 px overlap for 512-px tiles
    overlap_width_ratio=0.2,
    perform_standard_pred=True,         # also run inference on the full image
    postprocess_type="GREEDYNMM",       # "NMS", "NMM", "GREEDYNMM", "LSNMS"
    postprocess_match_metric="IOS",     # "IOS" (intersection/smaller) or "IOU"
    postprocess_match_threshold=0.5,
    postprocess_class_agnostic=False,   # True → merge boxes across categories
    auto_slice_resolution=True,         # auto-compute slice size when not given
    batch_size=8,                       # number of tiles per model call (GPU batching)
    verbose=1,                          # 0=silent, 1=slice count, 2=all timings
    progress_bar=True,                  # tqdm bar in terminal/notebook
    exclude_classes_by_name=["truck"],  # drop specific class names
    # exclude_classes_by_id=[7],        # or drop by COCO category id
)

# Iterate predictions
for obj in result.object_prediction_list:
    print(obj.category.name, f"{obj.score.value:.2f}", obj.bbox.to_xyxy())
    # → person 0.87 [120.0, 45.0, 198.0, 310.0]

# Export a visualization PNG
result.export_visuals(
    export_dir="outputs/",
    file_name="sliced_result",
    rect_th=2,
    text_size=0.8,
    hide_conf=False,
)

# Serialize to COCO prediction format
coco_preds = result.to_coco_predictions(image_id=42)
# [{"image_id": 42, "bbox": [x, y, w, h], "score": 0.87, "category_id": 0}, …]

# Per-image timing breakdown
print(result.durations_in_seconds)
# {"slice": 0.12, "prediction": 0.45, "postprocess": 0.03}
```

---

### `get_prediction`: Standard (non-sliced) inference

Runs the model once on the full image. Useful as a baseline or when slicing is not needed.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cpu",
)

result = get_prediction(
    image="image.jpg",
    detection_model=detection_model,
    verbose=1,                  # prints inference duration
    confidence_threshold=0.6,   # one-call override without mutating the model
)

for obj in result.object_prediction_list:
    # BoundingBox helpers
    print(obj.bbox.to_xyxy())   # [x1, y1, x2, y2]
    print(obj.bbox.to_xywh())   # [x, y, w, h]
    print(obj.bbox.area)        # float

# Export COCO annotations (no score field; score is treated as 1)
annotations = result.to_coco_annotations()
```

---

### `predict`: Batch prediction over a folder, file, or video

High-level function that iterates over all images (or video frames) in `source`, runs sliced or standard inference, and exports visuals, pickles, crops, and/or a COCO result JSON to `project/name`.
```python
from sahi.predict import predict

result = predict(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    model_confidence_threshold=0.25,
    model_device="cuda:0",
    source="datasets/val/images/",      # folder, single image, or video file
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    no_standard_prediction=False,
    no_sliced_prediction=False,
    postprocess_type="GREEDYNMM",
    postprocess_match_metric="IOS",
    postprocess_match_threshold=0.5,
    export_pickle=False,
    export_crop=False,
    dataset_json_path="datasets/val/annotations.json",  # also save COCO result json
    project="runs/predict",
    name="exp",
    visual_export_format="png",         # "png" or "jpg"
    visual_hide_labels=False,
    visual_hide_conf=False,
    batch_size=4,
    progress_bar=True,
    verbose=2,
    return_dict=True,
)
# result == {"export_dir": PosixPath("runs/predict/exp")}
```

---

### `DetectionModel`: Base class / custom model integration

Abstract base class for all framework adapters. Extend it to add a new detection backend.
```python
import numpy as np
from sahi.models.base import DetectionModel
from sahi.prediction import ObjectPrediction


class MyCustomDetectionModel(DetectionModel):
    required_packages = ["my_framework"]  # auto-checked at init

    def load_model(self):
        import my_framework

        self.model = my_framework.load(self.model_path)
        self.model.to(self.device)

    def set_model(self, model, **kwargs):
        self.model = model
        self.set_category_mapping()

    def perform_inference(self, image: np.ndarray):
        # image is (H, W, C) uint8 RGB numpy array
        self._original_predictions = self.model.predict(image)

    def _create_object_prediction_list_from_original_predictions(
        self, shift_amount_list=[[0, 0]], full_shape_list=None
    ):
        predictions = []
        for raw in self._original_predictions:
            if raw["score"] >= self.confidence_threshold:
                predictions.append(
                    ObjectPrediction(
                        bbox=raw["bbox_xyxy"],
                        category_id=int(raw["class_id"]),
                        category_name=self.category_mapping[str(raw["class_id"])],
                        score=raw["score"],
                        shift_amount=shift_amount_list[0],
                        full_shape=full_shape_list[0] if full_shape_list else None,
                    )
                )
        self._object_prediction_list_per_image = [predictions]


# Usage
model = MyCustomDetectionModel(
    model_path="my_weights.bin",
    confidence_threshold=0.4,
    device="cuda:0",
    category_mapping={"0": "cat", "1": "dog"},
)
```

---

### `DetectionModel.perform_batch_inference`: Native batch inference

Runs inference on a list of images in one call. Ultralytics and HuggingFace models use true GPU batching; all others fall back to sequential inference with identical API.
```python
import cv2
from sahi import AutoDetectionModel

model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo11n.pt",
    confidence_threshold=0.3,
    device="cuda:0",
)

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
images = [cv2.cvtColor(cv2.imread(p), cv2.COLOR_BGR2RGB) for p in image_paths]

# Run all images in one GPU call
model.perform_batch_inference(images)

shift_amount_list = [[0, 0]] * len(images)
full_shape_list = [[img.shape[0], img.shape[1]] for img in images]
model.convert_original_predictions(
    shift_amount=shift_amount_list,
    full_shape=full_shape_list,
)

for i, preds in enumerate(model.object_prediction_list_per_image):
    print(f"Image {i}: {len(preds)} detections")
    for pred in preds:
        print(f"  {pred.category.name} {pred.score.value:.2f} {pred.bbox.to_xyxy()}")
```

---

### `slice_image`: Slice a single image into tiles

Low-level utility that returns a `SliceImageResult` containing numpy arrays, COCO metadata, and starting pixel coordinates for each tile. Optionally exports tile images to disk using a thread pool.
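When the sliced image carries annotations, a box cut by a tile border has to be either kept in clipped form or dropped. The following is a minimal sketch of that decision in plain Python, assuming the rule is "keep a clipped box only if at least `min_area_ratio` of its original area survives inside the tile". The helper names `clip_box_to_tile`, `box_area`, and `keep_in_tile` are hypothetical and are not part of SAHI.

```python
# Illustrative sketch only: these helpers are hypothetical, not SAHI functions.
# Boxes and tiles use [x1, y1, x2, y2] pixel coordinates.

def clip_box_to_tile(box, tile):
    """Clip a box to a tile; return None if the two do not overlap."""
    x1, y1 = max(box[0], tile[0]), max(box[1], tile[1])
    x2, y2 = min(box[2], tile[2]), min(box[3], tile[3])
    if x2 <= x1 or y2 <= y1:
        return None
    return [x1, y1, x2, y2]


def box_area(box):
    """Area of an [x1, y1, x2, y2] box."""
    return (box[2] - box[0]) * (box[3] - box[1])


def keep_in_tile(box, tile, min_area_ratio=0.1):
    """Keep a clipped annotation only if enough of the original box survives."""
    clipped = clip_box_to_tile(box, tile)
    if clipped is None:
        return False
    return box_area(clipped) / box_area(box) >= min_area_ratio


tile = [0, 0, 512, 512]
mostly_inside = [400, 100, 520, 200]  # 112 of 120 px of width lie inside the tile
barely_inside = [505, 100, 605, 200]  # only 7 of 100 px of width lie inside the tile

print(keep_in_tile(mostly_inside, tile))  # True  (ratio is about 0.93)
print(keep_in_tile(barely_inside, tile))  # False (ratio is 0.07)
```

Under this reading, lowering the threshold keeps more edge fragments, while raising it discards them. The `min_area_ratio=0.1` argument in the call that follows is the library-side counterpart of this threshold.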
```python
from sahi.slicing import slice_image

result = slice_image(
    image="large_satellite.tif",    # path, PIL.Image, or np.ndarray
    output_file_name="sat_tile",    # tile files are named sat_tile_x1_y1_x2_y2.png
    output_dir="tiles/",            # if None, tiles are kept in memory only
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,             # drop annotations whose clipped area is < 10 % of the original
    auto_slice_resolution=False,    # True → auto-pick slice size from image resolution
    verbose=True,
)

print(f"{len(result)} tiles, original size: {result.original_image_height}×{result.original_image_width}")

# Access tile data
tile = result[0]                     # dict: {"image", "coco_image", "starting_pixel", "filename"}
tiles = result[0:4]                  # list of dicts
all_arrays = result.images           # list[np.ndarray]
all_pixels = result.starting_pixels  # list[[x, y]]
```

---

### `slice_coco`: Slice a COCO-format dataset

Slices all images in a COCO dataset and rewrites annotations to match the tile coordinates. Optionally filters out tiles with no annotations.

```python
from sahi.slicing import slice_coco

coco_dict, save_path = slice_coco(
    coco_annotation_file_path="annotations/train.json",
    image_dir="images/train/",
    output_coco_annotation_file_name="train_sliced",
    output_dir="sliced_dataset/",
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,
    ignore_negative_samples=True,   # skip tiles without any annotation
    verbose=True,
)

print(f"Saved sliced dataset to: {save_path}")
print(f"Total sliced annotations: {len(coco_dict['annotations'])}")
```

---

### `get_slice_bboxes`: Compute tile coordinates without slicing

Returns a list of `[x_min, y_min, x_max, y_max]` bounding boxes defining every tile position. Useful for custom slicing pipelines.
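The arithmetic behind such a tile grid can be sketched without the library. The snippet below is an illustration, not SAHI's implementation. It assumes the overlap in pixels is `int(slice_size * overlap_ratio)` and that the last tile on each axis is pulled back so it ends at the image border. The function names `tile_starts` and `tile_boxes` are hypothetical.

```python
# Illustrative sketch only: tile_starts and tile_boxes are hypothetical helpers.

def tile_starts(image_size, slice_size, overlap_ratio):
    """Tile start offsets along one axis; the last tile is clamped to the border."""
    step = slice_size - int(slice_size * overlap_ratio)  # stride between tiles
    starts = []
    pos = 0
    while True:
        if pos + slice_size >= image_size:
            # Final tile: shift it back so it ends at the image border.
            starts.append(max(0, image_size - slice_size))
            break
        starts.append(pos)
        pos += step
    return starts


def tile_boxes(image_height, image_width, slice_height, slice_width,
               overlap_height_ratio, overlap_width_ratio):
    """Return [x_min, y_min, x_max, y_max] for every tile, row by row."""
    boxes = []
    for y in tile_starts(image_height, slice_height, overlap_height_ratio):
        for x in tile_starts(image_width, slice_width, overlap_width_ratio):
            boxes.append([
                x,
                y,
                min(x + slice_width, image_width),
                min(y + slice_height, image_height),
            ])
    return boxes


boxes = tile_boxes(4096, 4096, 512, 512, 0.2, 0.2)
print(len(boxes))  # 100 (10 tiles per axis, stride 410 px)
print(boxes[0])    # [0, 0, 512, 512]
print(boxes[-1])   # [3584, 3584, 4096, 4096]
```

With 512-px tiles and a 0.2 overlap ratio the stride is 512 - 102 = 410 px, so a 4096-px axis needs ten tiles, the last one starting at 3584. The library call for the same configuration is shown next.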
```python
from sahi.slicing import get_slice_bboxes

bboxes = get_slice_bboxes(
    image_height=4096,
    image_width=4096,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    auto_slice_resolution=False,
)

print(f"{len(bboxes)} tiles")
print(bboxes[0])   # [0, 0, 512, 512]
print(bboxes[-1])  # [3584, 3584, 4096, 4096]
```

---

### `BoundingBox`: Bounding box coordinate utility

Immutable dataclass for bounding box coordinates in `[minx, miny, maxx, maxy]` format with conversion helpers and shift support.

```python
from sahi.annotation import BoundingBox

bbox = BoundingBox(box=[120.0, 45.0, 198.0, 310.0])

print(bbox.to_xyxy())       # [120.0, 45.0, 198.0, 310.0]
print(bbox.to_xywh())       # [120.0, 45.0, 78.0, 265.0]
print(bbox.to_coco_bbox())  # alias of to_xywh()
print(bbox.area)            # 78.0 * 265.0 = 20670.0

# Expand by 10 % with an image boundary clamp
expanded = bbox.get_expanded_box(ratio=0.1, max_x=1920, max_y=1080)

# Shift back to full-image coordinates after sliced inference
sliced_bbox = BoundingBox(box=[10.0, 5.0, 88.0, 270.0], shift_amount=(110, 40))
full_bbox = sliced_bbox.get_shifted_box()
print(full_bbox.to_xyxy())  # [120.0, 45.0, 198.0, 310.0]
```

---

### `ObjectPrediction`: Single detection result container

Holds a bounding box, optional segmentation mask, category, and confidence score. Provides conversion to COCO, FiftyOne, and imantics formats.
```python
from sahi.prediction import ObjectPrediction

pred = ObjectPrediction(
    bbox=[120.0, 45.0, 198.0, 310.0],      # [x1, y1, x2, y2]
    category_id=0,
    category_name="person",
    score=0.87,
    # segmentation=[[x1,y1,x2,y2,...]]     # optional polygon mask
)

print(pred.category.name)   # "person"
print(pred.score.value)     # 0.87
print(pred.score > 0.5)     # True
print(pred.bbox.to_xyxy())  # [120.0, 45.0, 198.0, 310.0]

# Convert to COCO prediction dict
coco_pred = pred.to_coco_prediction(image_id=1).json
# {"image_id": 1, "bbox": [120.0, 45.0, 78.0, 265.0], "score": 0.87, "category_id": 0, …}

# Convert to FiftyOne detection (normalized coordinates)
fo_det = pred.to_fiftyone_detection(image_height=720, image_width=1280)
```

---

### `PredictionResult`: Multi-detection result container

Wraps a list of `ObjectPrediction` instances together with the source image and profiling data. Provides batch export helpers.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

model = AutoDetectionModel.from_pretrained("ultralytics", "yolo11n.pt", confidence_threshold=0.3)
result = get_sliced_prediction("image.jpg", model, slice_height=512, slice_width=512)

# Visualize with custom styling
result.export_visuals(
    export_dir="out/",
    file_name="result",
    rect_th=3,
    text_size=1.0,
    hide_labels=False,
    hide_conf=False,
)

# COCO annotation format (score fixed at 1)
annotations = result.to_coco_annotations()
# [{"image_id": None, "bbox": [x,y,w,h], "category_id": 0, "area": float, …}, …]

# COCO prediction format (includes score)
preds = result.to_coco_predictions(image_id=42)

# FiftyOne format
fo_detections = result.to_fiftyone_detections()

# imantics format
imantics_anns = result.to_imantics_annotations()

# Timing profile
print(result.durations_in_seconds["prediction"])  # seconds spent on model forward pass
```

---

### Postprocessing classes: NMS, NMM, GreedyNMM, LSNMS

Four callable postprocessors remove duplicate predictions from the merged slice outputs.
The default `GreedyNMMPostprocess` merges overlapping boxes; `NMSPostprocess` suppresses them.

```python
from sahi.postprocess.combine import (
    NMSPostprocess,
    NMMPostprocess,
    GreedyNMMPostprocess,
    LSNMSPostprocess,
)

# NMS: keep highest-score box, discard overlapping ones
nms = NMSPostprocess(
    match_threshold=0.5,
    match_metric="IOU",     # "IOU" or "IOS"
    class_agnostic=True,    # False → apply per-category
)
filtered = nms(object_prediction_list)

# GreedyNMM (default in SAHI): merge overlapping boxes into the highest-score one
gnmm = GreedyNMMPostprocess(match_threshold=0.5, match_metric="IOS", class_agnostic=False)
merged = gnmm(object_prediction_list)

# NMM: transitive merging (A∩B and B∩C → all three merged, even if A∩C = ∅)
nmm = NMMPostprocess(match_threshold=0.5, match_metric="IOS")
merged = nmm(object_prediction_list)

# LSNMS: locality-sensitive NMS via the `lsnms` package (experimental, IoU only)
# pip install "lsnms>0.3.1"   (quoted so the shell does not treat > as a redirect)
lsnms = LSNMSPostprocess(match_threshold=0.5, match_metric="IOU")
filtered = lsnms(object_prediction_list)
```

---

### Postprocessing backend selection

SAHI automatically picks the fastest available NMS/NMM backend. Override it globally when needed.

```python
from sahi.postprocess.backends import set_postprocess_backend, get_postprocess_backend

# Options: "auto" (default), "numpy", "numba", "torchvision"
set_postprocess_backend("torchvision")  # GPU-accelerated via torchvision.ops
print(get_postprocess_backend())        # "torchvision"

set_postprocess_backend("numba")        # JIT-compiled CPU
set_postprocess_backend("numpy")        # pure numpy fallback
set_postprocess_backend("auto")         # restore automatic selection
```

---

### CLI: `sahi predict`: Batch inference from the terminal

Run sliced inference on images or video without writing Python code. Results are written to `runs/predict/exp`.
```bash
# Basic sliced inference with Ultralytics YOLO
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source images/ \
  --slice_height 512 --slice_width 512 \
  --overlap_height_ratio 0.2 --overlap_width_ratio 0.2 \
  --model_confidence_threshold 0.25

# Video inference with live preview
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source video.mp4 \
  --view_video \
  --frame_skip_interval 5

# HuggingFace model, export crops and pickle, no visuals
sahi predict \
  --model_type huggingface \
  --model_path SkalskiP/rtdetr-r50vd-coco \
  --source images/ \
  --novisual \
  --export_crop \
  --export_pickle

# Evaluate against a COCO ground-truth file
sahi predict \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --source images/ \
  --dataset_json_path annotations.json \
  --postprocess_type NMS \
  --postprocess_match_metric IOU \
  --postprocess_match_threshold 0.45 \
  --progress_bar
# Generates runs/predict/exp/result.json for use with sahi coco evaluate
```

---

### CLI: `sahi coco`: COCO dataset utilities

Suite of sub-commands for slicing, evaluating, analysing, visualising, and converting COCO datasets.
```bash
# Slice a COCO dataset into 512×512 tiles
sahi coco slice \
  --image_dir images/train/ \
  --dataset_json_path annotations/train.json \
  --slice_size 512 \
  --overlap_ratio 0.2 \
  --out_dir sliced/

# Evaluate mAP / mAR (bbox and classwise)
sahi coco evaluate \
  --dataset_json_path annotations/val.json \
  --result_json_path runs/predict/exp/result.json \
  --type bbox \
  --classwise

# Generate error-analysis plots (false positives, missed detections, …)
sahi coco analyse \
  --dataset_json_path annotations/val.json \
  --result_json_path runs/predict/exp/result.json \
  --out_dir analysis/ \
  --type bbox \
  --extraplots

# Convert COCO → YOLO format for Ultralytics training
sahi coco yolo \
  --image_dir images/ \
  --dataset_json_path annotations/train.json \
  --train_split 0.9 \
  --out_dir yolo_dataset/

# Visualise multiple prediction sets in FiftyOne
sahi coco fiftyone \
  --image_dir images/ \
  --dataset_json_path annotations/val.json \
  result_model_a.json result_model_b.json
```

---

### CLI: `sahi predict-fiftyone`: Interactive prediction exploration

Runs sliced inference and launches the FiftyOne App for interactive browsing, sorting, and evaluation.

```bash
sahi predict-fiftyone \
  --model_type ultralytics \
  --model_path yolo11n.pt \
  --image_dir images/val/ \
  --dataset_json_path annotations/val.json \
  --slice_height 512 --slice_width 512 \
  --overlap_height_ratio 0.2 --overlap_width_ratio 0.2 \
  --model_confidence_threshold 0.25
# Opens FiftyOne App at http://localhost:5151
# Samples sorted by false-positive count
```

---

## Summary

SAHI's primary use case is improving small-object detection accuracy in high-resolution images from domains such as drone surveillance, satellite remote sensing, autonomous driving, and aerial inspection. The sliced inference pipeline is framework-agnostic: the same `get_sliced_prediction` call works identically whether the underlying model is a YOLO checkpoint, a HuggingFace transformer, a TorchVision model, or a custom `DetectionModel` subclass.
The library is designed for both rapid prototyping via the CLI (`sahi predict`) and production integration via the Python API, where `predict()` handles entire image folders or video streams end-to-end with one function call.

SAHI integrates naturally into ML pipelines through its COCO-first data model. All prediction outputs expose `to_coco_predictions()` and `to_coco_annotations()` helpers, making it straightforward to feed results into standard evaluation tooling (`sahi coco evaluate`), store detections for active-learning workflows, or visualize them interactively with FiftyOne. The postprocessing layer is independently configurable (NMS vs. NMM vs. Greedy-NMM, IoU vs. IoS metrics, class-agnostic vs. per-class, numpy vs. numba vs. torchvision backend), allowing fine-grained trade-offs between speed and merge quality without touching the model code.
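To make the IoU vs. IoS choice concrete, here is a small worked example in plain Python. It is a sketch under the assumption that IoS means intersection divided by the area of the smaller box, as the `"IOS"` comment in the `get_sliced_prediction` example states. The helper names `area`, `intersection`, `iou`, and `ios` are hypothetical and are not SAHI functions.

```python
# Illustrative sketch only: these helpers are hypothetical, not SAHI functions.
# Boxes use [x1, y1, x2, y2] pixel coordinates.

def area(box):
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a, b):
    """Overlap area of two boxes (zero if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a, b):
    """Intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def ios(a, b):
    """Intersection over the smaller of the two box areas."""
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0


whole_object = [100.0, 100.0, 300.0, 300.0]  # detection covering the full object
fragment = [100.0, 100.0, 300.0, 180.0]      # truncated detection from a tile edge

print(iou(whole_object, fragment))  # 0.4
print(ios(whole_object, fragment))  # 1.0
```

With a match threshold of 0.5, the IoU metric treats these two boxes as different objects, while the IoS metric treats the fragment as a duplicate of the full box. This illustrates why an IoS-based setting tends to remove partial detections produced at slice boundaries.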