Performing Object Detection on Drone Orthomosaics with Meta's Segment Anything Model (SAM)

Our task required feature extraction powerful enough to accurately localize ~19,000 plant pots, each 64 pixels tall and wide, in an image of more than 1 billion pixels.

Abstract

Accurate and efficient object detection and spatial localization in remote sensing imagery remain a persistent challenge. In the context of precision agriculture, the extensive data annotation required by conventional deep learning models adds a further burden. This paper presents a fully open source workflow leveraging Meta AI’s Segment Anything Model (SAM) for zero-shot segmentation, enabling scalable object detection and spatial localization in high-resolution drone orthomosaics without annotated image datasets. No model training or fine-tuning is needed in our precision agriculture use case. The end-to-end workflow takes high-resolution images and quality control (QC) check points as inputs, automatically generates masks corresponding to the objects of interest (empty plant pots, in our case), and outputs their spatial locations in real-world coordinates. Detection accuracy (required here to be within 3 cm) is then quantitatively evaluated against the ground-truth QC check points and benchmarked against object detection output from commercially available software. Results demonstrate that the open source workflow achieves superior spatial accuracy, producing output that is 20% more spatially accurate with 400% greater IoU, while providing a scalable way to perform spatial localization on high-resolution aerial imagery (ground sampling distance, or GSD, under 30 cm).
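To make the workflow concrete, the sketch below shows how the core steps might look in Python: reading the orthomosaic in tiles with rasterio (SAM cannot ingest a billion-pixel image in one pass), generating masks with segment-anything's SamAutomaticMaskGenerator, filtering to roughly pot-sized masks, and converting each mask centroid to real-world coordinates via the affine transform of its tile window. This is a minimal sketch under stated assumptions: the file paths, checkpoint name, tile size, and pixel-area filter are illustrative values, not the ones used in the paper.

```python
# Sketch of the detection/localization loop (paths, tile size, checkpoint,
# and the ~64 x 64 px size filter are illustrative assumptions).
import numpy as np
import rasterio
from rasterio.windows import Window
from rasterio.transform import xy
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

TILE = 1024                        # tile edge length in pixels (assumed)
MIN_AREA, MAX_AREA = 2000, 6000    # plausible mask area range for ~64 px pots (assumed)

# Load SAM once; "vit_h" is the publicly released checkpoint, path is assumed.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

detections = []  # (x, y) pot centroids in the orthomosaic's CRS

with rasterio.open("orthomosaic.tif") as src:
    for row_off in range(0, src.height, TILE):
        for col_off in range(0, src.width, TILE):
            window = Window(col_off, row_off,
                            min(TILE, src.width - col_off),
                            min(TILE, src.height - row_off))
            # Read the RGB bands and reorder to HxWxC uint8, as SAM expects
            # (assumes an 8-bit, 3-band orthomosaic).
            tile = src.read([1, 2, 3], window=window)
            image = np.transpose(tile, (1, 2, 0)).astype(np.uint8)

            transform = src.window_transform(window)
            for m in mask_generator.generate(image):
                if not (MIN_AREA <= m["area"] <= MAX_AREA):
                    continue  # discard masks that are not plausibly a pot
                rows, cols = np.nonzero(m["segmentation"])
                # Convert the mask's pixel centroid to real-world coordinates.
                x, y = xy(transform, rows.mean(), cols.mean())
                detections.append((x, y))

print(f"{len(detections)} candidate pot locations")
```

For brevity the tiles above do not overlap; in practice some overlap and deduplication of centroids near tile seams would be needed so that pots straddling a boundary are neither missed nor counted twice.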

Results

Workflow      Precision   Recall    F1 Score   Mean Deviation (cm)   IoU
Proprietary   0.9990      1.0000    0.9995     1.39                  0.18
Open Source   0.9990      0.9956    0.9973     1.20                  0.74
Improvement   --          -0.0044   -0.0022    20%                   400%
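The point-based figures above could be reproduced with an evaluation along these lines: each detected location is matched to the nearest ground-truth QC check point within the 3 cm tolerance, and precision, recall, F1, and mean deviation follow from the matches. The greedy nearest-neighbour matching rule, the assumption of metre-based CRS units, and the function names below are illustrative assumptions; the paper's exact matching procedure, and its IoU computation (which additionally requires reference pot geometries), may differ.

```python
# Sketch of the point-based evaluation (matching rule and names are assumed).
import numpy as np
from scipy.spatial import cKDTree

TOL = 0.03  # 3 cm tolerance from the abstract, assuming CRS units of metres

def evaluate(detections, qc_points, tol=TOL):
    """Greedy one-to-one matching of detected locations to QC check points."""
    det = np.asarray(detections, dtype=float)
    gt = np.asarray(qc_points, dtype=float)
    tree = cKDTree(gt)
    # Unmatched queries come back with dist = inf and index = len(gt).
    dists, idx = tree.query(det, distance_upper_bound=tol)

    matched_gt = set()
    deviations = []
    for d, i in zip(dists, idx):
        if np.isfinite(d) and i not in matched_gt:
            matched_gt.add(i)
            deviations.append(d)

    tp = len(matched_gt)
    fp = len(det) - tp   # detections with no QC point within tolerance
    fn = len(gt) - tp    # QC points that were never detected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    mean_dev_cm = 100 * float(np.mean(deviations)) if deviations else float("nan")
    return precision, recall, f1, mean_dev_cm
```

Under this scheme, tp / (tp + fp) and tp / (tp + fn) correspond to the Precision and Recall columns, and the mean matched distance (reported in cm) to the Mean Deviation column.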