Image instance segmentation is essential for plant phenotyping in vertical farms, yet the diversity of plant types and limited annotated image data constrain the performance of traditional supervised techniques. These challenges necessitate a zero-shot approach to enable segmentation without relying on specific training data for each plant type.
Researchers present a zero-shot instance segmentation framework combining Grounding DINO and the Segment Anything Model (SAM). To enhance box prompts, Vegetation Cover Aware Non-Maximum Suppression (VC-NMS) incorporating the Normalized Cover Green Index (NCGI) is used to refine object localization by leveraging vegetation spectral features. For point prompts, similarity maps with a max distance criterion are integrated to improve spatial coherence in sparse annotations, addressing the ambiguity of generic point prompts in agricultural contexts.
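The idea of making non-maximum suppression aware of vegetation cover can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the NCGI formula or the exact VC-NMS rule, so the code below substitutes the common excess-green index (2g - r - b on normalized channels) as a stand-in. It also assumes a simple scheme in which boxes with little vegetation are discarded and the remainder are ranked by detector score times cover fraction. The function names `vegetation_cover` and `cover_aware_nms` and all thresholds are hypothetical.

```python
import numpy as np


def vegetation_cover(image, box, threshold=0.05):
    """Fraction of pixels in `box` classed as vegetation.

    Uses excess green (2g - r - b) on chromaticity-normalized RGB as a
    stand-in index; this is an assumption, NOT the paper's NCGI.
    """
    x1, y1, x2, y2 = (int(v) for v in box)
    patch = image[y1:y2, x1:x2].astype(np.float64)
    if patch.size == 0:
        return 0.0
    total = patch.sum(axis=2) + 1e-9  # avoid division by zero on black pixels
    r, g, b = (patch[..., i] / total for i in range(3))
    exg = 2 * g - r - b
    return float((exg > threshold).mean())


def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in x1, y1, x2, y2 form."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)


def cover_aware_nms(image, boxes, scores, iou_thr=0.5, min_cover=0.2):
    """Greedy NMS that drops low-vegetation boxes and ranks by score * cover.

    The ranking rule and both thresholds are illustrative assumptions.
    Returns the indices of the kept boxes.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    cover = np.array([vegetation_cover(image, b) for b in boxes])
    candidates = np.flatnonzero(cover >= min_cover)
    rank = scores * cover
    order = candidates[np.argsort(-rank[candidates])]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thr]
    return keep
```

In this sketch, a large box that spans mostly tray or background is removed before suppression even if the detector scored it highly, which is one plausible way a vegetation index could refine box prompts before they are passed to SAM.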
Experimental validation on two test datasets shows that the enhanced box and point prompts outperform SAM's everything mode and Grounded SAM in zero-shot segmentation tasks. Compared to the supervised method YOLOv11, the framework demonstrates superior zero-shot generalization, achieving the best segmentation performance on both datasets without target-specific annotations.
This study addresses the critical issue of scarce annotated data in vertical farming by developing a zero-shot segmentation framework. The integration of a domain-specific index (NCGI) and prompt optimization techniques provides an effective solution for plant phenotyping, highlighting the potential of foundation models in agricultural computer vision where extensive manual annotation is impractical.
Bao, Q., Yang, Y., Li, Q., & Yang, H. (2025). Zero-shot instance segmentation for plant phenotyping in vertical farming with foundation models and VC-NMS. Frontiers in Plant Science, 16, 1536226. https://doi.org/10.3389/fpls.2025.1536226
Source: Frontiers In