Fine Tuning Large Models for Straw Detection in Harvested Fields Under Few-Shot Learning Scenarios

Bibliographic Details
Published in: IEEE Geoscience and Remote Sensing Letters, Vol. 22, pp. 1-5
Main Authors: Wu, Di; Liu, Xi; Bai, Song; Liu, Caixia
Format: Journal Article
Language:English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
ISSN:1545-598X, 1558-0571
Description
Summary: The effective monitoring of crop residues, particularly straw in non-harvested fields, is essential for sustainable agricultural practices and environmental management. Traditional methods of straw detection often face challenges due to limited training data, high annotation complexity, and the need for accurate feature recognition. In response to these challenges, this study investigates the effectiveness of the segment anything model (SAM)-vision transformer (ViT)-huge-low-rank adaptation (LoRA) method, which leverages few-shot learning techniques to accurately and efficiently identify straw using only 0.65% of the available training data. A comparative analysis was performed under consistent testing conditions against several established algorithms, including DeepLabv3, FCN, PSPNet, TransformerUNet, UNet3+, AFFormer, and DynaMas. The results indicate that the SAM-ViT-huge-LoRA method achieves an F1-score of 83.6%, exceeding the performance of the second best algorithm by at least 2%. Furthermore, the method demonstrates an intersection over union (IoU) metric of 98.02%, surpassing competing models by a minimum of 25%. This research highlights the potential of few-shot learning in scenarios characterized by data scarcity and complex annotation processes. By effectively fine-tuning large models with a small amount of high-quality training data, our approach addresses the challenges of insufficient sample sizes, optimizes the use of limited datasets, reduces annotation costs, and significantly enhances recognition accuracy.
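The summary names low-rank adaptation (LoRA) as the fine-tuning mechanism but does not describe it. As a rough illustration only, the sketch below shows the general LoRA idea on a single toy linear layer: the pretrained weight W stays frozen, and a trainable low-rank product B·A, scaled by alpha/r, is added to its output. The class name `LoRALinear`, the helper `matvec`, and the values of `r`, `alpha`, and the initialization scale are hypothetical choices made for this example. They are not taken from the article, which may configure and apply LoRA differently inside SAM.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Toy linear layer: frozen weight W plus a trainable low-rank update.

    Output is W x + (alpha / r) * B (A x), with A of shape r x d_in
    and B of shape d_out x r. All hyperparameters here are illustrative
    assumptions, not values reported in the article.
    """

    def __init__(self, W, r, alpha, seed=0):
        rng = random.Random(seed)
        self.W = W  # frozen pretrained weight, never updated
        self.d_out, self.d_in = len(W), len(W[0])
        self.r, self.alpha = r, alpha
        # A starts with small random values and B starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(self.d_in)]
                  for _ in range(r)]
        self.B = [[0.0] * r for _ in range(self.d_out)]

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        update = matvec(self.B, matvec(self.A, x))  # low-rank path
        scale = self.alpha / self.r
        return [b + scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A and B would receive gradients during fine-tuning.
        return self.r * (self.d_in + self.d_out)

    def frozen_params(self):
        return self.d_in * self.d_out


if __name__ == "__main__":
    W = [[1.0, 2.0, 3.0], [0.0, -1.0, 0.5]]
    layer = LoRALinear(W, r=1, alpha=2.0)
    print(layer.forward([1.0, 0.5, -2.0]))
    print(layer.trainable_params(), "trainable,", layer.frozen_params(), "frozen")
```

The saving only shows up when r is much smaller than the layer dimensions: for a 1024 x 1024 weight with r = 4, this scheme would train 8,192 parameters while leaving 1,048,576 frozen.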
DOI:10.1109/LGRS.2025.3548106