Fine Tuning Large Models for Straw Detection in Harvested Fields Under Few-Shot Learning Scenarios

Bibliographic details
Published in: IEEE Geoscience and Remote Sensing Letters, Vol. 22, pp. 1-5
Main authors: Wu, Di; Liu, Xi; Bai, Song; Liu, Caixia
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
ISSN: 1545-598X, 1558-0571
Online access: Full text
Description
Abstract: The effective monitoring of crop residues, particularly straw in non-harvested fields, is essential for sustainable agricultural practices and environmental management. Traditional methods of straw detection often face challenges due to limited training data, high annotation complexity, and the need for accurate feature recognition. In response to these challenges, this study investigates the effectiveness of the segment anything model (SAM)-vision transformer (ViT)-huge-low-rank adaptation (LoRA) method, which leverages few-shot learning techniques to identify straw accurately and efficiently using only 0.65% of the available training data. A comparative analysis was performed under consistent testing conditions against several established algorithms, including Deeplabv3, FCN, PSPNet, TransformerUNet, UNet3+, AFFormer, and DynaMas. The results indicate that the SAM-ViT-huge-LoRA method achieves an F1-score of 83.6%, exceeding the performance of the second-best algorithm by at least 2%. Furthermore, the method achieves an intersection over union (IoU) of 98.02%, surpassing competing models by a minimum of 25%. This research highlights the potential of few-shot learning in scenarios characterized by data scarcity and complex annotation processes. By fine-tuning large models with a small amount of high-quality training data, our approach addresses the challenge of insufficient sample sizes, optimizes the use of limited datasets, reduces annotation costs, and significantly enhances recognition accuracy.
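The core idea behind the fine-tuning scheme named in the abstract, low-rank adaptation (LoRA), can be sketched in a few lines: the pretrained weight matrix is frozen, and only a small pair of low-rank factors is trained, which is why so little labeled data (here 0.65%) can suffice. The sketch below is a minimal NumPy illustration under assumed dimensions and rank; it is not the paper's actual SAM-ViT-huge implementation.

```python
import numpy as np

# Minimal LoRA sketch: effective weight = W + (alpha/r) * B @ A.
# All sizes (d_in, d_out, rank r, scale alpha) are illustrative assumptions,
# not values taken from the paper.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8   # rank r << d_in, d_out

W = rng.standard_normal((d_out, d_in))       # pretrained weight, kept frozen
A = rng.standard_normal((r, d_in)) * 0.01    # trainable low-rank factor
B = np.zeros((d_out, r))                     # zero-init: training starts at W

def lora_forward(x):
    """Frozen path plus scaled low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d_in))
# With B zero-initialized, the adapted layer reproduces the frozen layer.
assert np.allclose(lora_forward(x), x @ W.T)

# Parameter budget: only A and B are trained, a small fraction of W.
full_params = W.size
lora_params = A.size + B.size
print(f"trainable: {lora_params} of {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

Because only `A` and `B` receive gradients, the number of trainable parameters grows linearly in the rank rather than quadratically in the layer width, which is what makes fine-tuning a ViT-huge backbone tractable in a few-shot setting.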
DOI: 10.1109/LGRS.2025.3548106