Estimation of soil organic carbon content by Vis-NIR spectroscopy combining feature selection algorithm and local regression method

Bibliographic Details
Published in: Revista Brasileira de Ciência do Solo Vol. 47
Main Authors: Baoyang Liu, Baofeng Guo, Renxiong Zhuo, Fan Dai
Format: Journal Article
Language: English
Published: Sociedade Brasileira de Ciência do Solo 01.01.2023
ISSN: 1806-9657
Abstract: Soil organic carbon (SOC) content is a critical parameter for evaluating soil health. However, high redundancy and invalid information in soil hyperspectral data can reduce the accuracy and stability of SOC prediction models. This study developed a global partial least squares regression (PLSR) model and a local PLSR model for agricultural soils in the LUCAS 2015 database. Several variable selection methods were combined with the regression models, and their effects on prediction accuracy were explored. In addition, when a genetic algorithm (GA) was used for spectral feature selection, a novel coding approach yielded a more representative spectral subset. The best SOC estimation accuracy was achieved by the local PLSR combined with the coding-improved GA, with R² of 0.71, RMSEP of 5.7 g kg⁻¹, and RPD of 1.87. Appropriate spectral band selection only slightly enhanced the performance of both global and local regressions, since PLSR models using the full spectrum performed similarly. Local PLSR models consistently outperformed global ones, whether using the full spectrum or variable selection algorithms.
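The abstract reports accuracy as R², RMSEP, and RPD. The record does not give the formulas the authors used, so the following is only a minimal sketch under common chemometrics conventions: RMSEP as the root mean squared prediction error, and RPD as the sample standard deviation of the observed values divided by RMSEP. The function name `prediction_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def prediction_metrics(observed, predicted):
    """Return (R2, RMSEP, RPD) for paired observed and predicted values.

    Assumed conventions (not confirmed by the source record):
      R2    = 1 - SS_res / SS_tot
      RMSEP = sqrt(SS_res / n)
      RPD   = sample SD of observed (n - 1 denominator) / RMSEP
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_obs = sum(observed) / n
    # Residual and total sums of squares around the observed mean.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    r2 = 1.0 - ss_res / ss_tot
    rmsep = math.sqrt(ss_res / n)
    sd_obs = math.sqrt(ss_tot / (n - 1))
    rpd = sd_obs / rmsep
    return r2, rmsep, rpd


# Hypothetical SOC values in g/kg, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2, rmsep, rpd = prediction_metrics(observed, predicted)
print(f"R2={r2:.3f}  RMSEP={rmsep:.3f}  RPD={rpd:.3f}")
```

Under these conventions, and for large samples with negligible bias, RPD is roughly 1/sqrt(1 - R²). The reported R² of 0.71 gives about 1.86, close to the reported RPD of 1.87.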
Author ORCIDs:
  Baoyang Liu: 0000-0002-3462-6840
  Baofeng Guo: 0000-0002-2705-2949
  Renxiong Zhuo: 0000-0003-3010-7098
  Fan Dai: 0000-0001-8211-9790
DOI: 10.36783/18069657rbcs20230067
Open Access Link: https://doaj.org/article/0690fcdef3ec4cc78b913b903e8f6211
Subject Terms: local calibration; LUCAS 2015 database; soil property; variable selection