A dual-task segmentation network based on multi-head hierarchical attention for 3D plant point cloud.

Detailed bibliography
Title: A dual-task segmentation network based on multi-head hierarchical attention for 3D plant point cloud.
Authors: Pan, Dan; Liu, Baijing; Luo, Lin; Zeng, An; Zhou, Yuting; Pan, Kaixin; Xian, Zhiheng; Xian, Yulun; Liu, Licheng
Source: Frontiers in Plant Science; 2025, p1-16, 16p
Subjects: POINT cloud, IMAGE segmentation, MORPHOLOGY, MACHINE learning, PLANT phenology
Abstract: Introduction: The development of automated high-throughput plant phenotyping systems with non-destructive characteristics fundamentally relies on achieving accurate segmentation of botanical structures at both semantic and instance levels. However, most existing approaches rely heavily on empirically determined threshold parameters and rarely integrate semantic and instance segmentation within a unified framework. Methods: To address these limitations, this study introduces a methodology leveraging 2D image data of real plants, i.e., Caladium bicolor, captured using a custom-designed plant cultivation platform. A high-quality 3D point cloud dataset was generated through reconstruction. Building on this foundation, we propose a streamlined Dual-Task Segmentation Network (DSN) incorporating a multi-head hierarchical attention mechanism to achieve superior segmentation performance. Also, the dual-task framework employs Multi-Value Conditional Random Field (MV-CRF) to enable semantic segmentation of stem-leaf and individual leaf identification through the DSN architecture when processing manually annotated 3D point cloud data. The network features a dual-branch architecture: one branch predicts the semantic class of each point, while the other embeds points into a high-dimensional vector space for instance clustering. Multi-task joint optimization is facilitated through the MV-CRF model. Results and discussion: Benchmark evaluations validate the novel framework's segmentation efficacy, yielding 99.16% macro-averaged precision, 95.73% class-wise recognition rate, and an average Intersection over Union of 93.64%, while comparative analyses confirm its superiority over nine benchmark architectures in 3D point cloud analytics. For instance segmentation, the model achieved leading metrics of 87.94%, 72.36%, and 71.61%, respectively. Furthermore, ablation studies validated the effectiveness of the network's design and substantiated the rationale behind each architectural choice. [ABSTRACT FROM AUTHOR]
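Note: The three semantic-segmentation figures quoted in the abstract (macro-averaged precision, class-wise recognition rate, average Intersection over Union) can be illustrated with a minimal sketch. It assumes that "class-wise recognition rate" means macro-averaged recall, which the abstract does not state. The function names, the 0 = stem / 1 = leaf labelling, and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count points per (true class, predicted class) pair.

    Rows index the true class, columns the predicted class.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Unweighted per-class averages of precision, recall, and IoU."""
    n = len(cm)
    precisions, recalls, ious = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(cm[c]) - tp                       # truly c, predicted another class
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "miou": sum(ious) / n,
    }


# Hypothetical per-point labels: 0 = stem, 1 = leaf (assumed encoding).
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 0]
print(macro_metrics(confusion_matrix(y_true, y_pred, num_classes=2)))
```

Macro averaging weights every class equally, so a small class such as the stem influences the score as much as the far larger leaf class.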
Copyright of Frontiers in Plant Science is the property of Frontiers Media S.A.
Database: Complementary Index
ISSN: 1664-462X
DOI: 10.3389/fpls.2025.1610443