Similarity measure method based on spectra subspace and locally linear embedding algorithm

Bibliographic details
Published in: Infrared physics & technology, Vol. 100, pp. 57-61
Main authors: Qin, Yuhua; Duan, Kai; Wu, Lijun; Xu, Baoding
Format: Journal Article
Language: English
Published: Elsevier B.V, 1 August 2019
ISSN:1350-4495, 1879-0275
Online access: Full text
Description
Summary: The high dimensionality, redundancy, noise and nonlinearity of near-infrared (NIR) spectral data make similarity measurement difficult. This paper presents SSLLE, a similarity measure method based on spectral subspaces and the locally linear embedding (LLE) algorithm. First, the high-dimensional spectral data are divided into several subspaces according to the absorption bands of the major chemical components, which avoids the influence of irrelevant features and noise and reduces the dimensionality and computational complexity of LLE. Next, the LLE algorithm is modified to use geodesic distance instead of Euclidean distance, which addresses the poor behavior of Euclidean distance as a measure in high-dimensional space. The distance calculation in LLE is also modified so that the samples are more evenly distributed. For each spectral subspace, a distance matrix is calculated from the embedding obtained by mapping the high-dimensional data with the modified LLE. The spectral similarity matrix of the sample set is then obtained by summing the distance matrices of all subspaces, so that the sample with the highest similarity can be found. To evaluate the algorithm, the spectral projection of the samples was analyzed first; the results showed that SSLLE distinguished tobacco samples from different areas significantly better than principal component analysis (PCA) and LLE. Second, the results of searching for the sample most spectrally similar to a target tobacco were compared; SSLLE gave the smallest differences in chemical composition and the highest consistency with expert recommendations compared with PCA and LLE. It also showed good robustness and precision.
DOI:10.1016/j.infrared.2019.05.006
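
The subspace-and-sum pipeline described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses the standard LLE from scikit-learn, so the paper's geodesic-distance and distance-calculation modifications are omitted. The function names, the subspace boundaries, the neighbor count and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.manifold import LocallyLinearEmbedding


def sslle_distance_matrix(spectra, subspaces, n_neighbors=10, n_components=2):
    """Sum per-subspace embedding distances into one dissimilarity matrix.

    spectra   : array of shape (n_samples, n_wavelengths)
    subspaces : list of (start, stop) column ranges, one per absorption band
                (hypothetical boundaries; the abstract does not list them)
    """
    n_samples = spectra.shape[0]
    total = np.zeros((n_samples, n_samples))
    for start, stop in subspaces:
        block = spectra[:, start:stop]
        # Standard LLE only; the geodesic-distance variant from the paper
        # is NOT implemented here.
        lle = LocallyLinearEmbedding(
            n_neighbors=n_neighbors,
            n_components=n_components,
            random_state=0,
        )
        embedding = lle.fit_transform(block)
        # Pairwise Euclidean distances between samples in the embedding.
        total += cdist(embedding, embedding)
    return total


def most_similar(distances, target):
    """Return the index of the sample closest to `target`, excluding itself."""
    row = distances[target].copy()
    row[target] = np.inf
    return int(np.argmin(row))


# Synthetic stand-in for NIR spectra: 40 samples, 60 wavelength points.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(40, 60))
subspaces = [(0, 20), (20, 40), (40, 60)]

distances = sslle_distance_matrix(spectra, subspaces)
print("Most similar sample to sample 0:", most_similar(distances, 0))
```

Because the per-subspace matrices hold distances, the smallest summed value corresponds to the highest similarity, which is why the lookup uses the minimum of the target's row.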