Exploration of Principal Component Analysis: Deriving Principal Component Analysis Visually Using Spectra

Bibliographic Details
Published in: Applied Spectroscopy, Vol. 75, no. 4, p. 361
Main Authors: Beattie, J Renwick; Esmonde-White, Francis W L
Format: Journal Article
Language: English
Published: United States, 01.04.2021
ISSN: 1943-3530
Abstract: Spectroscopy rapidly captures a large amount of data that is not directly interpretable. Principal component analysis is widely used to simplify complex spectral datasets into comprehensible information by identifying recurring patterns in the data with minimal loss of information. The linear algebra underpinning principal component analysis is not well understood by many applied analytical scientists and spectroscopists who use principal component analysis. The meaning of features identified through principal component analysis is often unclear. This manuscript traces the journey of the spectra themselves through the operations behind principal component analysis, with each step illustrated by simulated spectra. Principal component analysis relies solely on the information within the spectra; consequently, the mathematical model depends on the nature of the data itself. The direct links between model and spectra allow concrete spectroscopic explanation of principal component analysis, such as the scores representing "concentration" or "weights". The principal components (loadings) are by definition hidden, repeated, and uncorrelated spectral shapes that linearly combine to generate the observed spectra. They can be visualized as subtraction spectra between extreme differences within the dataset. Each PC is shown to be a successive refinement of the estimated spectra, improving the fit between PC-reconstructed data and the original data. Understanding the data-led development of a principal component analysis model shows how to interpret the application-specific chemical meaning of the principal component analysis loadings and how to analyze scores. A critical benefit of principal component analysis is its simplicity and the succinctness of its description of a dataset, making it powerful and flexible.
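The ideas in the abstract (scores as weights, loadings as uncorrelated spectral shapes, and each PC refining the reconstruction) can be illustrated with a minimal sketch. It is not the authors' procedure: the two Gaussian bands, the noise level, the axis, and the choice of mean-centering followed by a singular value decomposition are all assumptions made here for illustration, and the names `band_a`, `band_b`, and `reconstruct` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical spectral axis and two simulated Gaussian bands (illustrative only).
x = np.linspace(0.0, 100.0, 200)
band_a = np.exp(-0.5 * ((x - 35.0) / 4.0) ** 2)
band_b = np.exp(-0.5 * ((x - 65.0) / 6.0) ** 2)

# Simulated dataset: each spectrum is a linear mix of the two bands plus noise.
n_spectra = 30
conc = rng.uniform(0.0, 1.0, size=(n_spectra, 2))
spectra = conc @ np.vstack([band_a, band_b])
spectra += rng.normal(scale=0.01, size=spectra.shape)

# Assumed preprocessing: subtract the mean spectrum, then decompose with SVD.
# Rows of vt are the loadings (spectral shapes); u * s are the scores (weights).
mean_spectrum = spectra.mean(axis=0)
centered = spectra - mean_spectrum
u, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = u * s
loadings = vt


def reconstruct(n_pcs):
    """Rebuild the spectra from the mean plus the first n_pcs components."""
    return mean_spectrum + scores[:, :n_pcs] @ loadings[:n_pcs]


# Residual norm for 0, 1, 2, and 3 PCs: each added PC can only improve the fit.
residuals = [float(np.linalg.norm(spectra - reconstruct(k))) for k in range(4)]
for k, r in enumerate(residuals):
    print(f"{k} PCs: residual norm = {r:.4f}")
```

Because the simulated data contain only two independent sources of variation, most of the residual should disappear after two PCs, with later PCs fitting mainly noise. Plotting `loadings[0]` against `x` would show positive and negative lobes, which is one way to see a loading as a difference between spectra rather than a pure component spectrum.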
Author Details:
Beattie, J Renwick (ORCID: 0000-0002-0205-717X), J Renwick Beattie Consulting, Ballycastle, UK
Esmonde-White, Francis W L, Esmonde-White Technologies, Ann Arbor, MI, USA
PubMed Record: https://www.ncbi.nlm.nih.gov/pubmed/33393349
DOI: 10.1177/0003702820987847
Discipline: Chemistry
EISSN: 1943-3530
Peer Reviewed: Yes
Keywords: data reduction; spectroscopy; multivariate analysis; PCA; principal component analysis
PMID: 33393349