Exploration of Principal Component Analysis: Deriving Principal Component Analysis Visually Using Spectra
Spectroscopy rapidly captures a large amount of data that is not directly interpretable. Principal component analysis is widely used to simplify complex spectral datasets into comprehensible information by identifying recurring patterns in the data with minimal loss of information. The linear algebr...
| Published in: | Applied spectroscopy Vol. 75; no. 4; p. 361 |
|---|---|
| Main Authors: | Beattie, J Renwick; Esmonde-White, Francis W L |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 01.04.2021 |
| Subjects: | |
| ISSN: | 1943-3530 |
| Abstract | Spectroscopy rapidly captures a large amount of data that is not directly interpretable. Principal component analysis is widely used to simplify complex spectral datasets into comprehensible information by identifying recurring patterns in the data with minimal loss of information. The linear algebra underpinning principal component analysis is not well understood by many applied analytical scientists and spectroscopists who use principal component analysis. The meaning of features identified through principal component analysis is often unclear. This manuscript traces the journey of the spectra themselves through the operations behind principal component analysis, with each step illustrated by simulated spectra. Principal component analysis relies solely on the information within the spectra; consequently, the mathematical model is dependent on the nature of the data itself. The direct links between model and spectra allow concrete spectroscopic explanation of principal component analysis, such as the scores representing "concentration" or "weights". The principal components (loadings) are by definition hidden, repeated and uncorrelated spectral shapes that linearly combine to generate the observed spectra. They can be visualized as subtraction spectra between extreme differences within the dataset. Each principal component (PC) is shown to be a successive refinement of the estimated spectra, improving the fit between PC-reconstructed data and the original data. Understanding the data-led development of a principal component analysis model shows how to interpret the application-specific chemical meaning of the principal component analysis loadings and how to analyze scores. A critical benefit of principal component analysis is its simplicity and the succinctness of its description of a dataset, making it powerful and flexible. |
|---|---|
| Author | Esmonde-White, Francis W L; Beattie, J Renwick |
| Author_xml | 1. Beattie, J Renwick (ORCID: 0000-0002-0205-717X), J Renwick Beattie Consulting, Ballycastle, UK; 2. Esmonde-White, Francis W L, Esmonde-White Technologies, Ann Arbor, MI, USA |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/33393349 (MEDLINE/PubMed record) |
| ContentType | Journal Article |
| DBID | NPM 7X8 |
| DOI | 10.1177/0003702820987847 |
| DatabaseName | PubMed MEDLINE - Academic |
| DatabaseTitle | PubMed MEDLINE - Academic |
| DatabaseTitleList | PubMed MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | no_fulltext_linktorsrc |
| Discipline | Chemistry |
| EISSN | 1943-3530 |
| ExternalDocumentID | 33393349 |
| Genre | Journal Article |
| IEDL.DBID | 7X8 |
| ISICitedReferencesCount | 200 |
| ISSN | 1943-3530 |
| IngestDate | Thu Oct 02 10:26:49 EDT 2025 Thu Apr 03 07:07:50 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Keywords | data reduction; spectroscopy; multivariate analysis; PCA; Principal component analysis |
| Language | English |
| LinkModel | DirectLink |
| ORCID | 0000-0002-0205-717X |
| PMID | 33393349 |
| PQID | 2475089990 |
| PQPubID | 23479 |
| ParticipantIDs | proquest_miscellaneous_2475089990 pubmed_primary_33393349 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-04-01 |
| PublicationDecade | 2020 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Applied spectroscopy |
| PublicationTitleAlternate | Appl Spectrosc |
| PublicationYear | 2021 |
| SSID | ssj0005935 |
| Score | 2.656555 |
| SecondaryResourceType | review_article |
| SourceID | proquest pubmed |
| SourceType | Aggregation Database Index Database |
| StartPage | 361 |
| Title | Exploration of Principal Component Analysis: Deriving Principal Component Analysis Visually Using Spectra |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/33393349 https://www.proquest.com/docview/2475089990 |
| Volume | 75 |
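
The abstract in this record describes scores as weights, loadings as uncorrelated spectral shapes, and each added PC as a refinement of the reconstructed spectra. The following is a minimal sketch of those ideas in Python with NumPy, not the authors' procedure: the simulated two-band mixture, the noise level, and all function names (`gaussian_band`, `simulate_spectra`, `pca`, `reconstruction_error`) are assumptions made for illustration, and PCA is computed here through a singular value decomposition of the mean-centered data, which is one common route.

```python
import numpy as np


def gaussian_band(x, center, width):
    """Hypothetical pure-component band shape (a Gaussian peak)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def simulate_spectra(n_spectra=50, n_points=200, noise=0.01, seed=0):
    """Simulate spectra as linear mixtures of two bands plus random noise."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    pure = np.vstack([gaussian_band(x, 0.3, 0.05), gaussian_band(x, 0.6, 0.08)])
    conc = rng.uniform(0.0, 1.0, size=(n_spectra, 2))  # assumed "concentrations"
    return conc @ pure + noise * rng.standard_normal((n_spectra, n_points))


def pca(spectra):
    """Return the mean spectrum, the scores, and the loadings (one PC per row)."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s   # weight of each loading in each spectrum
    loadings = vt    # orthonormal spectral shapes
    return mean, scores, loadings


def reconstruction_error(spectra, mean, scores, loadings, k):
    """Root-mean-square misfit when only the first k PCs are used."""
    approx = mean + scores[:, :k] @ loadings[:k]
    return float(np.sqrt(np.mean((spectra - approx) ** 2)))


spectra = simulate_spectra()
mean, scores, loadings = pca(spectra)
# Misfit using 0, 1, 2 and 3 PCs: each added PC refines the reconstruction.
errors = [reconstruction_error(spectra, mean, scores, loadings, k) for k in range(4)]
print(errors)
```

Under these assumptions, two PCs should bring the misfit down to roughly the simulated noise level, because only two band shapes vary independently in the mixture; a third PC mostly fits noise.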