PROJECTED PRINCIPAL COMPONENT ANALYSIS IN FACTOR MODELS

This paper introduces Projected Principal Component Analysis (Projected-PCA), which applies principal component analysis to the data matrix projected (smoothed) onto a given linear space spanned by covariates. When applied to high-dimensional factor analysis, the projection removes noise comp...

Bibliographic details
Published in: The Annals of Statistics, Volume 44, Issue 1, p. 219
Main authors: Fan, Jianqing; Liao, Yuan; Wang, Weichen
Format: Journal Article
Language: English
Published: United States, 1 February 2016
ISSN: 0090-5364
Abstract This paper introduces Projected Principal Component Analysis (Projected-PCA), which applies principal component analysis to the data matrix after it has been projected (smoothed) onto a given linear space spanned by covariates. When applied to high-dimensional factor analysis, the projection removes noise components. We show that the unobserved latent factors can be estimated more accurately than by conventional PCA if the projection is genuine, or more precisely, when the factor loading matrices are related to the projected linear space. When the dimensionality is large, the factors can be estimated accurately even when the sample size is finite. We propose a flexible semi-parametric factor model, which decomposes the factor loading matrix into a component that can be explained by subject-specific covariates and an orthogonal residual component. The covariates' effects on the factor loadings are further modeled by an additive model via sieve approximations. Using the newly proposed Projected-PCA, we obtain rates of convergence for the smooth factor loading matrices that are much faster than those of conventional factor analysis. The convergence is achieved even when the sample size is finite and is particularly appealing in the high-dimension-low-sample-size situation. This leads us to develop nonparametric tests of whether observed covariates have explanatory power on the loadings and whether they fully explain the loadings. The proposed method is illustrated with both simulated data and the returns of the components of the S&P 500 index.
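The procedure described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names `polynomial_sieve` and `projected_pca`, the choice of an additive polynomial basis as the sieve, and the normalization of the estimated factors (F'F/T = I) are assumptions made here. It assumes a p x T data matrix Y (p subjects, T observations) and a p x d matrix X of subject-specific covariates.

```python
import numpy as np


def polynomial_sieve(X, degree=3):
    """Additive polynomial basis (assumed for illustration):
    an intercept plus powers 1..degree of each covariate column."""
    p = X.shape[0]
    cols = [np.ones((p, 1))]
    for j in range(1, degree + 1):
        cols.append(X ** j)
    return np.hstack(cols)


def projected_pca(Y, X, n_factors, degree=3):
    """Sketch of PCA applied to data projected onto the sieve space.

    Y: (p, T) data matrix; X: (p, d) covariates.
    Returns F_hat (T, K) and G_hat (p, K), the estimated factors and
    the covariate-explained part of the loadings.
    """
    p, T = Y.shape
    Phi = polynomial_sieve(X, degree)

    # Projection P @ Y, computed by least squares instead of forming
    # P = Phi (Phi' Phi)^{-1} Phi' explicitly.
    coef, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    Y_proj = Phi @ coef  # (p, T) smoothed data

    # Because P is symmetric and idempotent, Y' P Y = (P Y)' (P Y).
    M = Y_proj.T @ Y_proj  # (T, T)

    # eigh returns eigenvalues in ascending order, so reverse the
    # columns and keep the leading n_factors eigenvectors.
    _, eigvecs = np.linalg.eigh(M)
    top = eigvecs[:, ::-1][:, :n_factors]

    F_hat = np.sqrt(T) * top      # normalized so F_hat' F_hat / T = I
    G_hat = Y_proj @ F_hat / T    # loadings explained by the covariates
    return F_hat, G_hat
```

Working with the T x T matrix keeps the eigenproblem small when p is much larger than T, which is the high-dimension-low-sample-size setting the abstract emphasizes. As with any PCA-based factor estimate, the factors and loadings are recovered only up to a rotation.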
Author Wang, Weichen
Fan, Jianqing
Liao, Yuan
Author_xml – sequence: 1
  givenname: Jianqing
  surname: Fan
  fullname: Fan, Jianqing
  organization: Princeton University
– sequence: 2
  givenname: Yuan
  surname: Liao
  fullname: Liao, Yuan
  organization: University of Maryland
– sequence: 3
  givenname: Weichen
  surname: Wang
  fullname: Wang, Weichen
  organization: Princeton University
BackLink https://www.ncbi.nlm.nih.gov/pubmed/26783374$$D View this record in MEDLINE/PubMed
ContentType Journal Article
DOI 10.1214/15-AOS1364
Discipline Statistics
Mathematics
ExternalDocumentID 26783374
Genre Journal Article
GrantInformation_xml – fundername: NIGMS NIH HHS
  grantid: R01 GM072611
ISICitedReferencesCount 107
ISSN 0090-5364
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords loading matrix modeling
sieve approximation
high dimensionality
rates of convergence
semi-parametric factor models
Language English
PMID 26783374
PublicationCentury 2000
PublicationDate 2016-02-01
PublicationDateYYYYMMDD 2016-02-01
PublicationDate_xml – month: 02
  year: 2016
  text: 2016-02-01
  day: 01
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle The Annals of statistics
PublicationTitleAlternate Ann Stat
PublicationYear 2016
StartPage 219
Title PROJECTED PRINCIPAL COMPONENT ANALYSIS IN FACTOR MODELS
URI https://www.ncbi.nlm.nih.gov/pubmed/26783374
https://www.proquest.com/docview/1826655013
Volume 44