Regression-based variable clustering for data reduction

Bibliographic details
Published in: Statistics in Medicine, Volume 21, Issue 6, pp. 921-941
Main authors: McClelland, R. L.; Kronmal, R. A.
Format: Journal Article
Language: English
Published: Chichester, UK: John Wiley & Sons, Ltd, 30 March 2002
ISSN: 0277-6715, 1097-0258
Abstract In many studies it is of interest to cluster states, counties or other small regions in order to obtain improved estimates of disease rates or other summary measures, and a more parsimonious representation of the country as a whole. This may be the case if there are too many to summarize concisely, and/or many regions with a small number of cases. By merging the regions into larger geographic areas, we obtain more cases within each area (and hence lower standard errors for parameter estimates), as well as fewer areas to summarize in terms of disease rates. The resulting clusters should be such that regions within the same cluster are similar in terms of their disease rates. In this paper we present a clustering algorithm which uses data at the subject‐specific level in order to cluster the original regions into a reduced set of larger areas. The proposed clustering algorithm expresses the clustering goals in terms of a regression framework. This formulation of the problem allows the regions to be clustered in terms of their association with the response, and confounding variables measured at the subject‐specific level may be easily incorporated during the clustering process. Additionally, this framework allows estimation and testing of the association between the areas and the response. The statistical properties and performance of the algorithm were evaluated via simulation studies, and the results are promising. Additional simulations illustrate the importance of controlling for confounding variables during the clustering process, rather than after the clusters are determined. The algorithm is illustrated with data from the Cardiovascular Health Study. Although developed with a specific application in mind, the method is applicable to a wide range of problems. Copyright © 2002 John Wiley & Sons, Ltd.
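The abstract describes the method only at a high level, so the following is a minimal illustrative sketch of the general idea and not the authors' algorithm. It assumes a continuous response, ordinary least squares, and a simple greedy rule: repeatedly merge the two region groups whose merger increases the confounder-adjusted residual sum of squares the least. All function names, the stopping rule (a fixed number of clusters), and the simulated data are hypothetical.

```python
import itertools
import numpy as np


def rss(y, X):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def design(labels, confounders):
    """One indicator column per current cluster, followed by the confounders."""
    groups = np.unique(labels)
    indicators = (labels[:, None] == groups[None, :]).astype(float)
    return np.hstack([indicators, confounders])


def greedy_region_clustering(y, region, confounders, n_clusters):
    """Merge regions pairwise until n_clusters remain (assumed stopping rule).

    Each step joins the pair of clusters whose merger gives the smallest
    residual sum of squares in a regression that also includes the
    subject-level confounders.
    """
    labels = np.asarray(region).copy()
    while len(np.unique(labels)) > n_clusters:
        best = None
        for a, b in itertools.combinations(np.unique(labels), 2):
            trial = np.where(labels == b, a, labels)  # relabel cluster b as a
            score = rss(y, design(trial, confounders))
            if best is None or score < best[0]:
                best = (score, trial)
        labels = best[1]
    return labels


# Simulated subject-level data (hypothetical): six regions, two true groups,
# and one confounder whose mean differs by region.
rng = np.random.default_rng(0)
region = np.repeat(np.arange(6), 100)
true_effect = np.array([0.0, 0.0, 0.0, 2.0, 2.0, 2.0])
age = rng.normal(size=region.size) + 0.5 * region
y = true_effect[region] + 1.0 * age + rng.normal(scale=0.5, size=region.size)

clusters = greedy_region_clustering(y, region, age[:, None], n_clusters=2)
```

Because the confounder is part of every candidate fit, regions are compared on their adjusted association with the response, which is the point the abstract makes about controlling for confounding during clustering rather than afterwards.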
Authors and affiliations:
McClelland, R. L. (McClelland.Robyn@mayo.edu), Section of Biostatistics, Mayo Clinic, 200 First Street SW, Rochester, Minnesota 55905, U.S.A.
Kronmal, R. A., Department of Biostatistics, University of Washington, Seattle, WA, U.S.A.
ContentType Journal Article
Copyright Copyright © 2002 John Wiley & Sons, Ltd.
2002 INIST-CNRS
DOI 10.1002/sim.1063
Discipline Medicine
Statistics
Public Health
Genre article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S
Journal Article
Funding:
Georgetown Echo: RC-HL 35129; JHU MRI RC-HL 15103
National Heart, Lung and Blood Institute (NHLBI, NIH, HHS): N01-HC-85079, N01-HC-85080, N01-HC-85081, N01-HC-85082, N01-HC-85083, N01-HC-85084, N01-HC-85085, N01-HC-85086
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Cluster analysis
Human
Evaluation
Statistical analysis
Geographic distribution
Regression analysis
Statistics
Algorithm
Epidemiology
Medical application
Data reduction
Language English
License http://onlinelibrary.wiley.com/termsAndConditions#vor
CC BY 4.0
Copyright 2002 John Wiley & Sons, Ltd.
PMID 11870825
PageCount 21
SubjectTerms Aged
Algorithms
Biological and medical sciences
Cardiovascular Diseases - epidemiology
Cluster Analysis
Cognition
Computer Simulation
Computerized, statistical medical data processing and models in biomedicine
data reduction
Humans
Infarction - epidemiology
Magnetic Resonance Imaging
Medical sciences
Medical statistics
Models, Statistical
regression
Regression Analysis
variable clustering