Regression-based variable clustering for data reduction
In many studies it is of interest to cluster states, counties or other small regions in order to obtain improved estimates of disease rates or other summary measures, and a more parsimonious representation of the country as a whole. This may be the case if there are too many to summarize concisely,...
| Published in: | Statistics in Medicine, Volume 21; Issue 6; pp. 921-941 |
|---|---|
| Main authors: | McClelland, R. L.; Kronmal, R. A. |
| Format: | Journal Article |
| Language: | English |
| Published: | Chichester, UK; John Wiley & Sons, Ltd; 30 March 2002; Wiley |
| Subjects: | |
| ISSN: | 0277-6715, 1097-0258 |
| Abstract | In many studies it is of interest to cluster states, counties or other small regions in order to obtain improved estimates of disease rates or other summary measures, and a more parsimonious representation of the country as a whole. This may be the case if there are too many to summarize concisely, and/or many regions with a small number of cases. By merging the regions into larger geographic areas, we obtain more cases within each area (and hence lower standard errors for parameter estimates), as well as fewer areas to summarize in terms of disease rates. The resulting clusters should be such that regions within the same cluster are similar in terms of their disease rates. In this paper we present a clustering algorithm which uses data at the subject‐specific level in order to cluster the original regions into a reduced set of larger areas. The proposed clustering algorithm expresses the clustering goals in terms of a regression framework. This formulation of the problem allows the regions to be clustered in terms of their association with the response, and confounding variables measured at the subject‐specific level may be easily incorporated during the clustering process. Additionally, this framework allows estimation and testing of the association between the areas and the response. The statistical properties and performance of the algorithm were evaluated via simulation studies, and the results are promising. Additional simulations illustrate the importance of controlling for confounding variables during the clustering process, rather than after the clusters are determined. The algorithm is illustrated with data from the Cardiovascular Health Study. Although developed with a specific application in mind, the method is applicable to a wide range of problems. Copyright © 2002 John Wiley & Sons, Ltd. |
|---|---|
| Author | McClelland, R. L.; Kronmal, R. A. |
| Author_xml | – sequence: 1 givenname: R. L. surname: McClelland fullname: McClelland, R. L. email: McClelland.Robyn@mayo.edu organization: Section of Biostatistics, Mayo Clinic, 200 First Street SW, Rochester, Minnesota 55905, U.S.A – sequence: 2 givenname: R. A. surname: Kronmal fullname: Kronmal, R. A. organization: Department of Biostatistics, University of Washington, Seattle, WA, U.S.A |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=13522275$$DView record in Pascal Francis https://www.ncbi.nlm.nih.gov/pubmed/11870825$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1016_j_socscimed_2004_11_005 crossref_primary_10_1300_J052v22n02_05 crossref_primary_10_1080_00949655_2022_2053855 crossref_primary_10_1111_j_1467_985X_2006_00452_x crossref_primary_10_1016_j_jspi_2004_04_021 |
| Cites_doi | 10.1016/0010-4809(86)90011-X 10.1016/0031-3203(83)90024-9 10.1001/jama.1990.03450130055026 10.2307/2684460 10.1093/oxfordjournals.aje.a112885 10.1007/s10683-007-9168-y 10.2307/2532201 10.1002/sim.4780121916 10.2307/2344237 10.1080/01621459.1971.10482356 10.1007/978-1-4899-3242-6 10.1068/a170397 10.1093/comjnl/28.1.82 10.4135/9781412983648 |
| ContentType | Journal Article |
| Copyright | Copyright © 2002 John Wiley & Sons, Ltd. 2002 INIST-CNRS Copyright 2002 John Wiley & Sons, Ltd. |
| Copyright_xml | – notice: Copyright © 2002 John Wiley & Sons, Ltd. – notice: 2002 INIST-CNRS – notice: Copyright 2002 John Wiley & Sons, Ltd. |
| DBID | BSCLL AAYXX CITATION IQODW CGR CUY CVF ECM EIF NPM 7X8 |
| DOI | 10.1002/sim.1063 |
| DatabaseName | Istex CrossRef Pascal-Francis Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed MEDLINE - Academic |
| DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) MEDLINE - Academic |
| DatabaseTitleList | MEDLINE - Academic MEDLINE CrossRef |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Medicine Statistics Public Health |
| EISSN | 1097-0258 |
| EndPage | 941 |
| ExternalDocumentID | 11870825 13522275 10_1002_sim_1063 SIM1063 ark_67375_WNG_Z8ZZ6FNV_N |
| Genre | article Research Support, Non-U.S. Gov't Research Support, U.S. Gov't, P.H.S Journal Article |
| GrantInformation_xml | – fundername: Georgetown Echo funderid: RC‐HL 35129 JHU MRI RC‐HL 15103 – fundername: National Heart, Lung and Blood Institute funderid: N01‐HC‐85079; N01‐HC‐85086 – fundername: NHLBI NIH HHS grantid: N01-HC-85086 – fundername: NHLBI NIH HHS grantid: N01-HC-85080 – fundername: NHLBI NIH HHS grantid: N01-HC-85082 – fundername: NHLBI NIH HHS grantid: N01-HC-85084 – fundername: NHLBI NIH HHS grantid: N01-HC-85085 – fundername: NHLBI NIH HHS grantid: N01-HC-85083 – fundername: NHLBI NIH HHS grantid: N01-HC-85079 – fundername: NHLBI NIH HHS grantid: N01-HC-85081 |
| ID | FETCH-LOGICAL-c3853-c4a76d04e0973d67dd104e3a4ae486caf29b5a8ab85eb598f9e06118fbe079cc3 |
| IEDL.DBID | DRFUL |
| ISICitedReferencesCount | 6 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000174199200009&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0277-6715 |
| IngestDate | Sun Nov 09 13:39:49 EST 2025 Wed Feb 19 02:32:50 EST 2025 Mon Jul 21 09:11:53 EDT 2025 Tue Nov 18 22:19:10 EST 2025 Sat Nov 29 06:42:36 EST 2025 Wed Jan 22 16:46:21 EST 2025 Sun Sep 21 06:18:22 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Keywords | Cluster analysis Human Evaluation Statistical analysis Geographic distribution Regression analysis Statistics Algorithm Epidemiology Medical application Data reduction |
| Language | English |
| License | http://onlinelibrary.wiley.com/termsAndConditions#vor CC BY 4.0 Copyright 2002 John Wiley & Sons, Ltd. |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c3853-c4a76d04e0973d67dd104e3a4ae486caf29b5a8ab85eb598f9e06118fbe079cc3 |
| Notes | Georgetown Echo - No. RC-HL 35129 JHU MRI RC-HL 15103 istex:B390D1993EB7844F57590899DAD36C1C79276B3F ark:/67375/WNG-Z8ZZ6FNV-N ArticleID:SIM1063 National Heart, Lung and Blood Institute - No. N01-HC-85079; No. N01-HC-85086 ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| PMID | 11870825 |
| PQID | 71493090 |
| PQPubID | 23479 |
| PageCount | 21 |
| ParticipantIDs | proquest_miscellaneous_71493090 pubmed_primary_11870825 pascalfrancis_primary_13522275 crossref_citationtrail_10_1002_sim_1063 crossref_primary_10_1002_sim_1063 wiley_primary_10_1002_sim_1063_SIM1063 istex_primary_ark_67375_WNG_Z8ZZ6FNV_N |
| PublicationCentury | 2000 |
| PublicationDate | 30 March 2002 |
| PublicationDateYYYYMMDD | 2002-03-30 |
| PublicationDate_xml | – month: 03 year: 2002 text: 30 March 2002 day: 30 |
| PublicationDecade | 2000 |
| PublicationPlace | Chichester, UK |
| PublicationPlace_xml | – name: Chichester, UK – name: Elmont, NY – name: Chichester – name: England |
| PublicationTitle | Statistics in medicine |
| PublicationTitleAlternate | Statist. Med |
| PublicationYear | 2002 |
| Publisher | John Wiley & Sons, Ltd Wiley |
| Publisher_xml | – name: John Wiley & Sons, Ltd – name: Wiley |
| SSID | ssj0011527 |
| Score | 1.7089527 |
| SourceID | proquest pubmed pascalfrancis crossref wiley istex |
| SourceType | Aggregation Database Index Database Enrichment Source Publisher |
| StartPage | 921 |
| SubjectTerms | Aged Algorithms Biological and medical sciences Cardiovascular Diseases - epidemiology Cluster Analysis Cognition Computer Simulation Computerized, statistical medical data processing and models in biomedicine data reduction Humans Infarction - epidemiology Magnetic Resonance Imaging Medical sciences Medical statistics Models, Statistical regression Regression Analysis variable clustering |
| Title | Regression-based variable clustering for data reduction |
| URI | https://api.istex.fr/ark:/67375/WNG-Z8ZZ6FNV-N/fulltext.pdf https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fsim.1063 https://www.ncbi.nlm.nih.gov/pubmed/11870825 https://www.proquest.com/docview/71493090 |
| Volume | 21 |
| WOSCitedRecordID | wos000174199200009&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVWIB databaseName: Wiley Online Library Full Collection 2020 customDbUrl: eissn: 1097-0258 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0011527 issn: 0277-6715 databaseCode: DRFUL dateStart: 19960101 isFulltext: true titleUrlDefault: https://onlinelibrary.wiley.com providerName: Wiley-Blackwell |
| linkProvider | Wiley-Blackwell |
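
The abstract notes that merging regions yields more cases per area and hence lower standard errors. A minimal sketch of that arithmetic follows, assuming a simple binomial standard error for an unadjusted rate. The region names and counts are hypothetical, and this is not the authors' regression-based clustering algorithm, which the record does not describe in detail.

```python
import math


def rate_and_se(cases, n):
    """Return the estimated rate and its binomial standard error."""
    p = cases / n
    return p, math.sqrt(p * (1 - p) / n)


# Hypothetical (cases, subjects) counts for three small regions with similar rates.
regions = {"A": (4, 50), "B": (5, 60), "C": (3, 40)}

region_ses = {}
for name, (cases, n) in regions.items():
    p, se = rate_and_se(cases, n)
    region_ses[name] = se
    print(f"region {name}: rate={p:.3f}, se={se:.3f}")

# Merge the three regions into one larger area by pooling cases and subjects.
merged_cases = sum(c for c, _ in regions.values())
merged_n = sum(n for _, n in regions.values())
merged_rate, merged_se = rate_and_se(merged_cases, merged_n)
print(f"merged area: rate={merged_rate:.3f}, se={merged_se:.3f}")
```

Because the pooled area has more subjects than any single region, its standard error is smaller than each regional one, which is the motivation the abstract gives for clustering regions with similar rates.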