Machine learning for identification of frailty in Canadian primary care practices

Bibliographic Details
Published in: International Journal of Population Data Science, Vol. 6, no. 1, p. 1650
Main Authors: Aponte-Hao, Sylvia, Wong, Sabrina T., Thandi, Manpreet, Ronksley, Paul, McBrien, Kerry, Lee, Joon, Grandy, Mathew, Mangin, Dee, Katz, Alan, Singer, Alexander, Manca, Donna, Williamson, Tyler
Format: Journal Article
Language: English
Published: Swansea University, Wales, 01.01.2021
ISSN: 2399-4908
Abstract
Introduction: Frailty is a medical syndrome that commonly affects people aged 65 years and over and is characterized by a greater risk of adverse outcomes following illness or injury. Electronic medical records contain a large amount of longitudinal data that can be used for primary care research. Machine learning can make use of this breadth of data to detect diseases and syndromes. A frailty case definition created with machine learning may facilitate early intervention, inform advanced screening tests, and allow for surveillance.
Objectives: The objective of this study was to develop a validated case definition of frailty for the primary care context, using machine learning.
Methods: Physicians participating in the Canadian Primary Care Sentinel Surveillance Network across Canada were asked to retrospectively identify the level of frailty present in a sample of their own patients (total n = 5,466), collected from 2015 to 2019. Frailty levels were dichotomized using a cut-off of 5. Extracted features included previously prescribed medications, billing codes, and other routinely collected primary care data. We used eight supervised machine learning algorithms, with performance assessed using a hold-out test set. A balanced training dataset was also created by oversampling. Sensitivity analyses considered two alternative dichotomization cut-offs. Model performance was evaluated using area under the receiver-operating characteristic curve, F1, accuracy, sensitivity, specificity, negative predictive value and positive predictive value.
Results: The prevalence of frailty within our sample was 18.4%. Of the eight models developed to identify frail patients, an XGBoost model achieved the highest sensitivity (78.14%) and specificity (74.41%). The balanced training dataset did not improve classification performance. Sensitivity analyses did not show improved performance for cut-offs other than 5.
Conclusion: Supervised machine learning was able to create well-performing classification models for frailty. Future research is needed to assess frailty inter-rater reliability and to link multiple data sources for frailty identification.
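The methods summarized in the abstract (dichotomizing frailty levels at a cut-off, balancing the training data by oversampling, and scoring predictions with sensitivity, specificity, predictive values, accuracy and F1) can be illustrated with a minimal Python sketch. It is not the authors' code. Several details are assumptions made only for illustration: that a level at or above the cut-off counts as frail, that oversampling means random duplication of minority-class rows, and all function names. The area under the ROC curve is left out because it needs predicted probabilities rather than hard labels.

```python
import random
from collections import Counter


def dichotomize(levels, cutoff=5):
    """Return 1 (frail) when level >= cutoff, else 0.

    Assumption: the abstract gives the cut-off (5) but not whether the
    boundary value itself counts as frail; >= is chosen here for illustration.
    """
    return [1 if level >= cutoff else 0 for level in levels]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are equal.

    Assumption: plain random oversampling with replacement; the abstract
    does not name the oversampling method that was used.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    shortfall = max(counts.values()) - counts[minority]
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def classification_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for binary labels (1 = frail)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }
```

In a workflow like the one described, oversampling would be applied to the training split only, so that the hold-out test set keeps the original class prevalence (18.4% frail in this sample).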
DOI 10.23889/ijpds.v6i1.1650
Discipline Economics
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords supervised machine learning
Canada
frailty
electronic medical records
case definition
primary care
electronic health records
machine learning
Language English
License http://creativecommons.org/licenses/by/4.0
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Conflict of interest: The authors declare no conflicts of interest.
OpenAccessLink https://doaj.org/article/789a128ed8fb4388b17091a9bd8cb6f4
PMID 34541337
PublicationDate 2021-01-01
PublicationDateYYYYMMDD 2021-01-01
PublicationDate_xml – month: 01
  year: 2021
  text: 2021-01-01
  day: 01
PublicationDecade 2020
PublicationPlace Wales
PublicationPlace_xml – name: Wales
PublicationTitle International journal of population data science
PublicationTitleAlternate Int J Popul Data Sci
PublicationYear 2021
Publisher Swansea University
Publisher_xml – name: Swansea University
SSID ssj0002142761
Score 2.312698
StartPage 1650
SubjectTerms Aged
Canada - epidemiology
Frailty - diagnosis
Humans
Machine Learning
Population Data Science
Primary Health Care
Reproducibility of Results
Retrospective Studies
Title Machine learning for identification of frailty in Canadian primary care practices
URI https://www.ncbi.nlm.nih.gov/pubmed/34541337
https://www.proquest.com/docview/2574742692
https://pubmed.ncbi.nlm.nih.gov/PMC8431345
https://doaj.org/article/789a128ed8fb4388b17091a9bd8cb6f4
Volume 6