Joint Network Reconstruction and Community Detection from Rich but Noisy Data

Bibliographic details
Published in: Journal of computational and graphical statistics, Volume 33, Issue 2, pp. 501-514
Main authors: Hu, Jie; Chen, Xiao; Chen, Yu; Zhang, Weiping
Format: Journal Article
Language: English
Published: Alexandria: Taylor & Francis, 2 April 2024
Taylor & Francis Ltd
ISSN: 1061-8600, 1537-2715
Abstract Most empirical studies of complex networks return rich but noisy data, as they measure the network structure repeatedly but with substantial errors due to indirect measurements. In this article, we propose a novel framework, called the group-based binary mixture (GBM) modeling approach, to simultaneously conduct network reconstruction and community detection from such rich but noisy data. A generalized expectation-maximization (EM) algorithm is developed for computing the maximum likelihood estimates, and an information criterion is introduced to consistently select the number of communities. The strong consistency properties of the network reconstruction and community detection are established under some assumption on the Kullback-Leibler (KL) divergence, and in particular, we do not impose assumptions on the true network structure. It is shown that joint reconstruction with community detection has a synergistic effect, whereby actually detecting communities can improve the accuracy of the reconstruction. Finally, we illustrate the performance of the approach with numerical simulations and two real examples. Supplementary materials for this article are available online.
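The abstract mentions an EM algorithm for reconstructing a network from repeated, error-prone measurements. The following is a minimal sketch of that general idea only, assuming a plain two-component binomial mixture with no community structure: every node pair is a true edge with prior probability `rho`, a true edge is reported in each measurement with probability `alpha`, and a non-edge is falsely reported with probability `beta`. It is not the GBM model or the generalized EM algorithm of the article, whose details are not given in this record. The function name `em_reconstruct`, its parameters, and the starting values are hypothetical.

```python
import math


def _clip(x, eps=1e-9):
    """Keep a probability strictly inside (0, 1) so the logs stay finite."""
    return min(max(x, eps), 1.0 - eps)


def em_reconstruct(counts, n_meas, n_iter=500, tol=1e-10):
    """EM for a two-component binomial mixture over node pairs (illustrative).

    counts[k] is how many of the n_meas noisy measurements reported an edge
    for node pair k (counts must be non-empty). Returns (alpha, beta, rho, q),
    where q[k] is the posterior probability that pair k is a true edge.
    """
    # Arbitrary starting point with alpha > beta (an assumption of this sketch).
    alpha, beta, rho = 0.7, 0.2, 0.5
    q = [rho] * len(counts)
    for _ in range(n_iter):
        # E-step: posterior edge probability per pair, computed in log space.
        q = []
        for e in counts:
            log_edge = (math.log(rho) + e * math.log(alpha)
                        + (n_meas - e) * math.log(1.0 - alpha))
            log_none = (math.log(1.0 - rho) + e * math.log(beta)
                        + (n_meas - e) * math.log(1.0 - beta))
            top = max(log_edge, log_none)
            w_edge = math.exp(log_edge - top)
            w_none = math.exp(log_none - top)
            q.append(w_edge / (w_edge + w_none))

        # M-step: closed-form weighted updates, guarded against division by zero.
        sum_q = max(sum(q), 1e-12)
        sum_not_q = max(len(q) - sum(q), 1e-12)
        hits_edge = sum(qk * e for qk, e in zip(q, counts))
        hits_none = sum((1.0 - qk) * e for qk, e in zip(q, counts))
        new_alpha = _clip(hits_edge / (n_meas * sum_q))
        new_beta = _clip(hits_none / (n_meas * sum_not_q))
        new_rho = _clip(sum_q / len(q))

        change = max(abs(new_alpha - alpha), abs(new_beta - beta),
                     abs(new_rho - rho))
        alpha, beta, rho = new_alpha, new_beta, new_rho
        if change < tol:
            break
    return alpha, beta, rho, q


if __name__ == "__main__":
    # Ten node pairs, each measured ten times (made-up counts for illustration).
    counts = [9, 8, 10, 0, 1, 0, 2, 1, 0, 7]
    alpha, beta, rho, q = em_reconstruct(counts, n_meas=10)
    print(alpha, beta, rho)
    print([round(p, 3) for p in q])
```

Thresholding `q` at 0.5 gives a point estimate of the reconstructed network. A group-based model such as the one described in the abstract would presumably let these parameters depend on community labels, which this sketch does not attempt.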
Author Chen, Xiao
Chen, Yu
Zhang, Weiping
Hu, Jie
Author_xml – sequence: 1
  givenname: Jie
  surname: Hu
  fullname: Hu, Jie
  organization: International Institute of Finance, School of Management, University of Science and Technology of China
– sequence: 2
  givenname: Xiao
  surname: Chen
  fullname: Chen, Xiao
  organization: International Institute of Finance, School of Management, University of Science and Technology of China
– sequence: 3
  givenname: Yu
  surname: Chen
  fullname: Chen, Yu
  organization: International Institute of Finance, School of Management, University of Science and Technology of China
– sequence: 4
  givenname: Weiping
  surname: Zhang
  fullname: Zhang, Weiping
  organization: International Institute of Finance, School of Management, University of Science and Technology of China
CitedBy_id crossref_primary_10_1111_anzs_70026
ContentType Journal Article
Copyright 2023 American Statistical Association and Institute of Mathematical Statistics
DOI 10.1080/10618600.2023.2267630
Discipline Statistics
Mathematics
EISSN 1537-2715
EndPage 514
ExternalDocumentID 10_1080_10618600_2023_2267630
2267630
Genre Research Article
GrantInformation_xml – fundername: Natural Science Foundation of China
  grantid: 12371279 and 12171450
– fundername: Natural Science Foundation of Anhui Province
  grantid: 2208085MA05
ISICitedReferencesCount 1
ISSN 1061-8600
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
PageCount 14
PublicationCentury 2000
PublicationDate 2024-04-02
PublicationDateYYYYMMDD 2024-04-02
PublicationDate_xml – month: 04
  year: 2024
  text: 2024-04-02
  day: 02
PublicationDecade 2020
PublicationPlace Alexandria
PublicationPlace_xml – name: Alexandria
PublicationTitle Journal of computational and graphical statistics
PublicationYear 2024
Publisher Taylor & Francis
Taylor & Francis Ltd
Publisher_xml – name: Taylor & Francis
– name: Taylor & Francis Ltd
StartPage 501
SubjectTerms Algorithms
Binary mixtures
Community detection
EM algorithm
Kullback-Leibler divergence
Maximum likelihood estimates
Mixture distributions
Network reconstruction
Reconstruction
Synergistic effect
Title Joint Network Reconstruction and Community Detection from Rich but Noisy Data
URI https://www.tandfonline.com/doi/abs/10.1080/10618600.2023.2267630
https://www.proquest.com/docview/3064404795
Volume 33