Optimal Estimation of the Number of Network Communities
| Published in: | Journal of the American Statistical Association, Vol. 118, No. 543, pp. 2101-2116 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Alexandria: Taylor & Francis, 03.07.2023; Taylor & Francis Ltd |
| Subjects: | |
| ISSN: | 0162-1459, 1537-274X |
| Summary: | In network analysis, how to estimate the number of communities K is a fundamental problem. We consider a broad setting where we allow severe degree heterogeneity and a wide range of sparsity levels, and propose Stepwise Goodness of Fit (StGoF) as a new approach. This is a stepwise algorithm, where for m = 1, 2, ..., we alternately use a community detection step and a goodness of fit (GoF) step. We adapt the SCORE method of Jin for community detection, and propose a new GoF metric. We show that at step m, the GoF metric diverges to ∞ in probability for all m < K and converges to N(0, 1) if m = K. This gives rise to a consistent estimate for K. Also, we discover the right way to define the signal-to-noise ratio (SNR) for our problem and show that consistent estimates for K do not exist if SNR → 0, and StGoF is uniformly consistent for K if SNR → ∞. Therefore, StGoF achieves the optimal phase transition. Similar stepwise methods are known to face analytical challenges. We overcome the challenges by using a different stepwise scheme in StGoF and by deriving sharp results that were not previously available. The key to our analysis is to show that SCORE has the Nonsplitting Property (NSP). Primarily due to a nontractable rotation of eigenvectors dictated by the Davis-Kahan theorem, the NSP is nontrivial to prove and requires new techniques we develop. Supplementary materials for this article are available online. |
|---|---|
| DOI: | 10.1080/01621459.2022.2035736 |
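
The stepwise scheme described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `detect` and `gof_statistic` are hypothetical callables standing in for the community detection step and the GoF metric, which this record does not specify, and the stopping rule (return the first m whose statistic falls below an upper standard normal quantile) is inferred from the stated limiting behavior. The parameters `alpha` and `max_k` are likewise assumptions.

```python
from statistics import NormalDist
from typing import Callable, Optional, Sequence

Adjacency = Sequence[Sequence[int]]


def stepwise_estimate_k(
    adjacency: Adjacency,
    detect: Callable[[Adjacency, int], Sequence[int]],
    gof_statistic: Callable[[Adjacency, Sequence[int], int], float],
    alpha: float = 0.05,   # assumed significance level
    max_k: int = 20,       # assumed cap on the number of steps
) -> Optional[int]:
    """Return the first m whose GoF statistic looks like a N(0, 1) draw.

    `detect` and `gof_statistic` are hypothetical placeholders; neither
    is defined by the catalog record this sketch accompanies.
    """
    # Upper (1 - alpha) quantile of the standard normal distribution.
    threshold = NormalDist().inv_cdf(1.0 - alpha)
    for m in range(1, max_k + 1):
        # Community detection step: fit m communities.
        labels = detect(adjacency, m)
        # Goodness-of-fit step: large values suggest m is too small.
        stat = gof_statistic(adjacency, labels, m)
        if stat <= threshold:
            return m
    # No candidate up to max_k passed the check.
    return None
```

With stub callables whose statistic is large for m < 3 and small otherwise, the loop stops at m = 3; if `max_k` is set below that, it returns `None` instead of guessing.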