Online multi-label dependency topic models for text classification
Saved in:
| Published in: | Machine Learning, Volume 107, Issue 5, pp. 859–886 |
|---|---|
| Main authors: | Burkhardt, Sophie; Kramer, Stefan |
| Medium: | Journal Article |
| Language: | English |
| Published: | New York: Springer US, 01.05.2018 (Springer Nature B.V.) |
| ISSN: | 0885-6125, 1573-0565 |
| Online access: | Get full text |
| Abstract | Multi-label text classification is an increasingly important field as large amounts of text data are available and extracting relevant information is important in many application contexts. Probabilistic generative models are the basis of a number of popular text mining methods such as Naive Bayes or Latent Dirichlet Allocation. However, Bayesian models for multi-label text classification often are overly complicated to account for label dependencies and skewed label frequencies while at the same time preventing overfitting. To solve this problem we employ the same technique that contributed to the success of deep learning in recent years: greedy layer-wise training. Applying this technique in the supervised setting prevents overfitting and leads to better classification accuracy. The intuition behind this approach is to learn the labels first and subsequently add a more abstract layer to represent dependencies among the labels. This allows using a relatively simple hierarchical topic model which can easily be adapted to the online setting. We show that our method successfully models dependencies online for large-scale multi-label datasets with many labels and improves over the baseline method not modeling dependencies. The same strategy, layer-wise greedy training, also makes the batch variant competitive with existing more complex multi-label topic models. |
|---|---|
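As a reading aid for the abstract above, here is a minimal, self-contained sketch (Python/NumPy) of the general idea of greedy layer-wise training for a two-layer multi-label topic model: a label layer (one word distribution per observed label) is fit first, and only afterwards is a small dependency layer fit over label co-occurrences, without revisiting the first layer. The function names, the plain EM mixture used for the upper layer, and the toy data are illustrative assumptions; this is not the authors' model or code from the paper.

```python
import numpy as np

def fit_label_layer(X, Y, beta=0.01):
    """Layer 1: one word distribution per label (labels act as observed topics).
    X: (docs x vocab) word counts, Y: (docs x labels) binary label matrix."""
    counts = Y.astype(float).T @ X + beta                # pooled word counts per label, smoothed
    return counts / counts.sum(axis=1, keepdims=True)    # phi: (labels x vocab), rows sum to 1

def fit_dependency_layer(Y, n_topics=2, n_iter=50, alpha=0.1, seed=0):
    """Layer 2, fit after layer 1 is frozen: a small EM mixture of
    'dependency topics', each a distribution over labels, trained on the
    observed label sets so that frequently co-occurring labels share a topic."""
    rng = np.random.default_rng(seed)
    D, L = Y.shape
    theta = rng.dirichlet(np.ones(L), size=n_topics)      # (topics x labels)
    pi = np.full(n_topics, 1.0 / n_topics)                # mixture weights
    Y = Y.astype(float)
    for _ in range(n_iter):
        # E-step: responsibility of each dependency topic for each document's label set
        logp = Y @ np.log(theta.T) + np.log(pi)           # (docs x topics)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate topic-over-label distributions and mixture weights
        theta = resp.T @ Y + alpha
        theta /= theta.sum(axis=1, keepdims=True)
        pi = resp.mean(axis=0)
    return theta, pi

# Toy usage: 4 documents, 5 vocabulary words, 3 labels (labels 0 and 1 co-occur).
X = np.array([[3, 1, 0, 0, 0],
              [2, 2, 0, 0, 1],
              [0, 0, 4, 1, 0],
              [0, 1, 3, 2, 0]])
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 1]])
phi = fit_label_layer(X, Y)              # greedy stage 1: label layer
theta, pi = fit_dependency_layer(Y)      # greedy stage 2: dependency layer, stage 1 untouched
```

A real model of this kind would also specify how the two layers are combined at prediction time and how counts are updated in the online setting; the sketch only illustrates the greedy, stage-by-stage training order described in the abstract.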
| Authors | Burkhardt, Sophie (ORCID 0000-0002-5385-3926, burkhardt@informatik.uni-mainz.de) and Kramer, Stefan, both Institute of Computer Science, Johannes Gutenberg-University of Mainz |
| Copyright | The Author(s) 2017. Machine Learning is a copyright of Springer (2017). All Rights Reserved. |
| DOI | 10.1007/s10994-017-5689-6 |
| Keywords | LDA; Multi-label classification; Online learning; Topic model |
| OpenAccessLink | https://link.springer.com/content/pdf/10.1007/s10994-017-5689-6.pdf |
| SubjectTerms | Artificial Intelligence; Bayesian analysis; Classification; Computer Science; Control; Data mining; Dirichlet problem; Labels; Machine learning; Mechatronics; Natural Language Processing (NLP); Robotics; Simulation and Modeling; Text editing; Texts; Training |