No Bias Left Behind: Fairness Testing for Deep Recommender Systems Targeting General Disadvantaged Groups
| Published in: | Proceedings of the ACM on software engineering, Volume 2, Issue ISSTA, pp. 1607–1629 |
|---|---|
| Main authors: | Wu, Zhuo; Wang, Zan; Luo, Chuan; Du, Xiaoning; Chen, Junjie |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York, NY, USA: ACM, 22 June 2025 |
| Subjects: | Information systems -- Recommender systems; Software and its engineering -- Search-based software engineering; Software and its engineering -- Software testing and debugging |
| ISSN: | 2994-970X |
| Online access: | Full text available at https://dl.acm.org/doi/10.1145/3728948 |
Abstract: Recommender systems play an increasingly important role in modern society, powering digital platforms that suggest a wide array of content, from news and music to job listings, and influencing many aspects of daily life. To improve personalization, these systems often use demographic information. However, ensuring fairness in recommendation quality across demographic groups is challenging, especially since recommender systems are susceptible to the "rich get richer" Matthew effect due to user feedback loops. With the adoption of deep learning algorithms, uncovering fairness issues has become even more complex. Researchers have started to explore methods for identifying the most disadvantaged user groups using optimization algorithms. Despite this, suboptimal disadvantaged groups remain underexplored, which leaves the risk of bias amplification due to the Matthew effect unaddressed. In this paper, we argue for the necessity of identifying both the most disadvantaged and suboptimal disadvantaged groups. We introduce FairAS, an adaptive-sampling-based approach, to achieve this goal. Through evaluations on four deep recommender systems and six datasets, FairAS demonstrates an average improvement of 19.2% in identifying the most disadvantaged groups over the state-of-the-art fairness testing approach (FairRec), while reducing testing time by 43.07%. Additionally, the extra suboptimal disadvantaged groups identified by FairAS help improve system fairness, achieving an average improvement of 70.27% over FairRec across all subjects.
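The abstract only names the idea behind FairAS (adaptive sampling over demographic groups), so a toy sketch may help readers picture what such a search looks like. The following is a minimal illustration, not the paper's implementation: the attribute space (`ATTRIBUTES`), the placeholder `group_quality` function standing in for a real per-group recommendation-quality measurement, and the simple weight-update rule are all assumptions made for this sketch; the actual FairAS algorithm, metrics, and datasets are defined in the paper.

```python
import random

# Hypothetical demographic attributes; real subjects (e.g., MovieLens, Black Friday,
# KuaiRec) each define their own user attributes.
ATTRIBUTES = {
    "gender": ["F", "M"],
    "age_band": ["<18", "18-35", "36-55", ">55"],
    "occupation": ["student", "engineer", "artist", "retired"],
}

def group_quality(group):
    """Stand-in for per-group recommendation quality (e.g., average NDCG@k over the
    group's users). A real harness would query the recommender under test here."""
    # Deterministic pseudo-score so the sketch runs without a real model.
    h = sum(ord(c) for c in "".join(f"{a}={v}" for a, v in sorted(group.items())))
    return 0.2 + (h % 70) / 100.0

def adaptive_group_search(budget=40, keep=5, seed=0):
    """Sample attribute combinations within a testing budget, score each sampled
    group, and up-weight attribute values seen in low-quality groups so later
    samples concentrate on likely-disadvantaged regions of the group space."""
    rng = random.Random(seed)
    weights = {a: {v: 1.0 for v in vals} for a, vals in ATTRIBUTES.items()}
    scored = {}
    for _ in range(budget):
        group = {
            a: rng.choices(vals, weights=[weights[a][v] for v in vals])[0]
            for a, vals in ATTRIBUTES.items()
        }
        key = tuple(sorted(group.items()))
        if key in scored:
            continue  # do not spend budget re-evaluating a known group
        quality = group_quality(group)
        scored[key] = quality
        for a, v in group.items():
            weights[a][v] += 1.0 - quality  # lower quality -> stronger pull
    # Report the lowest-quality groups: the worst one approximates the "most
    # disadvantaged" group, the remainder are suboptimal disadvantaged groups.
    return sorted(scored.items(), key=lambda kv: kv[1])[:keep]

if __name__ == "__main__":
    for group, quality in adaptive_group_search():
        print(dict(group), round(quality, 3))
```

The design point the sketch mirrors is that low-quality observations steer future sampling, so a single search budget can surface both the worst group and additional suboptimal disadvantaged groups rather than only the single optimum.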
| Article number: | ISSTA071 |
|---|---|
| Authors: | Zhuo Wu (Tianjin University, Tianjin, China; wuzhuo@tju.edu.cn; ORCID 0009-0006-0165-8746); Zan Wang (Tianjin University, Tianjin, China; wangzan@tju.edu.cn; ORCID 0000-0001-6173-8170); Chuan Luo (Beihang University, Beijing, China; chuanluo@buaa.edu.cn; ORCID 0000-0001-5028-1064); Xiaoning Du (Monash University, Melbourne, Australia; xiaoning.du@monash.edu; ORCID 0000-0003-3728-9541); Junjie Chen (Tianjin University, Tianjin, China; junjiechen@tju.edu.cn; ORCID 0000-0003-3056-9962) |
| Copyright: | Owner/Author |
| DOI: | 10.1145/3728948 |
| Funding: | Beijing Natural Science Foundation (L241050); CCF-Huawei Populus Grove Fund (CCF-HuaweiFM2024005); National Natural Science Foundation of China (62472310, 62322208, 62202025); Young Elite Scientist Sponsorship Program by CAST (YESS20230566) |
| Keywords: | Fairness Testing; Deep Recommender System; Search-based Software Testing |
| License: | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license. |