No Bias Left Behind: Fairness Testing for Deep Recommender Systems Targeting General Disadvantaged Groups

Detailed bibliography
Published in: Proceedings of the ACM on software engineering, Volume 2, Issue ISSTA, pp. 1607–1629
Main authors: Wu, Zhuo; Wang, Zan; Luo, Chuan; Du, Xiaoning; Chen, Junjie
Format: Journal Article
Language: English
Published: New York, NY, USA: ACM, 22 June 2025
ISSN: 2994-970X
Abstract Recommender systems play an increasingly important role in modern society, powering digital platforms that suggest a wide array of content, from news and music to job listings, and influencing many aspects of daily life. To improve personalization, these systems often use demographic information. However, ensuring fairness in recommendation quality across demographic groups is challenging, especially since recommender systems are susceptible to the "rich get richer" Matthew effect due to user feedback loops. With the adoption of deep learning algorithms, uncovering fairness issues has become even more complex. Researchers have started to explore methods for identifying the most disadvantaged user groups using optimization algorithms. Despite this, suboptimal disadvantaged groups remain underexplored, which leaves the risk of bias amplification due to the Matthew effect unaddressed. In this paper, we argue for the necessity of identifying both the most disadvantaged and suboptimal disadvantaged groups. We introduce FairAS, an adaptive-sampling-based approach, to achieve this goal. Through evaluations on four deep recommender systems and six datasets, FairAS demonstrates an average improvement of 19.2% in identifying the most disadvantaged groups over the state-of-the-art fairness testing approach (FairRec), while reducing testing time by 43.07%. Additionally, the extra suboptimal disadvantaged groups identified by FairAS help improve system fairness, achieving an average improvement of 70.27% over FairRec across all subjects.
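To make the notion of "most disadvantaged" and "suboptimal disadvantaged" groups concrete, the following is a minimal illustrative sketch, not the FairAS algorithm. It assumes each user has a per-user quality score (a stand-in for any recommendation metric) and ranks demographic groups by how far their mean score falls below the overall mean. The function name `rank_disadvantaged_groups`, the input format, the toy data, and the gap-based ranking are all assumptions made for illustration. The sketch enumerates every group exhaustively, which is feasible only for tiny attribute spaces; the abstract describes FairAS as using adaptive sampling instead.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def rank_disadvantaged_groups(users, attributes, min_size=2, top_k=3):
    """Rank groups by how far their mean score falls below the overall mean.

    users: list of (attribute_dict, quality_score) pairs (hypothetical format).
    A group is a conjunction of attribute=value pairs over any non-empty
    subset of `attributes`.
    """
    overall = mean(score for _, score in users)
    buckets = defaultdict(list)
    # Exhaustive enumeration of all attribute subsets (illustration only).
    for r in range(1, len(attributes) + 1):
        for subset in combinations(attributes, r):
            for attrs, score in users:
                key = tuple((name, attrs[name]) for name in subset)
                buckets[key].append(score)
    ranked = []
    for group, scores in buckets.items():
        gap = overall - mean(scores)
        # Keep only sufficiently large groups that score below average.
        if len(scores) >= min_size and gap > 0:
            ranked.append((group, round(gap, 4)))
    # Largest gap first: index 0 is the most disadvantaged group.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Toy data, invented for this example.
users = [
    ({"gender": "F", "age": "young"}, 0.2),
    ({"gender": "F", "age": "young"}, 0.3),
    ({"gender": "F", "age": "old"}, 0.6),
    ({"gender": "M", "age": "young"}, 0.7),
    ({"gender": "M", "age": "old"}, 0.8),
    ({"gender": "M", "age": "old"}, 0.9),
]
ranking = rank_disadvantaged_groups(users, ["gender", "age"])
```

In this toy run the first entry of `ranking` plays the role of the most disadvantaged group, and the remaining entries correspond loosely to what the abstract calls suboptimal disadvantaged groups. The paper's exact definitions may differ.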
ArticleNumber ISSTA071
Author Wang, Zan
Wu, Zhuo
Chen, Junjie
Luo, Chuan
Du, Xiaoning
Author_xml – sequence: 1
  givenname: Zhuo
  orcidid: 0009-0006-0165-8746
  surname: Wu
  fullname: Wu, Zhuo
  email: wuzhuo@tju.edu.cn
  organization: Tianjin University, Tianjin, China
– sequence: 2
  givenname: Zan
  orcidid: 0000-0001-6173-8170
  surname: Wang
  fullname: Wang, Zan
  email: wangzan@tju.edu.cn
  organization: Tianjin University, Tianjin, China
– sequence: 3
  givenname: Chuan
  orcidid: 0000-0001-5028-1064
  surname: Luo
  fullname: Luo, Chuan
  email: chuanluo@buaa.edu.cn
  organization: Beihang University, Beijing, China
– sequence: 4
  givenname: Xiaoning
  orcidid: 0000-0003-3728-9541
  surname: Du
  fullname: Du, Xiaoning
  email: xiaoning.du@monash.edu
  organization: Monash University, Melbourne, Australia
– sequence: 5
  givenname: Junjie
  orcidid: 0000-0003-3056-9962
  surname: Chen
  fullname: Chen, Junjie
  email: junjiechen@tju.edu.cn
  organization: Tianjin University, Tianjin, China
Cites_doi 10.1109/TSMCB.2010.2103055
10.1145/3442381.3450015
10.1145/3308558.3313497
10.1145/3404835.3463235
10.1145/3404835.3462966
10.1145/3336191.3371855
10.1145/3597926.3598058
10.1145/3238147.3238165
10.1145/582415.582418
10.1145/3447548.3467249
10.1109/ICSE48619.2023.00136
10.1145/2988450.2988454
10.1145/3236024.3264590
10.1145/3510003.3510202
10.1145/3377811.3380331
10.1145/2931037.2931054
10.1145/3591869
10.1145/3041021.3054197
10.1145/3442381.3449866
10.1145/3106237.3106277
10.5555/2540128.2540517
10.1145/1060745.1060754
10.1109/ICSE43902.2021.00042
10.1145/3468264.3468622
10.1145/3368089.3409761
10.1145/3038912.3052569
10.1017/S1351324901002789
10.1007/978-3-642-13287-2
10.1145/3450613.3456821
10.1145/3394112
10.1145/3510003.3510123
10.1145/3511808.3557220
10.1145/3459637.3481915
ContentType Journal Article
Copyright Owner/Author
Copyright_xml – notice: Owner/Author
DBID AAYXX
CITATION
DOI 10.1145/3728948
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList CrossRef

DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 2994-970X
EndPage 1629
ExternalDocumentID 10_1145_3728948
3728948
GrantInformation_xml – fundername: Beijing Natural Science Foundation
  grantid: L241050
– fundername: CCF-Huawei Populus Grove Fund
  grantid: CCF-HuaweiFM2024005
– fundername: National Natural Science Foundation of China
  grantid: 62472310,62322208,62202025
  funderid: https://doi.org/10.13039/501100001809
– fundername: Young Elite Scientist Sponsorship Program by CAST
  grantid: YESS20230566
ISSN 2994-970X
IngestDate Sat Nov 29 07:43:49 EST 2025
Mon Jul 14 20:48:59 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue ISSTA
Keywords Fairness Testing
Deep Recommender System
Search-based Software Testing
Language English
License This work is licensed under Creative Commons Attribution-NonCommercial-NoDerivs International 4.0.
LinkModel OpenURL
ORCID 0000-0001-5028-1064
0009-0006-0165-8746
0000-0001-6173-8170
0000-0003-3728-9541
0000-0003-3056-9962
OpenAccessLink https://dl.acm.org/doi/10.1145/3728948
PageCount 23
ParticipantIDs crossref_primary_10_1145_3728948
acm_primary_3728948
PublicationCentury 2000
PublicationDate 20250622
2025-06-22
PublicationDateYYYYMMDD 2025-06-22
PublicationDate_xml – month: 06
  year: 2025
  text: 20250622
  day: 22
PublicationDecade 2020
PublicationPlace New York, NY, USA
PublicationPlace_xml – name: New York, NY, USA
PublicationTitle Proceedings of the ACM on software engineering
PublicationTitleAbbrev ACM PACMSE
PublicationYear 2025
Publisher ACM
Publisher_xml – name: ACM
References_xml – reference: Junjie Yang, Jiajun Jiang, Zeyu Sun, and Junjie Chen. 2024. A large-scale empirical study on improving the fairness of image classification models. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 210–222.
– reference: Charles X Ling, Jin Huang, and Harry Zhang. 2003. AUC: a statistically consistent and more discriminating measure than accuracy. In IJCAI. 3, 519–524.
– reference: Ellen M Voorhees. 2001. The TREC question answering track. Natural Language Engineering, 7, 4 (2001), 361–378.
– reference: Chuan Luo, Binqi Sun, Bo Qiao, Junjie Chen, Hongyu Zhang, Jinkun Lin, Qingwei Lin, and Dongmei Zhang. 2021. LS-Sampling: An Effective Local Search based Sampling Approach for Achieving High t-wise Coverage. In Proceedings of ESEC/FSE 2021. 1081–1092.
– reference: Chuhan Wu, Fangzhao Wu, Xiting Wang, Yongfeng Huang, and Xing Xie. 2021. Fairness-aware News Recommendation with Decomposed Adversarial Learning. In AAAI. AAAI Press, 4462–4469.
– reference: Shuai Zhang, Lina Yao, Aixin Sun, and Yi Tay. 2019. Deep learning based recommender system: A survey and new perspectives. ACM computing surveys (CSUR), 52, 1 (2019), 1–38.
– reference: Vishnu Asutosh Dasu, Ashish Kumar, Saeid Tizpaz-Niari, and Gang Tan. 2024. NeuFair: Neural Network Fairness Repair with Dropout. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 1541–1553.
– reference: Frank Wilcoxon. 1992. Individual comparisons by ranking methods. In Breakthroughs in Statistics: Methodology and Distribution. Springer, 196–202.
– reference: Rico Angell, Brittany Johnson, Yuriy Brun, and Alexandra Meliou. 2018. Themis: Automatically testing software for discrimination. In Proceedings of the 2018 26th ACM Joint meeting on european software engineering conference and symposium on the foundations of software engineering. 871–875.
– reference: Wenjie Wang, Fuli Feng, Xiangnan He, Xiang Wang, and Tat-Seng Chua. 2021. Deconfounded recommendation for alleviating bias amplification. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1717–1725.
– reference: Himan Abdollahpouri, Masoud Mansoury, Robin Burke, Bamshad Mobasher, and Edward Malthouse. 2021. User-centered evaluation of popularity bias in recommender systems. In Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization. 119–129.
– reference: Accessed: 2023. BlackFriday. https://www.kaggle.com/datasets/sdolezel/black-friday
– reference: Xu Chen, Jingsen Zhang, Lei Wang, Quanyu Dai, Zhenhua Dong, Ruiming Tang, Rui Zhang, Li Chen, and Ji-Rong Wen. 2023. REASONER: An Explainable Recommendation Dataset with Multi-aspect Real User Labeled Ground Truths Towards more Measurable Explainable Recommendation. CoRR, abs/2303.00168 (2023).
– reference: Huizhong Guo, Jinfeng Li, Jingyi Wang, Xiangyu Liu, Dongxia Wang, Zehong Hu, Rong Zhang, and Hui Xue. 2023. FairRec: Fairness Testing for Deep Recommender Systems. In ISSTA. ACM, 310–321.
– reference: Yunqi Li, Hanxiong Chen, Shuyuan Xu, Yingqiang Ge, and Yongfeng Zhang. 2021. Towards personalized fairness based on causal notion. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1054–1063.
– reference: Antonio Guerriero, Roberto Pietrantuono, and Stefano Russo. 2021. Operation is the hardest teacher: estimating DNN accuracy looking for mispredictions. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). 348–358.
– reference: Sakshi Udeshi, Pryanshu Arora, and Sudipta Chattopadhyay. 2018. Automated directed fairness testing. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 98–108.
– reference: Ke Mao, Mark Harman, and Yue Jia. 2016. Sapienz: Multi-objective automated testing for android applications. In Proceedings of the 25th international symposium on software testing and analysis. 94–105.
– reference: Cai-Nicolas Ziegler, Sean M McNee, Joseph A Konstan, and Georg Lausen. 2005. Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web. 22–32.
– reference: Saeid Tizpaz-Niari, Ashish Kumar, Gang Tan, and Ashutosh Trivedi. 2022. Fairness-aware configuration of machine learning libraries. In Proceedings of the 44th International Conference on Software Engineering. 909–920.
– reference: Verya Monjezi, Ashutosh Trivedi, Gang Tan, and Saeid Tizpaz-Niari. 2023. Information-theoretic testing and debugging of fairness defects in deep neural networks. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). 1571–1582.
– reference: Peixin Zhang, Jingyi Wang, Jun Sun, Guoliang Dong, Xinyu Wang, Xingen Wang, Jin Song Dong, and Ting Dai. 2020. White-box fairness testing through adversarial sampling. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 949–960.
– reference: Shutao Li, Mingkui Tan, Ivor W Tsang, and James Tin-Yau Kwok. 2011. A hybrid PSO-BFGS strategy for global optimization of multimodal functions. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 41, 4 (2011), 1003–1014.
– reference: András Vargha and Harold D Delaney. 2000. A critique and improvement of the CL common language effect size statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics, 25, 2 (2000), 101–132.
– reference: Chongming Gao, Shijun Li, Wenqiang Lei, Jiawei Chen, Biao Li, Peng Jiang, Xiangnan He, Jiaxin Mao, and Tat-Seng Chua. 2022. KuaiRec: A Fully-Observed Dataset and Insights for Evaluating Recommender Systems. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM ’22). 540–550. https://doi.org/10.1145/3511808.3557220
– reference: Òscar Celma Herrada. 2009. Music recommendation and discovery in the long tail. Universitat Pompeu Fabra.
– reference: Zhenpeng Chen, Jie M Zhang, Max Hort, Mark Harman, and Federica Sarro. 2023. Fairness Testing: A Comprehensive Survey and Analysis of Trends. ACM Transactions on Software Engineering and Methodology.
– reference: Hanmo You, Zan Wang, Junjie Chen, Shuang Liu, and Shuochuan Li. 2023. Regression Fuzzing for Deep Learning Systems. In ICSE. IEEE, 82–94.
– reference: Accessed: 2023. Pandas. https://pandas.pydata.org/
– reference: Le Wu, Lei Chen, Pengyang Shao, Richang Hong, Xiting Wang, and Meng Wang. 2021. Learning fair representations for recommendation: A graph-based perspective. In Proceedings of the Web Conference 2021. 2198–2208.
– reference: Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS), 20, 4 (2002), 422–446.
– reference: Eduard Baranov, Axel Legay, and Kuldeep S Meel. 2020. Baital: an adaptive weighted sampling approach for improved t-wise coverage. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1114–1126.
– reference: Junjie Chen, Zhuo Wu, Zan Wang, Hanmo You, Lingming Zhang, and Ming Yan. 2020. Practical accuracy estimation for efficient deep neural network testing. ACM Transactions on Software Engineering and Methodology (TOSEM), 29, 4 (2020), 1–35.
– reference: Yunqi Li, Hanxiong Chen, Zuohui Fu, Yingqiang Ge, and Yongfeng Zhang. 2021. User-oriented Fairness in Recommendation. In WWW. ACM / IW3C2, 624–632.
– reference: Lijing Qin and Xiaoyan Zhu. 2013. Promoting diversity in recommendation by entropy regularizer. In Twenty-Third International Joint Conference on Artificial Intelligence.
– reference: F Maxwell Harper and Joseph A Konstan. 2015. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5, 4 (2015), 1–19.
– reference: Haibin Zheng, Zhiqing Chen, Tianyu Du, Xuhong Zhang, Yao Cheng, Shouling Ji, Jingyi Wang, Yue Yu, and Jinyin Chen. 2022. Neuronfair: Interpretable white-box fairness testing through biased neuron identification. In Proceedings of the 44th International Conference on Software Engineering. 1519–1531.
– reference: Bo Chen, Yichao Wang, Zhirong Liu, Ruiming Tang, Wei Guo, Hongkun Zheng, Weiwei Yao, Muyu Zhang, and Xiuqiang He. 2021. Enhancing explicit and implicit feature interactions via information sharing for parallel deep ctr models. In Proceedings of the 30th ACM international conference on information & knowledge management. 3757–3766.
– reference: Accessed: 2023. Homepage. https://github.com/anonyProjects/FairAS
– reference: Mengting Wan, Jianmo Ni, Rishabh Misra, and Julian McAuley. 2020. Addressing marketing bias in product recommendations. In Proceedings of the 13th international conference on web search and data mining. 618–626.
– reference: Nian Si, Karthyek Murthy, Jose H. Blanchet, and Viet Anh Nguyen. 2021. Testing Group Fairness via Optimal Transport Projections. In ICML (Proceedings of Machine Learning Research, Vol. 139). PMLR, 9649–9659.
– reference: Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: a factorization-machine based neural network for CTR prediction. arXiv preprint arXiv:1703.04247.
– reference: Mengdi Zhang, Jun Sun, Jingyi Wang, and Bing Sun. 2022. TestSGD: Interpretable Testing of Neural Networks Against Subtle Group Discrimination. ACM Transactions on Software Engineering and Methodology.
– reference: Michael D Ekstrand, Mucun Tian, Ion Madrazo Azpiazu, Jennifer D Ekstrand, Oghenemaro Anuyah, David McNeill, and Maria Soledad Pera. 2018. All the cool kids, how do they fit in?: Popularity and demographic biases in recommender evaluation and effectiveness. In Conference on fairness, accountability and transparency. 172–186.
– reference: Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, and Mustafa Ispir. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems. 7–10.
– reference: Steven K Thompson. 2012. Sampling, Third Edition. John Wiley & Sons, Inc.
– reference: Elizabeth Gómez, Carlos Shui Zhang, Ludovico Boratto, Maria Salamó, and Mirko Marras. 2021. The winner takes it all: geographic imbalance and provider (un) fairness in educational recommender systems. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1808–1812.
– reference: Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical bayesian optimization of machine learning algorithms. Advances in neural information processing systems, 25 (2012).
– reference: Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. 2017. Fairness testing: testing software for discrimination. In Proceedings of the 2017 11th Joint meeting on foundations of software engineering. 498–510.
– reference: Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web. 173–182.
– reference: Bin Liu, Ruiming Tang, Yingzhi Chen, Jinkai Yu, Huifeng Guo, and Yuzhou Zhang. 2019. Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction. In WWW. ACM, 1119–1129.
– reference: Rishabh Mehrotra, Ashton Anderson, Fernando Diaz, Amit Sharma, Hanna Wallach, and Emine Yilmaz. 2017. Auditing search engines for differential satisfaction across demographics. In Proceedings of the 26th international conference on World Wide Web companion. 626–633.
– reference: Zan Wang, Ming Yan, Junjie Chen, Shuang Liu, and Dongdi Zhang. 2020. Deep learning library testing via effective model generation. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 788–799.
StartPage 1607
SubjectTerms Information systems -- Recommender systems
Software and its engineering -- Search-based software engineering
Software and its engineering -- Software testing and debugging
Title No Bias Left Behind: Fairness Testing for Deep Recommender Systems Targeting General Disadvantaged Groups
URI https://dl.acm.org/doi/10.1145/3728948
Volume 2