Adversarial Constrained Bidding via Minimax Regret Optimization with Causality-Aware Reinforcement Learning
| Published in: | arXiv.org |
|---|---|
| Main authors: | Wang, Haozhe; Du, Chao; Fang, Panyan; He, Li; Wang, Liang; Zheng, Bo |
| Format: | Paper |
| Language: | English |
| Publication details: | Ithaca: Cornell University Library, arXiv.org, 2023-06-12 |
| ISSN: | 2331-8422 |
| Abstract | The proliferation of the Internet has led to the emergence of online advertising, driven by the mechanics of online auctions. In these repeated auctions, software agents participate on behalf of aggregated advertisers to optimize for their long-term utility. To meet advertisers' diverse demands, bidding strategies are employed to optimize advertising objectives subject to different spending constraints. Existing approaches to constrained bidding typically rely on i.i.d. training and test conditions, an assumption that contradicts the adversarial nature of online ad markets, where different parties possess potentially conflicting objectives. In this regard, we explore the problem of constrained bidding in adversarial bidding environments, assuming no knowledge of the adversarial factors. Instead of relying on the i.i.d. assumption, our insight is to align the training distribution of environments with the potential test distribution while minimizing policy regret. Based on this insight, we propose a practical Minimax Regret Optimization (MiRO) approach that interleaves a teacher, which finds adversarial environments for tutoring, with a learner, which meta-learns its policy over the given distribution of environments. In addition, we are the first to incorporate expert demonstrations for learning bidding strategies. Through a causality-aware policy design, we improve upon MiRO by distilling knowledge from the experts. Extensive experiments on both industrial data and synthetic data show that our method, MiRO with Causality-aware reinforcement Learning (MiROCL), outperforms prior methods by over 30%. |
|---|---|
| Author | Wang, Haozhe; Du, Chao; Fang, Panyan; He, Li; Wang, Liang; Zheng, Bo |
| ContentType | Paper |
| Copyright | 2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.48550/arxiv.2306.07106 |
| Discipline | Physics |
| EISSN | 2331-8422 |
| Genre | Working Paper/Pre-Print |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PublicationDate | 2023-06-12 |
| PublicationPlace | Ithaca |
| PublicationTitle | arXiv.org |
| Publisher | Cornell University Library, arXiv.org |
| SubjectTerms | Advertising; Causality; Constraints; Distillation; Minimax technique; Online advertising; Optimization; Software agents; Synthetic data |
| URI | https://www.proquest.com/docview/2825305433 |
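
The abstract describes choosing a policy by minimizing regret against adversarially chosen environments. As a rough illustration of that minimax regret criterion only (not the MiRO algorithm itself, which the record does not specify), the sketch below picks, from a small finite set of candidate policies, the one whose worst-case regret over a finite set of environments is smallest. The utility table, the policy labels, and the function names `regret_matrix` and `minimax_regret_policy` are all hypothetical and chosen for illustration.

```python
# Toy illustration of the minimax regret criterion.
# All numbers and names here are assumptions, not taken from the paper.


def regret_matrix(utility):
    """regret[p][e] = best utility achievable in env e minus utility of policy p."""
    n_envs = len(utility[0])
    best = [max(row[e] for row in utility) for e in range(n_envs)]
    return [[best[e] - row[e] for e in range(n_envs)] for row in utility]


def minimax_regret_policy(utility):
    """Return (policy index, its worst-case regret, env index attaining it)."""
    regret = regret_matrix(utility)
    worst = [max(row) for row in regret]
    # "Learner" side: choose the policy with the smallest worst-case regret.
    p = min(range(len(worst)), key=worst.__getitem__)
    # "Teacher" side: the environment where that policy regrets the most.
    e = max(range(len(regret[p])), key=regret[p].__getitem__)
    return p, worst[p], e


# Rows are candidate bidding policies, columns are environments (made-up values).
UTILITY = [
    [10, 2, 6],  # hypothetical "aggressive" policy
    [7, 6, 5],   # hypothetical "balanced" policy
    [4, 7, 4],   # hypothetical "conservative" policy
]

if __name__ == "__main__":
    print(minimax_regret_policy(UTILITY))
```

In this toy table the "balanced" row is selected because its largest shortfall against the per-environment best is the smallest of the three rows. A learned approach such as the one the abstract outlines would replace the finite table with a policy class and an environment generator.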