Generating Factual Text via Entailment Recognition Task
Generating diverse and factual text is challenging and is receiving increasing attention. By sampling from the latent space, variational autoencoder-based models have recently enhanced the diversity of generated text. However, existing research predominantly depends on summarization models to offer...
| Published in: | Computers, materials & continua Vol. 80; no. 1; pp. 547-565 |
|---|---|
| Main authors: | Dai, Jinqiao; Cheng, Pengsen; Liu, Jiayong |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Henderson: Tech Science Press, 2024 |
| Subjects: | Discriminators; Natural language; Natural language processing; Recognition; Semantics; Sentences |
| ISSN: | 1546-2226, 1546-2218 |
| Abstract | Generating diverse and factual text is challenging and is receiving increasing attention. By sampling from the latent space, variational autoencoder-based models have recently enhanced the diversity of generated text. However, existing research predominantly depends on summarization models to offer paragraph-level semantic information for enhancing factual correctness. The challenge lies in effectively generating factual text using sentence-level variational autoencoder-based models. In this paper, a novel model called fact-aware conditional variational autoencoder is proposed to balance the factual correctness and diversity of generated text. Specifically, our model encodes the input sentences and uses them as facts to build a conditional variational autoencoder network. By training a conditional variational autoencoder network, the model is enabled to generate text based on input facts. Building upon this foundation, the input text is passed to the discriminator along with the generated text. By employing adversarial training, the model is encouraged to generate text that is indistinguishable to the discriminator, thereby enhancing the quality of the generated text. To further improve the factual correctness, inspired by the natural language inference system, the entailment recognition task is introduced to be trained together with the discriminator via multi-task learning. Moreover, based on the entailment recognition results, a penalty term is further proposed to reconstruct the loss of our model, forcing the generator to generate text consistent with the facts. Experimental results demonstrate that compared with competitive models, our model has achieved substantial improvements in both the quality and factual correctness of the text, despite only sacrificing a small amount of diversity. Furthermore, when considering a comprehensive evaluation of diversity and quality metrics, our model has also demonstrated the best performance. |
|---|---|
| Author | Cheng, Pengsen; Liu, Jiayong; Dai, Jinqiao |
| Cites_doi | 10.1145/3422622 10.18653/v1/P17-1070 10.1016/j.ipm.2020.102478 10.1017/S1351324997001502 10.1016/j.knosys.2022.108491 10.1609/aaai.v32i1.11912 10.1017/S1351324919000202 |
| ContentType | Journal Article |
| Copyright | 2024. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.32604/cmc.2024.051745 |
| Discipline | Computer Science |
| EISSN | 1546-2226 |
| EndPage | 565 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| OpenAccessLink | https://www.proquest.com/publiccontent/docview/3199834230?pq-origsite=%requestingapplication% |
| PublicationTitle | Computers, materials & continua |
| PublicationYear | 2024 |
| Publisher | Tech Science Press |
| StartPage | 547 |
| SubjectTerms | Discriminators; Natural language; Natural language processing; Recognition; Semantics; Sentences |
| Title | Generating Factual Text via Entailment Recognition Task |
| URI | https://www.proquest.com/docview/3199834230 |
| Volume | 80 |
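
The abstract describes a generator objective that combines a conditional variational autoencoder loss, an adversarial signal from a discriminator, and a penalty term based on entailment recognition results. The record does not give the actual formulas, so the following is only a minimal sketch of how such terms might be combined. The function names, the weights (`kl_weight`, `adv_weight`, `penalty_weight`), the standard-normal prior, and the negative-log form of the penalty are all assumptions made for illustration, not the authors' method.

```python
import math
from typing import Sequence

EPS = 1e-8  # avoids log(0); an illustrative choice


def kl_divergence_diag_gaussian(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.

    Assumption: a standard-normal prior is used here for simplicity;
    a conditional VAE would normally use a learned prior p(z | facts).
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def entailment_penalty(p_entail: float) -> float:
    """Hypothetical penalty: negative log-probability that the generated
    text is entailed by the input facts (0 when entailment is certain)."""
    return -math.log(max(p_entail, EPS))


def generator_loss(
    recon_nll: float,          # reconstruction negative log-likelihood
    mu: Sequence[float],       # posterior mean of the latent variable
    log_var: Sequence[float],  # posterior log-variance of the latent variable
    d_real_prob: float,        # discriminator's probability that the text is real
    p_entail: float,           # entailment classifier's probability of "entailed"
    kl_weight: float = 1.0,
    adv_weight: float = 1.0,
    penalty_weight: float = 1.0,
) -> float:
    """Weighted sum of the four illustrative terms."""
    kl = kl_divergence_diag_gaussian(mu, log_var)
    adversarial = -math.log(max(d_real_prob, EPS))  # non-saturating generator term
    penalty = entailment_penalty(p_entail)
    return recon_nll + kl_weight * kl + adv_weight * adversarial + penalty_weight * penalty
```

Under these assumptions, a generated sentence that the entailment classifier judges unlikely to follow from the input facts receives a larger loss than one judged likely to follow, which is the qualitative behavior the abstract attributes to the penalty term.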