Generating Factual Text via Entailment Recognition Task

Bibliographic details
Published in: Computers, Materials & Continua, Volume 80, Issue 1, pp. 547-565
Main authors: Dai, Jinqiao; Cheng, Pengsen; Liu, Jiayong
Format: Journal Article
Language: English
Publisher: Henderson: Tech Science Press, 2024
ISSN: 1546-2218; EISSN: 1546-2226
Abstract: Generating diverse and factual text is challenging and is receiving increasing attention. By sampling from the latent space, variational autoencoder-based models have recently enhanced the diversity of generated text. However, existing research predominantly depends on summarization models to offer paragraph-level semantic information for enhancing factual correctness. The challenge lies in effectively generating factual text using sentence-level variational autoencoder-based models. In this paper, a novel model, the fact-aware conditional variational autoencoder, is proposed to balance the factual correctness and diversity of generated text. Specifically, the model encodes the input sentences and uses them as facts to build a conditional variational autoencoder network. Training this network enables the model to generate text based on the input facts. Building upon this foundation, the input text is passed to the discriminator along with the generated text. Through adversarial training, the model is encouraged to generate text that the discriminator cannot distinguish from the input text, thereby enhancing the quality of the generated text. To further improve factual correctness, inspired by natural language inference systems, an entailment recognition task is trained together with the discriminator via multi-task learning. Moreover, based on the entailment recognition results, a penalty term is added to the model's loss, forcing the generator to produce text consistent with the facts. Experimental results demonstrate that, compared with competitive models, the model achieves substantial improvements in both the quality and the factual correctness of the text while sacrificing only a small amount of diversity. Furthermore, under a comprehensive evaluation that combines diversity and quality metrics, the model also achieves the best performance.
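The abstract describes a generator objective that combines a conditional variational autoencoder loss, an adversarial signal, and an entailment-based penalty. The following is a minimal sketch of how such terms might be combined, not the authors' formulation: the function names, the weights, and the specific penalty form (one minus the entailment probability) are all assumptions made for illustration, since the record does not give the actual equations.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians.

    Each argument is a list of per-dimension means or log-variances.
    In a conditional VAE, q would be the recognition (posterior) network
    and p the prior network conditioned on the input facts.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def generator_loss(recon_nll, kl, p_real, p_entail,
                   kl_weight=1.0, adv_weight=1.0, fact_weight=1.0):
    """Hypothetical combined generator loss (illustrative assumption only).

    recon_nll : reconstruction negative log-likelihood of the target text
    kl        : KL term from gaussian_kl
    p_real    : discriminator's probability that the generated text is real
    p_entail  : entailment head's probability that the input facts entail
                the generated text
    """
    # Non-saturating adversarial term: small when the discriminator is fooled.
    adversarial = -math.log(max(p_real, 1e-12))
    # Assumed penalty form: grows as the entailment probability drops.
    fact_penalty = 1.0 - p_entail
    return recon_nll + kl_weight * kl + adv_weight * adversarial + fact_weight * fact_penalty
```

With everything else held fixed, a lower entailment probability yields a larger loss, which is the qualitative behaviour the abstract attributes to the penalty term.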
Copyright 2024. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.32604/cmc.2024.051745
Discipline: Computer Science
Open access: yes
Peer reviewed: yes
Scholarly: yes
Subject terms: Discriminators; Natural language; Natural language processing; Recognition; Semantics; Sentences
URI https://www.proquest.com/docview/3199834230