Mapping the Bentham Corpus: Concept-based Navigation


Bibliographic Details
Published in: Journal of data mining and digital humanities, Volume Atelier Digit_Hum; Issue Data deluge: which skills for...
Main Authors: Ruiz Fabo, Pablo; Poibeau, Thierry
Format: Journal Article
Language: English
Published: INRIA 2019-03-06
Nicolas Turenne
ISSN: 2416-5999
Abstract British philosopher and reformer Jeremy Bentham (1748-1832) left over 60,000 folios of unpublished manuscripts. The Bentham Project, at University College London, is creating a TEI version of the manuscripts via crowdsourced transcription verified by experts. We present an interface to navigate these largely unedited manuscripts, together with the language technologies used to enrich the corpus and facilitate navigation, i.e., Entity Linking against the DBpedia knowledge base and keyphrase extraction. The challenges of tagging a historical, domain-specific corpus with a contemporary knowledge base are discussed. The extracted concepts were used to create interactive co-occurrence networks that serve as a map of the corpus and, along with a search index, help navigate it. These corpus representations were integrated into a user interface. The interface was evaluated by domain experts with satisfactory results; e.g., they found the distributional semantics methods exploited here applicable to retrieving related passages for scholarly editing of the corpus.
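To illustrate what a concept co-occurrence network is, the following minimal Python sketch counts how often two concepts are annotated in the same document and keeps the pairs as weighted edges. It is not the authors' implementation: the function name cooccurrence_edges, the min_weight parameter, and the sample folio annotations are all hypothetical, and the paper may use a different unit of co-occurrence or weighting scheme.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(doc_concepts, min_weight=1):
    """Count how often each pair of concepts appears in the same document.

    doc_concepts: iterable of concept lists, one list per document.
    Returns a dict mapping (concept_a, concept_b) to the number of
    documents in which both occur, keeping pairs with weight >= min_weight.
    """
    counts = Counter()
    for concepts in doc_concepts:
        # Deduplicate and sort so each unordered pair has a single key.
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return {pair: w for pair, w in counts.items() if w >= min_weight}


# Hypothetical concept annotations for three transcribed folios
# (illustrative only, not taken from the Bentham corpus).
folios = [
    ["Panopticon", "Punishment", "Utility"],
    ["Punishment", "Utility"],
    ["Panopticon", "Utility", "Utility"],
]
edges = cooccurrence_edges(folios)
strong_edges = cooccurrence_edges(folios, min_weight=2)
```

Raising min_weight prunes rare pairs, which is one simple way to keep such a network readable when it is drawn as a map of the corpus.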
Author Ruiz Fabo, Pablo
Poibeau, Thierry
Author_xml – sequence: 1
  givenname: Pablo
  orcidid: 0000-0002-4349-4835
  surname: Ruiz Fabo
  fullname: Ruiz Fabo, Pablo
– sequence: 2
  givenname: Thierry
  orcidid: 0000-0003-3669-4051
  surname: Poibeau
  fullname: Poibeau, Thierry
  organization: Lattice - Langues, Textes, Traitements informatiques, Cognition - UMR 8094
BackLink https://hal.science/hal-01915730 (View record in HAL)
CitedBy_id crossref_primary_10_1007_s11356_022_24553_w
crossref_primary_10_3389_fcomp_2024_1472512
ContentType Journal Article
Copyright Distributed under a Creative Commons Attribution 4.0 International License
Copyright_xml – notice: Distributed under a Creative Commons Attribution 4.0 International License
DBID AAYXX
CITATION
1XC
BXJBU
IHQJB
VOOES
DOA
DOI 10.46298/jdmdh.5044
DatabaseName CrossRef
Hyper Article en Ligne (HAL)
HAL-SHS: Archive ouverte en Sciences de l'Homme et de la Société
HAL-SHS: Archive ouverte en Sciences de l'Homme et de la Société (Open Access)
Hyper Article en Ligne (HAL) (Open Access)
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
DatabaseTitleList
CrossRef

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
Philosophy
EISSN 2416-5999
ExternalDocumentID oai_doaj_org_article_4b448e514fa645ec81ae8f8a57ad8870
oai:HAL:hal-01915730v2
10_46298_jdmdh_5044
IEDL.DBID DOA
ISSN 2416-5999
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue Data deluge: which skills for...
Keywords keyphrase extraction
manuscripts
corpus navigation
Jeremy Bentham
entity linking
Language English
License Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
LinkModel DirectLink
ORCID 0000-0002-4349-4835
0000-0003-3669-4051
OpenAccessLink https://doaj.org/article/4b448e514fa645ec81ae8f8a57ad8870
ParticipantIDs doaj_primary_oai_doaj_org_article_4b448e514fa645ec81ae8f8a57ad8870
hal_primary_oai_HAL_hal_01915730v2
crossref_primary_10_46298_jdmdh_5044
crossref_citationtrail_10_46298_jdmdh_5044
PublicationCentury 2000
PublicationDate 2019-03-06
PublicationDateYYYYMMDD 2019-03-06
PublicationDate_xml – month: 03
  year: 2019
  text: 2019-03-06
  day: 06
PublicationDecade 2010
PublicationTitle Journal of data mining and digital humanities
PublicationYear 2019
Publisher INRIA
Nicolas Turenne
Publisher_xml – name: INRIA
– name: Nicolas Turenne
SourceID doaj
hal
crossref
SourceType Open Website
Open Access Repository
Enrichment Source
Index Database
SubjectTerms [info.info-cl]computer science [cs]/computation and language [cs.cl]
[shs.langue]humanities and social sciences/linguistics
[shs.phil]humanities and social sciences/philosophy
Computation and Language
Computer Science
corpus navigation
entity linking
Humanities and Social Sciences
jeremy bentham
keyphrase extraction
Linguistics
Philosophy
Title Mapping the Bentham Corpus: Concept-based Navigation
URI https://hal.science/hal-01915730
https://doaj.org/article/4b448e514fa645ec81ae8f8a57ad8870
Volume Atelier Digit_Hum
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 2416-5999
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001548555
  issn: 2416-5999
  databaseCode: DOA
  dateStart: 20140101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 2416-5999
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001548555
  issn: 2416-5999
  databaseCode: M~E
  dateStart: 20140101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre
linkProvider Directory of Open Access Journals