DATABOOK : a standardised framework for dynamic documentation of algorithm design during Data Science projects

Bibliographic Details
Published in: IASSIST quarterly, Vol. 45, no. 2
Main Author: Nesvijevskaia, Anna
Format: Journal Article
Language: English
Published: International Association for Social Science Information Service and Technology 26.09.2021
ISSN: 0739-1137 (print), 2331-4141 (online)
Abstract: This paper proposes a standard documentation framework for Data Science projects, called Databook. It is a result of five years of action-research on multiple projects in several sectors of activity in France, and of a confrontation of standard theoretical Data Science processes, such as CRISP_DM, with the reality of the field. As a vector for knowledge sharing and capitalisation, the Databook has been identified as one of the main facilitators of Human Data Mediation. Transformed into an operational prototype of simple and minimalist documentation, it has since been tested on about a hundred Data Science projects, has proven its benefits for the internal and external efficiency of Data Science projects, and can be turned into a more ambitious standard framework for data patrimony valorisation and data quality governance.
CitedBy_id crossref_primary_10_1017_dap_2021_3
crossref_primary_10_1177_09610006251342811
ContentType Journal Article
Copyright Attribution - NonCommercial
DOI 10.29173/iq989
DatabaseName CrossRef
Hyper Article en Ligne (HAL)
HAL-SHS: Archive ouverte en Sciences de l'Homme et de la Société
HAL-SHS: Archive ouverte en Sciences de l'Homme et de la Société (Open Access)
Hyper Article en Ligne (HAL) (Open Access)
DOAJ Directory of Open Access Journals
Discipline Social Sciences (General)
Statistics
Computer Science
EISSN 2331-4141
ISSN 0739-1137
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License http://creativecommons.org/licenses/by-nc/4.0
Attribution - NonCommercial: http://creativecommons.org/licenses/by-nc
PublicationDate 2021-09-26
PublicationTitle IASSIST quarterly
PublicationYear 2021
Publisher International Association for Social Science Information Service and Technology
SubjectTerms Algorithm Transparency
Artificial Intelligence
Computer Science
Data Science
Documentation
Humanities and Social Sciences
Library and information sciences
Machine Learning
Project Process
Reproducibility
Statistics
Title DATABOOK : a standardised framework for dynamic documentation of algorithm design during Data Science projects
URI https://hal.science/hal-03356739
https://doaj.org/article/495da50c9574480087fdc950242c55f0
Volume 45