DATABOOK: a standardised framework for dynamic documentation of algorithm design during Data Science projects
| Published in: | IASSIST quarterly Vol. 45; no. 2 |
|---|---|
| Main Author: | Nesvijevskaia, Anna |
| Format: | Journal Article |
| Language: | English |
| Published: | International Association for Social Science Information Service and Technology, 26 September 2021 |
| ISSN: | 0739-1137, 2331-4141 |
| Abstract | This paper proposes Databook, a standard documentation framework for Data Science projects. It results from five years of action research on multiple projects across several sectors of activity in France, in which standard theoretical Data Science processes, such as CRISP-DM, were confronted with the reality of the field. As a vehicle for knowledge sharing and capitalisation, the Databook has been identified as one of the main facilitators of Human Data Mediation. Turned into an operational prototype of simple, minimalist documentation, it has since been tested on about a hundred Data Science projects, has proven its benefits for the internal and external efficiency of those projects, and could be developed into a more ambitious standard framework for data patrimony valorisation and data quality governance. |
|---|---|
| Author | Nesvijevskaia, Anna |
| BackLink | https://hal.science/hal-03356739 (record in HAL) |
| Copyright | Attribution - NonCommercial |
| DOI | 10.29173/iq989 |
| Discipline | Social Sciences (General); Statistics; Computer Science |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| License | Attribution - NonCommercial (http://creativecommons.org/licenses/by-nc/4.0) |
| SubjectTerms | Algorithm Transparency; Artificial Intelligence; Computer Science; Data Science; Documentation; Humanities and Social Sciences; Library and information sciences; Machine Learning; Project Process; Reproducibility; Statistics |
| URI | https://hal.science/hal-03356739 https://doaj.org/article/495da50c9574480087fdc950242c55f0 |