Rule-based spreadsheet data transformation from arbitrary to relational tables


Bibliographic details
Published in: Information Systems (Oxford), Volume 71; pp. 123-136
Main authors: Shigarov, Alexey O.; Mikhailov, Andrey A.
Format: Journal Article
Language: English
Published: Oxford: Elsevier Ltd, 1 November 2017
Elsevier Science Ltd
ISSN: 0306-4379, 1873-6076
Description
Summary:
• Spreadsheet data transformation can be considered as table understanding.
• The two-layered table object model represents arbitrary spreadsheet tables.
• Our rule-based language enables expressing consecutive steps of table understanding.
• Executing the rules recovers implicit table semantics (structure).
• A rule-set allows converting arbitrary tables of the same genre into databases.

The paper discusses issues of rule-based data transformation from arbitrary spreadsheet tables to a canonical (relational) form. We present a novel table object model and a rule-based language for table analysis and interpretation. The model is intended to represent both the physical (cellular) and the logical (semantic) structure of an arbitrary table during the transformation process. The language allows this process to be expressed as consecutive steps of table understanding, i.e. recovering implicit semantics. Both are implemented in our tool for spreadsheet data canonicalization. The presented case study demonstrates the use of the tool for developing a task-specific rule-set that converts data from arbitrary tables of the same genre (government statistical websites) to flat-file databases. The performance evaluation confirms the applicability of the implemented rule-set in accomplishing the stated objectives of the application.
DOI: 10.1016/j.is.2017.08.004