Can a domain-specific language improve program structure comprehension of data pipelines? A mixed-methods study

In many application domains, domain-specific languages can allow domain experts to contribute to collaborative projects more correctly and efficiently. To do so, they must be able to understand program structure from reading existing source code. With high-quality data becoming an increasingly impor...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Empirical software engineering : an international journal Jg. 31; H. 1; S. 17
Hauptverfasser: Heltweg, Philip, Schwarz, Georg-Daniel, Riehle, Dirk
Format: Journal Article
Sprache:Englisch
Veröffentlicht: New York Springer US 01.02.2026
Springer Nature B.V
Schlagworte:
ISSN:1382-3256, 1573-7616
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In many application domains, domain-specific languages can allow domain experts to contribute to collaborative projects more correctly and efficiently. To do so, they must be able to understand program structure from reading existing source code. With high-quality data becoming an increasingly important resource, the creation of data pipelines is an important application domain for domain-specific languages. We execute a mixed-method study consisting of a controlled experiment and a follow-up descriptive survey among the participants to understand the effects of a domain-specific language on bottom-up program understanding and generate hypotheses for future research. During the experiment, participants ( ) need the same time (Wilcoxon signed-rank test, , , RBC = .093) to solve program structure comprehension tasks, but submit significantly more correct solutions (McNemar’s test, , , OR = 4.8) when using the domain-specific language. In the descriptive survey, participants describe reasons related to the programming language itself, such as a better pipeline overview, more enforced code structure, and a closer alignment to the mental model of a data pipeline. In addition, human factors such as less required programming experience and the ability to reuse experience from other data engineering tools are discussed. Based on these results, domain-specific languages are a promising tool for creating data pipelines that can increase correct understanding of program structure and lower barriers to entry for domain experts. Open questions exist to make more informed implementation decisions for domain-specific languages for data pipelines in the future.
Bibliographie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1382-3256
1573-7616
DOI:10.1007/s10664-025-10746-7