A project-based learning framework for teaching distributed data processing
| Title: | A project-based learning framework for teaching distributed data processing |
|---|---|
| Authors: | Rashid Turgunbaev |
| Source: | Technical Science Integrated Research; Vol. 1 No. 5 (2025): Technical Science Integrated Research; 3-7; 3051-3855 |
| Publisher Information: | Technical Science Integrated Research |
| Publication Year: | 2025 |
| Subject Terms: | distributed data processing, project-based learning, big data education, apache spark, computational pedagogy, data engineering |
| Description: | The rapid ascent of big data technologies has fundamentally reshaped the computational landscape, creating a significant demand for a workforce proficient in distributed data processing. Traditional pedagogical methods in computer science, which often emphasize discrete algorithmic problems and localized execution environments, are increasingly misaligned with the practical, systems-oriented challenges inherent in this domain. This article proposes a comprehensive project-based learning framework designed specifically for teaching distributed data processing. The framework moves beyond theoretical exposition and simple syntax tutorials, instead situating learning within the context of a sustained, complex, and authentic project that mirrors the realities of data engineering in industry and research. We argue that this approach is not merely beneficial but essential for cultivating a deep, integrated understanding of concepts such as parallelization, fault tolerance, and cluster resource management. The article details the core principles of the framework, outlines a phased implementation strategy, discusses the challenges of managing a distributed systems classroom, and presents a qualitative analysis of the competencies developed. The primary thesis is that by grappling with the entire data lifecycle - from ingestion and storage to processing and analysis - within a project-based paradigm, students develop the robust technical skills and, more critically, the systemic problem-solving mindset required to navigate the complexities of modern data infrastructure. |
| Document Type: | article in journal/newspaper |
| File Description: | application/pdf |
| Language: | English |
| Relation: | https://altumnova.com/index.php/tsir/article/view/25/22; https://altumnova.com/index.php/tsir/article/view/25 |
| Availability: | https://altumnova.com/index.php/tsir/article/view/25 |
| Rights: | https://creativecommons.org/licenses/by/4.0 |
| Accession Number: | edsbas.4F52FB06 |
| Database: | BASE |