Concurrent Processing of Retail Data in Python to Optimize Runtime

Bibliographic Details
Title: Concurrent Processing of Retail Data in Python to Optimize Runtime
Authors: Slavin, Bobby
Source: Data Science Undergraduate Honors Theses
Publisher Information: ScholarWorks@UARK
Publication Year: 2024
Collection: University of Arkansas: ScholarWorks@UARK
Subject Terms: Concurrent Processing, Multiprocessing, Multithreading, Data Science
Description: This thesis explores the application of multiprocessing and multithreading techniques in Python to reduce the runtime of retail data analysis. As the volume of retail data processed by a program increases, so does the program's runtime. When processing runs on a single core, even a gigabyte of data can take upwards of half an hour to process, while larger datasets of 100 GB or more could take days, heavily limiting the amount of retail data that can be processed in a reasonable amount of time. This study evaluates the efficacy and feasibility of multithreading and multiprocessing architectures in Python, a programming language commonly used in data analysis, for reducing the runtime of retail data processing to more manageable levels. The results show the importance of concurrent computing paradigms in addressing the computational challenges posed by ever-expanding volumes of retail data.
Document Type: text
File Description: application/pdf
Language: unknown
Relation: https://scholarworks.uark.edu/dtscuht/12; https://scholarworks.uark.edu/context/dtscuht/article/1012/viewcontent/Bobby_Slavin_Undergraduate_Thesis___Practicum_Thesis_on_Concurrent_Processing_of_Retail_Data_to_Optimize_Runtime.pdf
Availability: https://scholarworks.uark.edu/dtscuht/12
https://scholarworks.uark.edu/context/dtscuht/article/1012/viewcontent/Bobby_Slavin_Undergraduate_Thesis___Practicum_Thesis_on_Concurrent_Processing_of_Retail_Data_to_Optimize_Runtime.pdf
Accession Number: edsbas.A4E3DA7
Database: BASE
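This record does not include the thesis's code, so the following is only a minimal sketch of the general idea the abstract describes: splitting retail records into chunks and aggregating them concurrently in Python. The function names (`chunked`, `partial_totals`, `total_sales_by_store`), the `(store_id, amount)` row layout, and the default chunk size are hypothetical choices made for illustration, not the author's method.

```python
from collections import Counter
# ProcessPoolExecutor is imported so it can be passed in as executor_cls.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def chunked(rows, size):
    """Split a list of rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_totals(chunk):
    """Sum sales per store for one chunk; each worker runs this independently."""
    totals = Counter()
    for store_id, amount in chunk:  # assumed row layout: (store_id, amount)
        totals[store_id] += amount
    return totals


def total_sales_by_store(rows, workers=4, chunk_size=1000,
                         executor_cls=ThreadPoolExecutor):
    """Aggregate rows concurrently, then merge the per-chunk partial results."""
    merged = Counter()
    with executor_cls(max_workers=workers) as pool:
        # map() hands one chunk to each free worker and yields results in order.
        for part in pool.map(partial_totals, chunked(rows, chunk_size)):
            merged.update(part)  # Counter.update adds amounts per store
    return dict(merged)
```

Passing `executor_cls=ProcessPoolExecutor` switches the same function to separate processes. For CPU-bound aggregation in standard CPython, processes are usually the option that can use several cores, since the global interpreter lock limits threads running pure-Python code; threads tend to help more when the work is dominated by I/O. When using processes, the call should sit under an `if __name__ == "__main__":` guard.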