Impact of Different Data Management Frameworks on Common Data Management Tasks in Information System (R Language Perspective)

Bibliographic Details
Title: Impact of Different Data Management Frameworks on Common Data Management Tasks in Information System (R Language Perspective)
Authors: Masood H. Siddiqui, Aanchal A. Awasthi, Anant Prakash Awasthi, Niraj Kumar Singh
Source: International Journal of Intelligent Systems and Applications in Engineering; Vol. 12 No. 21s (2024); 1711-1720
Publisher Information: International Journal of Intelligent Systems and Applications in Engineering, 2024.
Publication Year: 2024
Subject Terms: Memory Management in R, Performance in R, Native R, Tidyverse, Data.Table
Description: Effective data management is essential for maximizing the value of data processing and analysis. It ensures that data is efficiently processed, readily accessible, secure, and well organized, which enhances data integrity, reduces redundancy, and speeds up decision-making. In an era where data is a valued asset that drives innovation and strategic decision-making, effective data management techniques are essential. Joining and sorting are two essential data management activities for improving data processing. Joining combines datasets on common characteristics, which makes thorough analysis easier, and well-sorted data improves search and retrieval. Together, these operations enhance the accuracy and speed of data processing, simplify workflows, and enable sound decision-making. Database management systems depend on joining and sorting to create value, extract significant insights, and identify trends in massive datasets. When merging data in R, the performance of native R, tidyverse, and data.table varies. Native R is versatile but may lag on large datasets. Tidyverse, known for its readability, strikes a balance between performance and simplicity. data.table, because of its exceptional speed, is a very effective option for large-scale joins. The choice depends on the complexity and size of the dataset: data.table is the best option for maximum performance, particularly for complex and large-scale jobs, while native R and tidyverse work well with smaller, more manageable datasets when code readability is crucial. Each approach addresses particular requirements in R data analysis. Similarly, native R, tidyverse, and data.table behave differently when sorting data in R. Native R provides a standard method, but it may be less effective with larger datasets. Tidyverse's user-friendly syntax prioritizes readability, but it may not be as fast as more efficient options. Once more, data.table runs faster and uses less memory than the alternatives when sorting large amounts of data. The choice depends on the needs of the analysis: data.table for best performance, especially with large datasets and computationally intensive tasks; tidyverse for readability; and native R for simplicity. In summary, effective data management is essential for businesses to fully utilize their data and make wise decisions, and optimizing data processing and analysis requires careful consideration of joining, sorting, and tool selection.
Document Type: Article
File Description: application/pdf
Language: English
ISSN: 2147-6799
Access URL: https://www.ijisae.org/index.php/IJISAE/article/view/5709
Rights: CC BY SA
Accession Number: edsair.issn21476799..d1e5b09c44d9d149c461fefb9f76c5d3
Database: OpenAIRE
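Illustration: the abstract compares joining and sorting in native R, tidyverse, and data.table without showing the calls involved. The following is a minimal sketch of what the three idioms might look like, not the authors' benchmark code. The tables `orders` and `customers` and their columns are hypothetical, and the sketch assumes the `dplyr` and `data.table` packages are installed.

```r
# Illustrative sketch only: toy data, not the paper's benchmark.
library(dplyr)       # tidyverse join and sort verbs
library(data.table)  # data.table join and sort

# Hypothetical example tables sharing the key column `id`.
orders    <- data.frame(id = c(3L, 1L, 2L, 1L), amount = c(30, 10, 20, 15))
customers <- data.frame(id = c(1L, 2L, 3L), region = c("N", "S", "E"))

# Native R: merge() performs the join, order() gives the sort permutation.
base_out <- merge(orders, customers, by = "id")
base_out <- base_out[order(base_out$id, base_out$amount), ]

# tidyverse (dplyr): inner_join() for the join, arrange() for the sort.
tidy_out <- orders %>%
  inner_join(customers, by = "id") %>%
  arrange(id, amount)

# data.table: merge() on data.tables for the join;
# setorder() sorts in place (by reference) rather than copying.
dt_out <- merge(as.data.table(orders), as.data.table(customers), by = "id")
setorder(dt_out, id, amount)

# The three results should contain the same values in the same order.
stopifnot(
  identical(base_out$amount, tidy_out$amount),
  identical(base_out$amount, dt_out$amount)
)
```

On data this small the three approaches are indistinguishable in speed. Any performance comparison of the kind the abstract describes would need much larger inputs and a timing tool such as `system.time()`.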