To Ameliorate Classification Accuracy using Ensemble Distributed Decision Tree (DDT) Vote Approach: An Empirical discourse of Geographical Data Mining

Bibliographic Details
Published in: Procedia Computer Science, Vol. 184, pp. 935-940
Main Authors: Fayaz, Sheikh Amir; Zaman, Majid; Butt, Muheet Ahmed
Format: Journal Article
Language: English
Published: Elsevier B.V., 2021
ISSN: 1877-0509
Description
Summary: Weather data for Kashmir province comprise six attributes recorded at three substations. This paper proposes a distributed decision tree algorithm and applies it to historical geographical data from the province. A single decision tree trained on the full Kashmir dataset achieves an accuracy of 81.54%. The distributed variant builds multiple trees from partitions of the original dataset, with the data segregated by substation (42026, 42027 and 42044). The three partitions hold 32.38%, 34.19% and 33.42% of the data respectively, a near-even split that suits parallel processing. The Distributed Decision Tree produces a number of sub-trees determined by the number of partitions of the input dataset, then combines their outputs by collecting votes or averaging the predictions. In this paper, we implement the hard-voting approach to measure the overall performance of the n trees in a distributed environment. The empirical results show that the distributed decision tree approach does not improve overall accuracy compared with a tree trained on the original, unpartitioned dataset.
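The partition-then-vote scheme described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record fields, the class labels ("rain", "dry") and the function names `partition_by_station` and `hard_vote` are hypothetical, and the per-tree predictions are hard-coded rather than produced by trained decision trees. Only the substation IDs come from the summary.

```python
from collections import Counter, defaultdict


def partition_by_station(records, key="station"):
    """Group records into one partition per substation ID."""
    parts = defaultdict(list)
    for rec in records:
        parts[rec[key]].append(rec)
    return dict(parts)


def hard_vote(predictions):
    """Return the majority label for each sample.

    predictions: one list of labels per tree, all of equal length.
    On a tie, Counter.most_common returns the label that appeared
    first among the votes (Python 3.7+).
    """
    voted = []
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical records; only the station IDs are taken from the summary.
records = [
    {"station": 42026, "label": "rain"},
    {"station": 42027, "label": "dry"},
    {"station": 42044, "label": "rain"},
    {"station": 42026, "label": "dry"},
]
partitions = partition_by_station(records)

# Hypothetical predictions from three trees (one per partition)
# on the same three test samples.
tree_predictions = [
    ["rain", "dry", "rain"],
    ["rain", "rain", "dry"],
    ["dry", "rain", "dry"],
]
final = hard_vote(tree_predictions)
print(sorted(partitions), final)
```

In a full pipeline, each partition would train its own decision tree, and `tree_predictions` would hold each tree's output on a shared test set before the vote is taken.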
DOI: 10.1016/j.procs.2021.03.116