Application of supervised and unsupervised algorithms to find the important features related to barley ('Hordeum vulgare' L.) grain yield: A new vista in data mining
| Published in: | Australian Journal of Crop Science Vol. 8; no. 12; pp. 1590 - 1596 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Lismore, N.S.W: Southern Cross Publishers, 01.12.2014 |
| ISSN: | 1835-2693 |
| Summary: | Data mining methods are useful tools for crop physiologists searching large datasets for patterns in agronomic factors, and they may assist in selecting the most important features for an individual site and field. To find the main features contributing to barley grain yield (output), supervised and unsupervised algorithms, in the form of feature selection and attribute weighting, were run using SPSS Clementine 11.1 and Rapid Miner 5.0.001 software, respectively. The data presented in this study were collected from the literature on barley physiology in Iran available on the http://sid.ir website. A total of 10563 data points were extracted from the literature, comprising 21 features and 503 records. Ranking of features by feature selection indicated that, of 20 input features, 10 features, including culture type, location, irrigation regime, biological yield, nitrogen applied to the soil, rainfall amount, and genotype, had a value of 1.0 and were the most important features related to barley grain yield. A general linear model relating location to barley grain yield showed that Kermanshah, with 3721 kg/ha, differed significantly (p=0.01) from Badjgah, Sararood and Gachsaran under dryland farming. Across ten attribute weighting algorithms, 13 features had weights = 0.5, and biological yield, location, genotype, and culture type were the most important features related to grain yield, highlighted by 7, 6, 5 and 5 algorithms, respectively. Overall, feature classification by supervised and unsupervised algorithms can provide a comprehensive view of important features such as biological yield, location, culture type, irrigation regime, nitrogen applied and genotype, which contribute to grain yield improvement. |
| Bibliography: | Australian Journal of Crop Science, Vol. 8, No. 12, Dec 2014, 1590-1596 Informit, Melbourne (Vic) |
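
The summary reports how many attribute weighting algorithms highlighted each feature (for example, biological yield by 7 of 10). A minimal Python sketch of that kind of tally follows. It is an illustration only, not the study's procedure: the function name `count_highlights`, the `cutoff` parameter, the rule that a feature counts as highlighted when its weight reaches the cutoff, and all weights in `toy_weights` are assumptions made here, and none of the numbers come from the article.

```python
from collections import Counter


def count_highlights(weights_by_algorithm, cutoff=0.5):
    """Count, per feature, how many algorithms give it a weight >= cutoff.

    The cutoff rule is an assumption for illustration; the catalog summary
    does not spell out how a feature is judged to be highlighted.
    """
    counts = Counter()
    for weights in weights_by_algorithm.values():
        for feature, weight in weights.items():
            if weight >= cutoff:
                counts[feature] += 1
    return counts


# Hypothetical normalized weights (0..1) from three made-up weighting schemes.
# These values are illustrative only and are not taken from the study.
toy_weights = {
    "algo_a": {"biological yield": 1.0, "location": 0.7, "genotype": 0.2},
    "algo_b": {"biological yield": 0.9, "location": 0.4, "genotype": 0.6},
    "algo_c": {"biological yield": 0.5, "location": 0.8, "genotype": 0.1},
}

highlights = count_highlights(toy_weights)

# Rank features by number of highlighting algorithms, ties broken by name.
ranking = sorted(highlights.items(), key=lambda kv: (-kv[1], kv[0]))

for feature, n_algorithms in ranking:
    print(f"{feature}: highlighted by {n_algorithms} algorithm(s)")
```

Counting agreement across several weighting schemes, rather than trusting a single scheme, is one simple way to see which features are ranked highly regardless of the weighting method used.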