Identifying the influential features on the regional energy use intensity of residential buildings based on Random Forests
| Published in: | Applied Energy, Vol. 183, pp. 193-201 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Publisher: | Elsevier Ltd, 1 December 2016 |
| Subjects: | |
| ISSN: | 0306-2619, 1872-9118 |
| Summary: | • The influence of 171 different kinds of features was analyzed using Random Forests. • Average energy use intensities of 1322 regions were set as the regression target. • The model built by Random Forest has a lower MSE than Lasso and SVM. • An educational feature was found to be the most influential. • The study not only identifies the influential features but also matches the areas.
Efficient and effective city planning to improve the energy performance of residential buildings requires a clear understanding of the influential features. Previous studies modeling the relationships between influential features and energy consumption have several gaps and limitations, such as reliance on linear modeling and insufficient consideration of particular features. This study therefore investigates the influence of 171 possibly related features on the regional energy use intensity (EUI) of residential buildings using a non-linear regression algorithm, Random Forests (RF). New York City (NYC) was chosen as the study area because of data availability. The 171 features cover seven aspects: building, economy, education, environment, households, surrounding, and transportation. The average site EUI of the residential buildings in each Block Group (BG) was set as the dependent variable. The RF regression model was compared with models built using typical linear methods, such as Multiple Linear Regression and Lasso. The results show that the RF model achieved a lower mean squared error. In addition, the top 20 influential features were identified based on the out-of-bag estimation in RF. The results show that a lower percentage of well-educated people, a higher percentage of households heated by fuel oil, lower household income, and more residential complaints per capita are correlated with higher average site EUI in NYC. Related suggestions for improving energy performance in different regions are presented for the local government. |
|---|---|
| DOI: | 10.1016/j.apenergy.2016.08.096 |
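
The workflow described in the summary (fit a Random Forest regressor, compare its mean squared error with linear baselines, then rank features by influence) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, not the study's 1322 regions or 171 features; scikit-learn is assumed as the toolkit, which the record does not specify; and permutation importance on held-out data is used as a stand-in for the out-of-bag-based ranking mentioned in the summary, whose exact procedure is not given here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 600 hypothetical "regions" with 8 features.
# These are NOT the study's data; they only make the sketch runnable.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(600, 8))
# Hypothetical target: non-linear in feature 0, linear in feature 1, plus noise.
y = 5.0 * X[:, 0] ** 2 + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One non-linear model and two linear baselines (hyperparameters are arbitrary).
models = {
    "random_forest": RandomForestRegressor(
        n_estimators=200, oob_score=True, random_state=0
    ),
    "linear_regression": LinearRegression(),
    "lasso": Lasso(alpha=0.01),
}

# Fit each model and record its mean squared error on the held-out split.
mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse[name] = mean_squared_error(y_test, model.predict(X_test))

# Out-of-bag error: each training sample is predicted only by trees
# whose bootstrap sample did not contain it.
rf = models["random_forest"]
oob_mse = mean_squared_error(y_train, rf.oob_prediction_)

# Rank features by how much shuffling each one degrades held-out performance.
result = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
top_features = [int(i) for i in ranking[:3]]

print("Test MSE per model:", mse)
print("Random Forest out-of-bag MSE:", oob_mse)
print("Top features by permutation importance:", top_features)
```

Because the synthetic target depends on the square of feature 0, the linear baselines cannot capture that relationship, which loosely mirrors the record's motivation for using a non-linear method. With real regional data the gap between models, and the resulting feature ranking, would of course depend on the data.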