Hotel Guest Length of Stay Prediction Using Random Forest Regressor
| Title: | Hotel Guest Length of Stay Prediction Using Random Forest Regressor |
|---|---|
| Authors: | Yerik Afrianto Singgalen |
| Source: | Journal of Information Systems and Informatics, Vol 6, Iss 4, Pp 3016-3034 (2024) |
| Publisher information: | Asosiasi Perguruan Tinggi Informatika dan Komputer (APTIKOM) Sumsel, 2024. |
| Publication year: | 2024 |
| Subjects: | Electronic computers. Computer science, QA1-939, length of stay, random forest regression, predictive accuracy, operational optimization, machine learning in hospitality, feature importance, data-driven decision making, QA75.5-76.95, Mathematics |
| Description: | This research investigates the application of the Random Forest Regression model to predict the Length of Stay (LoS) of hotel guests, leveraging key features such as country, guest type, room type, and rating. The study addresses the need for precise forecasting to optimize resource allocation, improve operational efficiency, and support data-driven decision-making in the hospitality sector. The methodology involves data collection from a structured dataset of guest reviews, preprocessing through encoding categorical variables, converting target values into numeric forms, and standardizing features to ensure consistency and uniformity. The dataset is split into training (80%) and testing (20%) subsets, with hyperparameters such as n_estimators=100 and random_state=42 set to ensure stability and reproducibility during model training. The Random Forest Regression model demonstrated strong predictive performance, achieving an R-squared value of 0.85 and a Mean Absolute Error (MAE) of 1.06. Feature importance analysis identified "country" as the most significant variable (importance score: 0.5), followed by guest type (0.2), room type (0.15), and rating (0.15). The Predicted vs. Actual Plot and Error Distribution evaluation reveals that most errors cluster near zero, indicating high accuracy with minor deviations in extreme cases. These findings emphasize the model’s potential to enhance marketing strategies, optimize resource allocation, and improve guest satisfaction. This research offers a robust framework for integrating predictive analytics into hospitality operations, contributing to sustainable growth and competitive advantage in the industry. |
| Document type: | Article |
| ISSN: | 2656-4882 2656-5935 |
| DOI: | 10.51519/journalisi.v6i4.959 |
| Access URL: | https://doaj.org/article/ae5846dc3bc246aa90cbc199aa4cbec5 |
| Rights: | CC BY |
| Accession number: | edsair.doi.dedup.....61fcc9736de7d960288dba647aabc95d |
| Database: | OpenAIRE |
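
The workflow summarized in the abstract (encode categorical features, standardize, split 80/20, fit a random forest with `n_estimators=100` and `random_state=42`, then report R-squared, MAE, and feature importances) can be sketched roughly as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn, which the record does not name, and it uses synthetic data because the study's dataset is not part of this record. The column names, the ordinal encoding, the split seed, and the helper functions `make_synthetic_guests`, `build_model`, and `fit_and_evaluate` are all hypothetical.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Column order of the feature matrix; the names follow the abstract.
FEATURES = ["country", "guest_type", "room_type", "rating"]


def make_synthetic_guests(n_rows=500, seed=0):
    """Build made-up guest records. This is NOT the study's dataset."""
    rng = np.random.default_rng(seed)
    country = rng.choice(["ID", "AU", "SG", "NL"], size=n_rows)
    guest_type = rng.choice(["solo", "couple", "family", "group"], size=n_rows)
    room_type = rng.choice(["standard", "deluxe", "suite"], size=n_rows)
    rating = rng.uniform(1.0, 10.0, size=n_rows)

    # Hypothetical rule linking features to nights stayed, plus noise.
    base = np.select([country == "AU", country == "NL"], [5.0, 7.0], default=2.0)
    nights = base + 1.5 * (guest_type == "family") + rng.normal(0.0, 0.5, n_rows)
    nights = np.clip(nights, 1.0, None)

    # Object array so string and numeric columns can sit side by side.
    X = np.empty((n_rows, len(FEATURES)), dtype=object)
    for index, column in enumerate((country, guest_type, room_type, rating)):
        X[:, index] = column
    return X, nights


def build_model():
    """Encode categoricals, standardize all columns, then fit a forest."""
    encode = ColumnTransformer(
        [
            # Assumption: ordinal codes; the abstract only says "encoding".
            (
                "categorical",
                OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
                [0, 1, 2],
            ),
            ("numeric", "passthrough", [3]),
        ]
    )
    return Pipeline(
        [
            ("encode", encode),
            ("scale", StandardScaler()),
            # Hyperparameters as stated in the abstract.
            ("forest", RandomForestRegressor(n_estimators=100, random_state=42)),
        ]
    )


def fit_and_evaluate(X, y):
    """Split 80/20, fit the pipeline, and return test metrics."""
    # Assumption: the split seed is chosen here only for repeatability.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = build_model().fit(X_train, y_train)
    predictions = model.predict(X_test)
    importances = model.named_steps["forest"].feature_importances_
    return {
        "r2": r2_score(y_test, predictions),
        "mae": mean_absolute_error(y_test, predictions),
        "importances": dict(zip(FEATURES, importances)),
        "n_test": len(y_test),
    }


if __name__ == "__main__":
    X, y = make_synthetic_guests()
    print(fit_and_evaluate(X, y))
```

Because each categorical feature maps to a single encoded column here, the forest reports one importance score per original feature, in the same shape as the four scores quoted in the abstract. The numbers produced on synthetic data are not comparable to the reported R-squared of 0.85 or MAE of 1.06.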