An Intelligent System for Identifying Influential Words in Real-Estate Classifieds

Bibliographic details
Published in: Journal of Intelligent Systems, Volume 27, Issue 2, pp. 183-194
Main author: Abdallah Sherief
Format: Journal Article
Language: English
Published: De Gruyter, 1 April 2018
ISSN:0334-1860, 2191-026X
Description
Summary: This paper focuses on the problem of quantifying how certain words in a text affect, positively or negatively, some numeric signal. These words can lead to important decisions for significant applications such as E-commerce. For example, consider the corpus of real-estate classifieds, which we developed as a case study. Each classified has a description of a real-estate property, along with simple features such as the location and the number of bedrooms. The problem then is to identify which keywords influence the price of the property. Such identification is complicated due to the existence of simple features (numeric and nominal attributes) that also affect the price. In this research, we propose a two-stage regression model to solve this problem. To assess our contribution, we analyze, as a case study, four corpora of real-estate classifieds. The analysis shows that our model predicts the price of a real-estate unit more accurately using the accompanying text, compared to the prediction relying only on simple features. We also demonstrate the capability of our model to annotate (automatically) words that affect the price positively or negatively.
DOI:10.1515/jisys-2016-0100
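
Illustrative note: the summary names a two-stage regression model but this record does not describe its details. The sketch below shows one plausible reading of the idea, not the paper's actual method. It assumes that stage 1 fits the price on a single simple feature (number of bedrooms) by ordinary least squares, and that stage 2 scores each word by the mean stage-1 residual of the ads containing it. The function names (`fit_line`, `word_effects`) and the toy ads are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one numeric feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def word_effects(ads):
    """ads: list of (bedrooms, text, price). Returns {word: mean residual}."""
    xs = [bedrooms for bedrooms, _, _ in ads]
    ys = [price for _, _, price in ads]

    # Stage 1 (assumed): explain the price with the simple feature only.
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

    # Stage 2 (assumed): attribute what is left over to the words.
    # The score of a word is the mean residual of the ads that contain it.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (_, text, _), residual in zip(ads, residuals):
        for word in set(text.lower().split()):
            totals[word] += residual
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


# Hypothetical toy corpus: (bedrooms, description, price).
ads = [
    (1, "flat with pool", 130),
    (1, "flat needs renovation", 70),
    (2, "flat with pool", 230),
    (2, "flat needs renovation", 170),
    (3, "villa with pool", 330),
    (3, "villa needs renovation", 270),
]

effects = word_effects(ads)
for word in sorted(effects, key=effects.get, reverse=True):
    print(f"{word:12s} {effects[word]:+.1f}")
```

Separating the two stages is what keeps a word such as "villa" from being credited with a price difference that the bedroom count already explains; in this toy corpus only "pool" and "renovation" (and the filler words that always accompany them) end up with non-zero scores.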