ST4248 Project Group C4
About
This project aims to predict HDB resale prices and identify the key factors that affect them. We engineered additional features for each HDB flat and selected the most useful ones using an ensemble of feature selection methods. Six regression models were fitted to predict both price and price/sqm. Each model was evaluated and its important features were extracted. The results show that the XGBoost price/sqm model performs best on RMSE and adjusted R^2, and that MRTs, malls, remaining lease, and total resales in town are the top key features.
Introduction
Singapore Housing and Development Board (HDB) flats are resold at a wide range of prices. The resale price is affected by many factors, such as floor area and lease year. This project has two aims: prediction, where we predict HDB resale prices, and inference, where we analyze how each feature contributes to the resale price. The model and findings can then help property agents and HDB owners set a resale price that attracts buyers while still maximizing profit.
Dataset
The dataset is taken from the government resale flat data at data.gov.sg/dataset/resale-flat-prices, managed by the Singapore Housing and Development Board (HDB). It contains 4410 resale transactions from January and February 2023. Only these two months are used, to limit the influence of price changes over time on the data. The dataset has 11 variables: month, town, flat_type, block, street_name, storey_range, floor_area_sqm, flat_model, lease_commence_date, remaining_lease, and resale_price, which is the response variable.
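As a rough illustration of how the two response variables relate, the sketch below keeps only the rows from the two months used and derives price/sqm from resale_price and floor_area_sqm. The sample rows are made up, the `YYYY-MM` month format and the text format of remaining_lease are assumptions about the source file, and `load_resales` and `price_per_sqm` are hypothetical names rather than the project's own code.

```python
import csv
import io

# Made-up sample rows in the dataset's column layout (values are illustrative only).
SAMPLE_CSV = """month,town,flat_type,block,street_name,storey_range,floor_area_sqm,flat_model,lease_commence_date,remaining_lease,resale_price
2023-01,ANG MO KIO,3 ROOM,101,EXAMPLE AVE 1,04 TO 06,67.0,New Generation,1979,55 years 03 months,350000
2023-03,BEDOK,4 ROOM,202,EXAMPLE ST 2,10 TO 12,92.0,Model A,1985,61 years,480000
"""


def load_resales(handle, months=("2023-01", "2023-02")):
    """Keep rows from the given months and add a price_per_sqm field."""
    rows = []
    for row in csv.DictReader(handle):
        if row["month"] not in months:
            continue  # drop transactions outside the study window
        row["floor_area_sqm"] = float(row["floor_area_sqm"])
        row["resale_price"] = float(row["resale_price"])
        # Second response variable: price normalised by floor area.
        row["price_per_sqm"] = row["resale_price"] / row["floor_area_sqm"]
        rows.append(row)
    return rows


rows = load_resales(io.StringIO(SAMPLE_CSV))
```

With a real download, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an open file handle for the CSV.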
Further Observations
We performed EDA, feature engineering, and feature selection, and then fitted several models. Each step is documented on its own page, reachable from the navigation bar at the top of this page.
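To give a sense of what the feature engineering step might involve, the sketch below derives two location features of the kind reported in the conclusion: the distance to the nearest MRT station and the number of MRT stations nearby. It uses the haversine great-circle distance. The coordinates, the 1 km radius, and the names `haversine_km` and `mrt_features` are assumptions for illustration; the project pages describe the features actually used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mrt_features(flat, stations, radius_km=1.0):
    """Return nearest-station distance and count of stations within radius_km.

    `flat` and each entry of `stations` are (lat, lon) tuples in degrees.
    The 1 km default radius is an assumption, not taken from the project.
    """
    distances = [haversine_km(flat[0], flat[1], s[0], s[1]) for s in stations]
    return {
        "nearest_mrt_km": min(distances),
        "nearby_mrts": sum(d <= radius_km for d in distances),
    }


# Hypothetical coordinates, not real station locations.
flat = (1.3000, 103.8000)
stations = [(1.3000, 103.8000), (1.3050, 103.8000), (1.4000, 103.9000)]
features = mrt_features(flat, stations)
```

The same pattern would apply to malls by passing mall coordinates instead of station coordinates.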
Results
Metrics | Model | LR | EN | NN | RF | GBR | XGB |
---|---|---|---|---|---|---|---|
RMSE | price | 54316 | 52786 | 43526 | 46939 | 50644 | 38256 |
RMSE | price/sqm | 49426 | 49330 | 44579 | 42254 | 35003 | 34987 |
MAPE | price | 7.81% | 7.53% | 5.3% | 5.53% | 6.19% | 4.53% |
MAPE | price/sqm | 6.62% | 6.65% | 5.71% | 5.09% | 4.34% | 4.40% |
Adjusted R^2 | price | 87.51% | 89.77% | 92.42% | 91.91% | 87.38% | 94.14% |
Adjusted R^2 | price/sqm | 89.40% | 91.07% | 92.06% | 93.45% | 94.75% | 95.00% |
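
For reference, the three metrics in the table can be computed as in the sketch below, assuming their standard definitions. The numbers in the example are made up, and `n_features` stands for the number of predictors in the fitted model. This page does not state whether the price/sqm predictions were converted back to price before scoring, so the sketch makes no assumption about that step.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the response."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a percentage."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


def adjusted_r2(y_true, y_pred, n_features):
    """R^2 penalised for the number of predictors."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


# Made-up values for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
scores = {
    "rmse": rmse(y_true, y_pred),
    "mape": mape(y_true, y_pred),
    "adj_r2": adjusted_r2(y_true, y_pred, n_features=1),
}
```

Lower RMSE and MAPE are better, while a higher adjusted R^2 is better, which is why the table ranks models in opposite directions across its rows.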
Conclusion
We have collated the top 3 features (by importance) that add to price/sqm and the top 3 that subtract from it.
Add to overall price/sqm
- Remaining lease
- Total nearby MRTs
- Floor number > 20
Subtract from overall price/sqm
- Nearest MRT distance
- Nearest mall distance
- Total resales in town
In this project, we have shown that the price/sqm of HDB resale flats can be predicted accurately from the original and engineered features: the XGBoost price/sqm model reaches a MAPE of 4.40% and an adjusted R^2 of 95.00%. We have also inferred the most important features that add to and subtract from the HDB resale price. With this model, we hope that property agents and sellers will have a better benchmark for the price they should set when reselling an HDB unit.