Machine learning Bureau score for Home Lending in an American finance company

Authors

DOI:

https://doi.org/10.33448/rsd-v13i10.47092

Keywords:

Machine learning, Predictive modeling, Home lending, Credit bureau, LMI consumers, Credit risk, Mortgage application.

Abstract

Our client is a leading provider of mortgage financing, originating loans and lines of credit to consumers in the US. Currently, applicants submit personal information with their application, and a soft pull of their FICO score is requested. That score is used to evaluate the applicant’s creditworthiness and to determine conditional approval and the type of product available to the customer, including conventional, FHA, or other mortgage loans. After conditional approval, a formal application is initiated, and underwriters review the information to reach the final decision. For applications that fall below regulatory and business thresholds, the company intends to approve more applications and increase loan volume, and it expects that an enhanced credit assessment will increase the share of the Low-to-Moderate-Income (LMI) population able to obtain mortgage loans. Both aspects directly affect the firm’s reputation and profits, so they are of pressing importance to the company. This project aims to build an applicant-level, bureau-only score based on upgraded bureau internal attributes. The score will eventually serve as the basis for evaluating a customer’s credit risk before any loan structure or collateral information is considered. It will be used as a standalone score in the initial customer evaluation to identify better leads (mortgage inquiries for preapproval) and as an input to a future application-level model.

Published

2024-10-11

Issue

Section

Engineerings

How to Cite

Machine learning Bureau score for Home Lending in an American finance company. Research, Society and Development, [S. l.], v. 13, n. 10, p. e34131047092, 2024. DOI: 10.33448/rsd-v13i10.47092. Available at: https://ojs34.rsdjournal.org/index.php/rsd/article/view/47092. Accessed on: 28 Jun. 2025.