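Hyperparameter Tuning Untuk Meningkatkan Akurasi Model Random Forest pada Klasifikasi Risiko Kredit (Hyperparameter Tuning to Improve the Accuracy of a Random Forest Model for Credit Risk Classification)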

Nadea, Putri Nur Fauzi (2024) Hyperparameter Tuning Untuk Meningkatkan Akurasi Model Random Forest pada Klasifikasi Risiko Kredit. Undergraduate Thesis thesis, Institut Teknologi Telkom Purwokerto.

Abstract

Credit risk analysis and modeling with machine learning is particularly challenging because the data are complex, large, and composed of mixed feature types, and they contain outliers and imbalanced classes. This research addresses the problem with data mining techniques, specifically by building classification models with the Random Forest algorithm. The methodology is SEMMA (Sample, Explore, Modify, Model, and Assess), a widely used data mining methodology. Within SEMMA, the research focuses on feature engineering and hyperparameter tuning to improve model accuracy. Feature engineering, here oversampling and standardization, is used to improve data quality, while hyperparameter tuning explores combinations of Random Forest parameters using the Random Search and Grid Search methods. The dataset is credit risk data with 32,581 rows and 12 columns. The results show a significant increase in model accuracy after applying feature engineering and hyperparameter tuning. The best model combines feature engineering with hyperparameter tuning via Grid Search CV and reaches an accuracy of 97.94%, an increase of 5.38% over the baseline model.

Keywords: Classification, Random Forest, Hyperparameter Tuning, Feature Engineering, Imbalanced
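The workflow summarized in the abstract (oversampling, standardization, then a Grid Search over Random Forest hyperparameters) can be illustrated with a minimal scikit-learn sketch. This is not the thesis code: the synthetic data from `make_classification`, the random oversampling strategy, the two-parameter grid, and all variable names are assumptions chosen only to keep the example small and runnable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Synthetic imbalanced data standing in for the credit risk dataset (assumption).
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Random oversampling (assumed variant): duplicate minority-class rows,
# in the training split only, until both classes have the same count.
minority = y_train == 1
n_extra = int((~minority).sum() - minority.sum())
X_extra, y_extra = resample(
    X_train[minority], y_train[minority],
    replace=True, n_samples=n_extra, random_state=0,
)
X_bal = np.vstack([X_train, X_extra])
y_bal = np.concatenate([y_train, y_extra])

# Standardization: fit on the training data, reuse the same scaler on the test data.
scaler = StandardScaler().fit(X_bal)
X_bal_s = scaler.transform(X_bal)
X_test_s = scaler.transform(X_test)

# Grid Search with cross-validation over a small, illustrative grid (assumption).
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 8]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid, cv=3, scoring="accuracy",
)
search.fit(X_bal_s, y_bal)

# Accuracy of the best configuration on the untouched test split.
test_accuracy = search.score(X_test_s, y_test)
print(search.best_params_, round(test_accuracy, 4))
```

Replacing `GridSearchCV` with `RandomizedSearchCV` (passing `param_distributions` and `n_iter` instead of `param_grid`) would give the Random Search variant. Note that oversampling before cross-validation, as done here for brevity, lets duplicated rows appear in both training and validation folds, so the cross-validated score can be optimistic; the held-out test split is the more trustworthy estimate.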

Item Type:	Thesis (Undergraduate Thesis)
Subjects:	T Technology > T Technology (General)
Divisions:	Faculty of Informatics > Data Science
Depositing User:	repository staff
Date Deposited:	02 Oct 2024 06:47
Last Modified:	02 Oct 2024 06:47
URI:	http://repository.ittelkom-pwt.ac.id/id/eprint/11386
