Electronic Theses and Dissertation
Universitas Syiah Kuala
UNDERGRADUATE THESIS
HEART DISEASE PREDICTION USING AN ENSEMBLE TECHNIQUE BASED ON RANDOM FOREST AND CATBOOST
Author
Iftahul Fadhlan
Supervisors
Zahnur - 196905291994031002 - Supervisor I
Mahyus Ihsan - 197010051998021001 - Supervisor II
Student ID Number
2008107010024
Faculty & Study Program
Faculty of Mathematics and Natural Sciences (MIPA) / Informatics (S1) / PDDIKTI: 55201
Publisher
Banda Aceh: Fakultas MIPA, 2025
Language
Indonesian
Classification No.
005.1
Heart disease is one of the leading causes of death worldwide
and an important issue in global health. Early detection of the disease
is crucial for preventing more serious complications. An accurate
prediction system is therefore needed to detect potential heart disease.
This study aims to build a heart disease prediction model using
an ensemble technique that combines the Random Forest and
CatBoost algorithms. The research method covers data collection from open sources,
data processing, exploratory data analysis (EDA), data splitting, and
class balancing with the Synthetic Minority Over-sampling
Technique (SMOTE). The models were trained individually and as an ensemble,
then evaluated on accuracy and other performance metrics. The study
evaluates the performance of several heart disease prediction models, namely Random
Forest, CatBoost, and the ensemble model, by comparing accuracy on the training
and test data. The evaluation shows that the Random Forest model
performed best after feature selection based on feature
importance and training with the configuration: n_estimators = 500, max_depth = 25,
min_samples_split = 2, min_samples_leaf = 2, max_features = log2, class_weight =
balanced, random_state = 42. Although the CatBoost and ensemble models also
produced competitive results, Random Forest remained superior in accuracy and
performance stability.
Keywords: heart disease, random forest, catboost, ensemble.
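The Random Forest configuration listed above maps directly onto keyword arguments of scikit-learn's `RandomForestClassifier`. The sketch below is only an illustration and assumes scikit-learn was the library used, which the abstract does not state. The names `RF_PARAMS` and `build_random_forest` are hypothetical, and `min_samples_split` is written with scikit-learn's spelling.

```python
# Hyperparameters as reported in the abstract above.
RF_PARAMS = {
    "n_estimators": 500,
    "max_depth": 25,
    "min_samples_split": 2,
    "min_samples_leaf": 2,
    "max_features": "log2",
    "class_weight": "balanced",
    "random_state": 42,
}


def build_random_forest(params=RF_PARAMS):
    """Return an unfitted RandomForestClassifier built from `params`."""
    # Imported lazily so the configuration can be inspected
    # even where scikit-learn is not installed.
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**params)
```

Calling `build_random_forest().fit(X_train, y_train)` would then train the model, where `X_train` and `y_train` are placeholders for a feature matrix and label vector prepared elsewhere.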
Heart disease is one of the leading causes of death worldwide and remains a major global health issue. Early detection is crucial to prevent more serious complications and improve patient outcomes. This study aims to develop an accurate heart disease prediction model using an ensemble technique that combines the Random Forest and CatBoost algorithms. The research methodology includes data collection from open sources, data preprocessing, exploratory data analysis (EDA), data splitting, and class balancing using the Synthetic Minority Over-sampling Technique (SMOTE). The models were trained using both individual and ensemble approaches, then evaluated based on accuracy and other performance metrics. The results show that the Random Forest model achieved the best performance after feature selection using feature importance, and was trained with the configuration: n_estimators = 600, max_depth = 20, min_samples_split = 3, min_samples_leaf = 3, max_features = log2, class_weight = balanced, and random_state = 42. Although the CatBoost and ensemble models also delivered competitive results, the Random Forest model remained superior in terms of accuracy and performance stability.
Keywords: heart disease, random forest, catboost, ensemble.
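The abstract does not say how the Random Forest and CatBoost outputs are combined. One common option is soft voting, which averages the class probabilities predicted by each model and picks the most probable class. The sketch below illustrates that idea in plain Python under that assumption. The function `soft_vote`, its `weights` parameter, and the example probabilities are hypothetical and are not taken from the thesis.

```python
def soft_vote(prob_sets, weights=None):
    """Combine per-model class probabilities by weighted averaging.

    prob_sets: list with one entry per model; each entry is a list of
               per-sample probability rows, e.g. [[p_class0, p_class1], ...].
    weights:   optional per-model weights (defaults to equal weights).
    Returns (averaged_probabilities, predicted_labels).
    """
    n_models = len(prob_sets)
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("need one weight per model")
    total = sum(weights)

    averaged, labels = [], []
    # zip(*prob_sets) groups the rows that belong to the same sample.
    for rows in zip(*prob_sets):
        n_classes = len(rows[0])
        avg = [
            sum(w * row[c] for w, row in zip(weights, rows)) / total
            for c in range(n_classes)
        ]
        averaged.append(avg)
        # The predicted label is the index of the largest averaged probability.
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return averaged, labels


# Made-up probabilities for two samples from two models (illustration only).
rf_probs = [[0.8, 0.2], [0.4, 0.6]]
cb_probs = [[0.6, 0.4], [0.1, 0.9]]
avg_probs, predictions = soft_vote([rf_probs, cb_probs])
```

With fitted models that expose `predict_proba`, the inputs would be each model's probability output on the same test set, passed in the same sample order.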
HYBRID MACHINE LEARNING MODEL BASED ON SMOTEENN-SOFT VOTING ENSEMBLE AND SHAP ANALYSIS FOR STUNTING RISK PREDICTION (Nuwairy El Furqany, 2026)
SPAM COMMENT DETECTION ON YOUTUBE USING ENSEMBLE MACHINE LEARNING (Ahmad Faqih Al Ghiffary, 2025)
PERFORMANCE COMPARISON OF THE EXTREME GRADIENT BOOSTING (XGBOOST) AND RANDOM FOREST ALGORITHMS IN USED CAR PRICE PREDICTION (Rizka Nuzulia, 2024)
MODELING ELECTRICAL ENERGY CONSUMPTION IN PIDIE REGENCY USING THE RANDOM FOREST ALGORITHM AND GRID SEARCH CROSS-VALIDATION (Pondes, 2026)
PERFORMANCE COMPARISON OF RANDOM FOREST, SUPPORT VECTOR MACHINE, AND ADABOOST IN ARRHYTHMIA CLASSIFICATION (Muhammad Raja Al Sahhaf, 2025)