Electronic Theses and Dissertation

Universitas Syiah Kuala

UNDERGRADUATE THESIS (SKRIPSI)

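PERBANDINGAN ALGORITMA TEXTRANK DAN GENERATIVE ADVERSARIAL NETWORK DALAM MERINGKAS DATA TEKS UNTUK PEMBANGUNAN MODEL KLASIFIKASI DATA TEKS (A Comparison of the TextRank Algorithm and Generative Adversarial Networks in Summarizing Text Data for Building Text Classification Models)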

Author

Giyaddy Ilmi Alavan

Supervisors

Taufik Fuadi Abidin - 197010081994031002 - Supervisor I
Alim Misbullah - 198806032019031011 - Supervisor II

Student ID Number

1708107010018

Faculty & Study Program

Faculty of Mathematics and Natural Sciences (Fakultas MIPA) / Informatics (Bachelor's, S1) / PDDIKTI: 55201

Publisher

Banda Aceh: Fakultas MIPA - Informatika, 2022

The steadily growing size of text data makes building text classification models increasingly difficult, because it demands more computational resources. One of the most feasible ways to address this is to convert the text data into a shorter version that still preserves the information in the original text, namely through automatic text summarization. In this study, the training data was summarized using two different summarization methods: the TextRank algorithm and a Generative Adversarial Network (GAN) model. The resulting summaries were then used as the training corpus for three different classifiers in building text classification models: Naïve Bayes, Support Vector Machine (SVM), and Decision Tree. The tests found that the corpus summarized with the TextRank algorithm made model building up to 2-3 times faster for all three classifiers than the original, unsummarized data, even when the data size reached 20 times its original size. In terms of model accuracy, the corpus summarized by the GAN model improved accuracy over the original text data for two of the classifiers, Naïve Bayes and SVM, with accuracies of 95.979% and 97.678%, respectively. The exception was the Decision Tree classifier, which reached an accuracy of only 86.974%, slightly lower than the 89.710% of the Decision Tree model trained on the original text corpus. Based on these results, it can be concluded that every text classification model trained on a summarized corpus achieves an accuracy that comes close to, or even exceeds, that of the model built with the original training data, so text summarization can be applied as part of text preprocessing when building a text classification model.

English Abstract

The continually increasing size of text data makes building a text classifier more difficult because it consumes more computational resources. One of the most feasible approaches to overcome this is to convert the text data into a shorter version that retains its original information by using automatic text summarization. This research was conducted by summarizing the training data with two different methods, namely the TextRank algorithm and the Generative Adversarial Network (GAN) model. The resulting summaries were then used as training data by three different classifiers in building a text classifier, namely Naïve Bayes, Support Vector Machine (SVM), and Decision Tree. The test results found that the TextRank summary corpus is the best training corpus in terms of build time, because it shortens model development time by a factor of 2 to 3 for all three classifiers. In terms of model accuracy, the summary corpus produced by the GAN model increased accuracy over the original text data for two of the classifiers, Naïve Bayes and SVM, with accuracies of 95.979% and 97.678%, respectively. The exception is the Decision Tree classifier, which reached an accuracy of only 86.974%, slightly lower than the 89.710% of the same classifier trained on the original text corpus. Hence, it can be concluded that each text classifier trained on the summarized data has an accuracy that is close to, or even exceeds, that of the text classifier built with the original data. Thus, text summarization can be applied in the text preprocessing stage when building a text classifier.
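The abstract does not say how the TextRank summarizer was implemented, so the following is only a minimal sketch of extractive TextRank as it is commonly described: sentences become graph nodes, edges are weighted by word overlap, and a PageRank-style iteration ranks the sentences. The function names (`split_sentences`, `textrank_scores`, `summarize`), the `ratio` parameter, the regex-based tokenizer, and the damping value of 0.85 are all assumptions made for illustration, not details taken from the thesis.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased word tokens; no stemming or stop-word removal (assumption)."""
    return re.findall(r"\w+", sentence.lower())


def similarity(tokens_a, tokens_b):
    """Shared-word count normalised by the log of both sentence lengths."""
    if len(tokens_a) < 2 or len(tokens_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    overlap = len(set(tokens_a) & set(tokens_b))
    return overlap / (math.log(len(tokens_a)) + math.log(len(tokens_b)))


def textrank_scores(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence-similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


def summarize(text, ratio=0.5):
    """Keep the top-ranked fraction of sentences, in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    keep = max(1, int(len(sentences) * ratio))
    scores = textrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return " ".join(sentences[i] for i in sorted(top))


if __name__ == "__main__":
    doc = (
        "The cat sat on the warm mat. The cat slept on the mat all day. "
        "A warm mat makes the cat sleep. Stock prices fell sharply yesterday."
    )
    print(summarize(doc, ratio=0.5))
```

In a setup like the one the abstract describes, a function such as `summarize` would be applied to each training document before feature extraction, so the classifiers are trained on shorter inputs. The compression ratio is a free parameter here; the abstract does not report the value used.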

APA Citation Style

Alavan, Giyaddy Ilmi. (2022). PERBANDINGAN ALGORITMA TEXTRANK DAN GENERATIVE ADVERSARIAL NETWORK DALAM MERINGKAS DATA TEKS UNTUK PEMBANGUNAN MODEL KLASIFIKASI DATA TEKS. Banda Aceh: Fakultas MIPA - Informatika.

Chicago/Turabian Citation Style

Alavan, Giyaddy Ilmi. PERBANDINGAN ALGORITMA TEXTRANK DAN GENERATIVE ADVERSARIAL NETWORK DALAM MERINGKAS DATA TEKS UNTUK PEMBANGUNAN MODEL KLASIFIKASI DATA TEKS. Banda Aceh: Fakultas MIPA - Informatika, 2022.

MLA Citation Style