Electronic Theses and Dissertation

Universitas Syiah Kuala

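UNDERGRADUATE THESIS (SKRIPSI)

A PERFORMANCE COMPARISON OF INDOBERT AND DEEPSEEK FOR HUMOR DETECTION IN INDONESIAN-LANGUAGE TEXT (original title: PERBANDINGAN PERFORMA INDOBERT DAN DEEPSEEK DALAM PENDETEKSIAN HUMOR PADA TEKS BERBAHASA INDONESIA)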

Author

Najla Raihana Kamila

Supervisors

Rasudin - 197410011999031001 - Supervisor I
Laina Farsiah - 198902032022032004 - Supervisor II

Student ID Number

2108107010067

Faculty & Study Program

Faculty of Mathematics and Natural Sciences (Fakultas MIPA) / Informatics (S1, undergraduate) / PDDIKTI: 55201

Publisher

Banda Aceh: Fakultas MIPA, 2025

Abstract

Detecting humor in text is a distinct challenge in Natural Language Processing (NLP), particularly for low-resource languages such as Indonesian. Humor makes communication more effective, but too much of it can reduce the quality of message delivery. This study trains and evaluates the IndoBERT model for humor detection in Indonesian text and compares it with DeepSeek, a multilingual generative model. The dataset was built by web scraping YouTube comments containing humorous elements, which were then labeled and preprocessed before model training. The models were evaluated on accuracy, precision, recall, and F1-score. IndoBERT achieved an accuracy of 93% and an F1-score of 91%, with stable and balanced performance across both classes (humor and non-humor). However, the confusion matrix for a sample of prediction cases shows that DeepSeek detected more humor patterns that IndoBERT missed, which suggests that DeepSeek has potential for recognizing implicit and context-dependent humor. The study also produced Humorizer, an IndoBERT-based tool that calculates the percentage of humor in a given text. The tool is intended for public speakers, moderators, and other communication professionals who want to use humor in proportion.

Keywords: IndoBERT, DeepSeek, Humor Detection, NLP, Indonesian Text
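The four evaluation metrics named in the abstract can all be derived from the cells of a binary confusion matrix. The following is a minimal sketch of that calculation, not the thesis's own evaluation code: the function name `binary_metrics`, the `BinaryMetrics` container, the label encoding (1 = humor, 0 = non-humor), and the sample labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    # Confusion-matrix cells, treating `positive` as the humor class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels for illustration only (1 = humor, 0 = non-humor);
    # these are not data or results from the thesis.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Calling the function once with `positive=1` and once with `positive=0` gives per-class scores, which is one way to check whether performance is balanced across the humor and non-humor classes.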

Related Works

GENERATING INDONESIAN-LANGUAGE HUMOROUS TEXT USING GENERATIVE ARTIFICIAL INTELLIGENCE MODELS (Fatiya Humaira Yunaz, 2025)

A COMPARISON OF SVM, NAIVE BAYES, AND INDOBERT FOR HATE SPEECH DETECTION USING A MULTI-LABEL INDONESIAN-LANGUAGE DATASET (Ricky Bagestra, 2024)

A PERFORMANCE COMPARISON OF CNN AND INDOBERT FOR CLASSIFYING INDONESIAN-LANGUAGE NEWS HEADLINES AS HOAX OR TRUSTWORTHY (Nur Ulfah Atiqah, 2024)

PUBLIC SENTIMENT ANALYSIS TOWARD THE INDONESIAN NATIONAL FOOTBALL TEAM ON INSTAGRAM USING THE ROBERTA, DISTILBERT, AND INDOBERT ALGORITHMS (Rahmi Najla, 2025)

SENTIMENT ANALYSIS OF TWITTER USER REVIEWS AND ONLINE NEWS ARTICLES ON THE IMPACT OF CHATGPT IN EDUCATION (Muhammad Faris Adzkia, 2024)

APA Citation Style

Kamila, Najla Raihana. (2025). PERBANDINGAN PERFORMA INDOBERT DAN DEEPSEEK DALAM PENDETEKSIAN HUMOR PADA TEKS BERBAHASA INDONESIA. Banda Aceh: Fakultas MIPA.

Chicago/Turabian Citation Style

Kamila, Najla Raihana. PERBANDINGAN PERFORMA INDOBERT DAN DEEPSEEK DALAM PENDETEKSIAN HUMOR PADA TEKS BERBAHASA INDONESIA. Banda Aceh: Fakultas MIPA, 2025.

MLA Citation Style

Kamila, Najla Raihana. PERBANDINGAN PERFORMA INDOBERT DAN DEEPSEEK DALAM PENDETEKSIAN HUMOR PADA TEKS BERBAHASA INDONESIA. Banda Aceh: Fakultas MIPA, 2025. Print.