Penerapan Cosine Similarity Dalam Deteksi Plagiasi Dokumen Teks Bahasa Indonesia Berdasarkan Fitur N-Gram Words

Anggyta Hanani, Rusyadi (2022) Penerapan Cosine Similarity Dalam Deteksi Plagiasi Dokumen Teks Bahasa Indonesia Berdasarkan Fitur N-Gram Words. Undergraduate thesis, Institut Teknologi Telkom Purwokerto.


Abstract

Lecturers at Institut Teknologi Telkom Purwokerto have no general-purpose tool for detecting plagiarism in student assignments; the institute's library supports plagiarism detection only for the Final Project. Students sometimes copy assignments from classmates or from students in other classes. In a survey of 41 lecturers at the institute, 48.8% reported that they often found plagiarism in student assignments, and 73.3% strongly agreed that there should be a tool for detecting plagiarism in student assignments. This study aims to build a plagiarism detection model using the Cosine Similarity method with word N-Gram features. Cosine similarity measures how similar two documents are, while word N-Grams are features formed from sequences of N consecutive words. The data used to build the model are Indonesian text documents from research methodology course assignments, totaling 114 documents in doc, docx and pdf formats. The results show that Cosine Similarity with N-Gram features succeeded in detecting similarity between documents, and that the choice of N affects the model's performance in detecting document similarity. The similarity level obtained with N-Grams also depends on the documents that are entered.

Keywords: Cosine Similarity, N-Gram, Plagiarism, Text Mining
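
To illustrate the approach the abstract describes, here is a minimal Python sketch. It is not the thesis's implementation. The function names `word_ngrams` and `cosine_similarity`, the regex-based tokenization, the use of raw N-Gram counts as vector weights, and the two sample sentences are all assumptions made for this example; the thesis may use different preprocessing or weighting.

```python
import math
import re
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a text as a list of tuples.

    Tokenization here is an assumption: lowercase, then split on
    runs of word characters.
    """
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def cosine_similarity(text_a, text_b, n=2):
    """Cosine similarity between the raw n-gram count vectors of two texts."""
    counts_a = Counter(word_ngrams(text_a, n))
    counts_b = Counter(word_ngrams(text_b, n))
    # Dot product over the n-grams that occur in both texts.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[g] * counts_b[g] for g in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text too short to contain any n-gram
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences, not taken from the thesis dataset.
doc_a = "mahasiswa mengerjakan tugas metodologi penelitian"
doc_b = "mahasiswa mengerjakan tugas mata kuliah lain"

for n in (1, 2, 3):
    print(n, round(cosine_similarity(doc_a, doc_b, n), 4))
```

For this pair of sentences the score falls as N grows, because longer N-Grams require longer runs of identical wording to match. This is one way the choice of N can change the reported similarity.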

Item Type: Thesis (Undergraduate Thesis)
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Informatics > Informatics Engineering
Depositing User: pustakawan ittp
Date Deposited: 08 Jul 2022 09:18
Last Modified: 08 Jul 2022 09:18
URI: http://repository.ittelkom-pwt.ac.id/id/eprint/7417
