ADE, RIYANI (2019) PENERAPAN METODE WORD2VEC UNTUK MENDETEKSI KEMIRIPAN DOKUMEN [Application of the Word2vec Method to Detect Document Similarity]. Undergraduate thesis, Institut Telkom Purwokerto.
cover.pdf - Accepted Version (2MB)
Abstract.pdf - Accepted Version (417kB)
Abstrak.pdf - Accepted Version (432kB)
Bab i.pdf - Accepted Version (804kB)
Bab ii.pdf - Accepted Version (1MB)
Bab iii.pdf - Accepted Version (688kB)
Bab iv.pdf - Accepted Version (1MB), restricted to registered users only
Bab v.pdf - Accepted Version (370kB)
Daftar Pustaka.pdf - Accepted Version (497kB)
Abstract
Plagiarism is the act of taking part or all of another person's ideas, in the form of documents or texts, without citing the source. Plagiarism detection is therefore necessary to reduce plagiarism and preserve the originality of people's work. This research aims to detect the similarity of text documents using the Word2vec method and TF-IDF feature extraction, and to compare the values the two produce. The comparison corpus consists of 116 Indonesian-language abstracts. Preprocessing consists of case folding, tokenizing, stopword removal, and stemming. After preprocessing, the documents are weighted with TF-IDF and with Word2vec, and Cosine Similarity is then used to obtain a similarity percentage. In the results, similarity values with stemming applied were on average 5% higher than without stemming. Documents with a high level of similarity produce a similarity value above 50%, while documents with a low level of similarity, or that are not plagiarized, produce a similarity value below 30%. Based on the experimental results, Word2vec yields similarity values that are on average 28% higher than those from TF-IDF weighting.
Keywords: Cosine Similarity, Document, plagiarism, preprocessing, TF-IDF, Word2vec
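The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the stopword list, the sample sentences, the function names, and the `tf * log(N / df)` weighting variant are all assumptions made for illustration, and stemming is omitted because it would need an Indonesian stemmer. The same `cosine_similarity` function could in principle be applied to Word2vec-based document vectors, but how the thesis builds those vectors is not stated in the abstract.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; the thesis's list is not given).
STOPWORDS = {"yang", "dan", "di", "untuk", "dengan", "ini"}


def preprocess(text):
    """Case folding, tokenizing, and stopword removal (stemming omitted)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, not taken from the thesis corpus.
docs = [
    "Deteksi plagiarisme dokumen teks dengan metode Word2vec",
    "Metode Word2vec untuk deteksi plagiarisme dokumen teks",
    "Analisis jaringan komputer di kampus",
]
vectors = tfidf_vectors(docs)
sim_reordered = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"reordered pair: {sim_reordered:.2%}, unrelated pair: {sim_unrelated:.2%}")
```

In this toy setup the first two documents contain the same words in a different order, so a bag-of-words weighting such as TF-IDF treats them as identical, while the third shares no terms with the first.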
Item Type: | Thesis (Undergraduate Thesis) |
---|---|
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Industrial Engineering and Informatics > Informatics Engineering |
Date Deposited: | 26 Jun 2020 01:42 |
Last Modified: | 23 Apr 2021 06:44 |
URI: | http://repository.ittelkom-pwt.ac.id/id/eprint/5700 |