A new nearest neighbor-based framework for diabetes detection

Suyanto, Suyanto and Meliana, Selly and Wahyuningrum, Tenia and Khomsah, Siti (2022) A new nearest neighbor-based framework for diabetes detection. Expert Systems With Applications an International Journal, 199. pp. 1-10. ISSN 0957-4174

Official URL: https://www.sciencedirect.com/science/article/pii/...

Abstract

Diabetes is one of the deadliest and costliest diseases. Today, automatic diabetes detection systems are primarily developed using deep learning (DL) approaches, which give high accuracy in classifying patients into two classes: diabetic or non-diabetic. Unfortunately, DL is a high-complexity and unexplainable black-box model. This paper proposes a new nearest neighbor-based framework to tackle those issues in classifying two diabetes datasets: the binary-class Pima Indians Diabetes Dataset (PIDD) and the multiclass Diabetes Type dataset. A k-means clustering (KMC) is first carried out to remove noise and outliers and keep the competent data in the training set. The dimension of the competent data is then reduced using an autoencoder (AE) to minimize intra-class distances while maximizing inter-class distances. A k-nearest neighbor (KNN) classifier and two variants, the pseudo nearest neighbor rule (PNNR) and the local mean-based pseudo nearest neighbor (LMPNN), are used to detect diabetes. In addition, a new variant named multi-voter multi-commission nearest neighbor (MVMCNN) is introduced. An investigation based on 5-fold cross-validation (FCV) shows that, for the binary-class PIDD, the proposed combination of KMC, AE, and MVMCNN achieves the highest accuracy of 99.13%, which is slightly higher than the 98.07% produced by the state-of-the-art DL-based detection model. An evaluation based on 10-FCV also indicates that, for the multiclass Diabetes Type dataset, it obtains an accuracy of 95.24%, higher than the 94.02% given by the DL-based model for predicting diabetes.
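
The three baseline classifiers named in the abstract (KNN, PNNR, and LMPNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the 1/rank weights (as these rules are commonly described in the literature), and the toy data are all assumptions made for illustration, and the KMC filtering, AE reduction, and MVMCNN steps are not reproduced because the abstract does not specify them.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Plain KNN: majority vote among the k training points closest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def pnnr_predict(train_X, train_y, query, k=3):
    """PNNR (assumed form): per class, sum the distances to its k nearest
    members weighted by 1/rank, then pick the class with the smallest sum.
    Note: a class with fewer than k members gets a shorter sum."""
    scores = {}
    for label in set(train_y):
        dists = sorted(
            euclidean(x, query) for x, y in zip(train_X, train_y) if y == label
        )
        scores[label] = sum(d / rank for rank, d in enumerate(dists[:k], start=1))
    return min(scores, key=scores.get)


def lmpnn_predict(train_X, train_y, query, k=3):
    """LMPNN (assumed form): like PNNR, but the i-th distance is measured from
    query to the mean of the class's i nearest members (a local mean)."""
    scores = {}
    for label in set(train_y):
        members = sorted(
            (x for x, y in zip(train_X, train_y) if y == label),
            key=lambda x: euclidean(x, query),
        )[:k]
        running = [0.0] * len(query)
        total = 0.0
        for rank, x in enumerate(members, start=1):
            running = [r + v for r, v in zip(running, x)]
            local_mean = [r / rank for r in running]
            total += euclidean(local_mean, query) / rank
        scores[label] = total
    return min(scores, key=scores.get)


# Hypothetical two-feature toy data: class 0 near (1, 1), class 1 near (5, 5).
TOY_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
TOY_Y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    for predict in (knn_predict, pnnr_predict, lmpnn_predict):
        print(predict.__name__, predict(TOY_X, TOY_Y, (1.1, 1.0)))
```

On well-separated data such as this toy set, all three rules agree; they tend to differ when classes overlap or outliers are present, which is the situation the noise-removal step in the abstract is meant to address.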

Item Type: Article
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Informatics
Depositing User: Tenia Wahyuningrum
Date Deposited: 06 Sep 2022 03:59
Last Modified: 06 Sep 2022 03:59
URI: http://repository.ittelkom-pwt.ac.id/id/eprint/7986
