Title
Diseases Classification System Using Data Mining and Machine Learning Algorithms /
Author
El Sayed, Omnia Hosny Mohamed.
Preparation committee
Researcher / امنية حسنى محمد السيد
Supervisor / رانيا احمد ابو السعود
Supervisor / اسلام عيد على محمد المغربى
Examiner / اسلام عيد على محمد المغربى
Subject
Qrmak
Publication date
2024
Number of pages
143 p.
Language
English
Degree
Master's
Specialization
Civil and Structural Engineering
Date of approval
8/2/2024
Place of approval
Fayoum University - Faculty of Engineering - Civil Engineering

Abstract

Disease diagnosis is an important task that must be performed with great accuracy. Medical data mining is increasingly applied to disease datasets to produce reliable, evidence-based information that doctors and researchers can trust. Classification is a medical data mining technique that assigns records or complex data items to predefined classes, whereas data preprocessing turns raw data into an efficient, usable format. Effective preprocessing improves data quality, which in turn can raise model quality and reliability. In medical science, diseases are frequently defined and classified to gain a more comprehensive understanding of their characteristics and implications. Diabetes, a widespread metabolic disease, affects more than 422 million people worldwide, primarily in low- and middle-income countries, and presents a significant public health challenge. It causes approximately 1.5 million deaths each year, highlighting the pressing need for accessible and affordable treatment, particularly insulin, to mitigate the risks associated with obesity and hindered development. The objective of this thesis is to develop a classification model that accurately identifies diabetes in patients from diagnostic data, and to explore methods that enhance the model's accuracy and effectiveness. The thesis introduces a new framework that combines deep learning with standard feature selection approaches to determine whether an individual has type 2 diabetes. A deep autoencoder is evaluated alongside, and compared with, conventional feature selection methods, including the Pearson correlation coefficient, Relief, recursive feature elimination, naïve Bayes, and support vector machine.
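
As a rough illustration of correlation-based feature selection, one of the conventional methods named above, the sketch below ranks features by the absolute value of their Pearson correlation with the class label and keeps the top few. It is not the thesis's implementation: the function names `pearson_r` and `rank_features`, the `top_k` parameter, and the toy values are all assumptions made for this example, and the values are not taken from the Pima dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, labels, top_k):
    """Return the top_k feature names ordered by |r| with the label (hypothetical helper)."""
    scores = {name: abs(pearson_r(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy, made-up values used only to exercise the functions.
toy_columns = {
    "glucose": [85, 90, 150, 160, 170, 95],
    "bmi": [22.0, 30.0, 24.0, 33.0, 31.0, 27.0],
    "constant": [1, 1, 1, 1, 1, 1],
}
toy_labels = [0, 0, 1, 1, 1, 0]

selected = rank_features(toy_columns, toy_labels, top_k=2)
print(selected)
```

In a hybrid setup like the one the abstract describes, the selected columns would presumably be combined with features learned by the autoencoder before classification; how the thesis combines them is not stated here.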
The features selected by every feature selection method were tested. The combination of features derived from the Pearson correlation coefficient and the deep autoencoder performed best. The study used the Pima Indians Diabetes dataset as its benchmark. The effectiveness of the proposed hybrid methodology is assessed by its effect on the accuracy of several classification algorithms, namely a neural network, XGBoost, naïve Bayes, a novel K-nearest neighbor, and stacking, in comparison with alternative feature selection strategies. An extensive statistical analysis was conducted on the input features to assess their variability and significance. SHapley Additive exPlanations (SHAP), an Explainable AI technique, is applied to identify and highlight the most significant features obtained by the Pearson correlation coefficient feature selection approach. The hybrid technique was competitive and outperformed the alternatives, achieving an accuracy of 83.7%, which underlines the importance of these features in determining the final result. In addition, the proposed system achieves precision and recall of 83.9% and 83.7%, respectively. Based on these results, the proposed model for diabetes classification shows potential as a useful tool in disease diagnosis. Its high precision and recall indicate that it can reduce both false positives and false negatives, helping patients receive accurate diagnoses and appropriate treatment.
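
To make the link between precision, recall, and the two error types concrete, the following minimal sketch computes the metrics from confusion counts. The abstract does not say how precision and recall were averaged across the two classes, so the sketch assumes the simplest case, scores for a single positive class. The function names and the toy labels are assumptions for illustration, not results from the thesis.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted diabetic, actually not
        elif truth == positive:
            fn += 1  # missed a diabetic patient
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Precision falls as false positives rise; recall falls as false negatives rise."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Toy, made-up labels (1 = diabetic, 0 = not diabetic).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
print(precision, recall, accuracy)
```

A reported recall equal to the accuracy, as in the figures above, is what class-weighted averaging of recall would produce, but that reading is an assumption rather than something the abstract states.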