The surge in machine learning (ML) and artificial intelligence has revolutionized medical diagnosis, utilizing data from chest ct-scans, COVID-19, lung cancer, brain tumor, and alzheimer parkinson diseases. However, the intricate nature of medical data necessitates robust classification models. This study compares support vector machine (SVM), naïve Bayes, k-nearest neighbors (K-NN), artificial neural networks (ANN), and stochastic gradient descent on multi-class medical datasets, employing data collection, Canny image segmentation, hu moment feature extraction, and oversampling/under-sampling for data balancing. Classification algorithms are assessed via 5-fold cross-validation for accuracy, precision, recall, and F-measure. Results indicate variable model performance depending on datasets and sampling strategies. SVM, K-NN, ANN, and SGD demonstrate superior performance on specific datasets, achieving accuracies between 0.49 to 0.57. Conversely, naïve Bayes exhibits limitations, achieving precision levels of 0.46 to 0.47 on certain datasets. The efficacy of oversampling and under-sampling techniques in improving classification accuracy varies inconsistently. These findings aid medical practitioners and researchers in selecting suitable models for diagnostic applications.
Keywords
Balancing; Machine learning; Medical images; Multiclass; Performance