Optimized Cancer Subtype Classification and Clustering Using Cat Swarm Optimization and Support Vector Machine Approach for Multi-Omics Data

Authors

  • Ali Mahmoud Ali, Informatics Institute for Postgraduate Studies, Iraqi Commission for Computers & Informatics, Baghdad, Iraq
  • Mazin Abed Mohammed

Keywords:

Cancer; Omics; Multi-omics; Cat Swarm Optimization; Cancer Subtype; K-Means

Abstract

No standard approach has been established for defining cancer subtypes, and the task is made harder by the high dimensionality of the data and the limited sample sizes. Adding multiple levels of omics data increases the dimensionality further, and interpreting the predictions of a machine learning (ML) model adds another layer of complexity. Some prior studies have not explained how certain characteristics affect the classification results. The aim of this work is to improve feature selection, and thereby the significance and accuracy of cancer subtype characterization, within a framework for clustering multi-omics data. We propose cat swarm optimization (CSO) feature selection to isolate the features relevant for prediction, K-means to cluster the dataset, and a nonlinear support vector machine (SVM) for multiclass classification. Performance is evaluated with standard ML measures: accuracy, F1-score, precision, and recall. The silhouette metric is then used to quantify the quality of the clusters generated from the shortlisted features. Before feature selection, the model achieved an accuracy of approximately 81%. After feature selection, the accuracy increased to 100%, surpassing the performance of current models. The silhouette metric also highlighted the effectiveness of the feature selection strategy, consistent with the marked improvement in classification accuracy. Beyond improving classification accuracy, the method enhances interpretability, one of the first steps toward understanding the molecular mechanisms that govern cancer subtypes. This work therefore establishes a foundation for further biological studies, contributing to the generation of new data and to advances in cancer research.
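
The pipeline described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data from `make_classification` as a stand-in for a multi-omics matrix, and replaces the full cat swarm algorithm with a simplified binary variant (greedy seeking mode, bit-copying tracing mode). The function names (`fitness`, `seek`, `trace`, `cat_swarm_select`, `evaluate`, `cluster_quality`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, silhouette_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a samples-by-features omics matrix (assumption, not real data).
X, y = make_classification(n_samples=150, n_features=40, n_informative=6,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           class_sep=2.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

def fitness(mask):
    """Mean 3-fold CV accuracy of an RBF SVM on the selected columns."""
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X_tr[:, mask], y_tr, cv=3).mean()

def seek(cat, n_copies=4, flip_rate=0.1):
    """Seeking mode (simplified): mutate copies of the cat, keep the fittest."""
    candidates = [cat]
    for _ in range(n_copies):
        flips = rng.random(cat.size) < flip_rate
        candidates.append(np.where(flips, ~cat, cat))
    return max(candidates, key=fitness)

def trace(cat, best, pull=0.5):
    """Tracing mode (simplified): copy each bit of the best cat with probability `pull`."""
    take = rng.random(cat.size) < pull
    return np.where(take, best, cat)

def cat_swarm_select(n_cats=6, n_iter=5, mixture_ratio=0.3):
    """Return a boolean feature mask found by the simplified swarm search."""
    cats = [rng.random(X_tr.shape[1]) < 0.5 for _ in range(n_cats)]
    best = max(cats, key=fitness)
    for _ in range(n_iter):
        for i in range(n_cats):
            tracing = rng.random() < mixture_ratio
            cats[i] = trace(cats[i], best) if tracing else seek(cats[i])
        best = max(cats + [best], key=fitness)
    return best

def evaluate(cols):
    """Fit a nonlinear (RBF) SVM on the training split and score the held-out split."""
    pred = SVC(kernel="rbf").fit(X_tr[:, cols], y_tr).predict(X_te[:, cols])
    return {
        "accuracy": accuracy_score(y_te, pred),
        "precision": precision_score(y_te, pred, average="macro", zero_division=0),
        "recall": recall_score(y_te, pred, average="macro", zero_division=0),
        "f1": f1_score(y_te, pred, average="macro", zero_division=0),
    }

def cluster_quality(cols):
    """Silhouette score of a 3-cluster K-means partition on the chosen columns."""
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X[:, cols])
    return silhouette_score(X[:, cols], labels)

all_cols = np.ones(X.shape[1], dtype=bool)
mask = cat_swarm_select()
scores_all, scores_sel = evaluate(all_cols), evaluate(mask)

print("selected features:", int(mask.sum()), "of", mask.size)
print("all features     :", scores_all, "silhouette", cluster_quality(all_cols))
print("selected features:", scores_sel, "silhouette", cluster_quality(mask))
```

Comparing the two printed rows mirrors the before-and-after comparison in the abstract. The numbers come from synthetic data and say nothing about the reported 81% and 100% figures.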

Published

18-12-2024

Issue

Vol. 5 No. 2 (2024)

Section

Articles

How to Cite

Ali, A. M., & Mohammed, M. A. (2024). Optimized Cancer Subtype Classification and Clustering Using Cat Swarm Optimization and Support Vector Machine Approach for Multi-Omics Data. Journal of Soft Computing and Data Mining, 5(2), 223-244. https://penerbit.uthm.edu.my/ojs/index.php/jscdm/article/view/18412