Optimized Cancer Subtype Classification and Clustering Using Cat Swarm Optimization and Support Vector Machine Approach for Multi-Omics Data
Keywords:
Cancer; Omics; Multi-omics; Cat Swarm Optimization; Cancer Subtype; K-Means
Abstract
No standard approach has been established for defining cancer subtypes. The task is challenging because the data are high-dimensional and sample sizes are limited. Adding multiple levels of omics data increases the dimensionality further, and interpreting the predictions of a machine learning (ML) model adds another layer of complexity. Some prior studies do not explain which characteristics of the data affect the classification results. The aim of this work is to improve feature selection, and thereby the significance and accuracy of cancer subtype characterization, within a framework for clustering multi-omics data. We propose cat swarm optimization for feature selection to isolate the features relevant for prediction, K-means for clustering the dataset, and a nonlinear support vector machine (SVM) for multiclass classification. Performance is evaluated using standard ML measures: accuracy, F1-score, precision, and recall. The silhouette metric is then used to quantify the quality of the clusters generated from the shortlisted features. In its initial use, before feature selection, the model achieved an accuracy of approximately 81%. After feature selection, the reported accuracy increased to 100%, surpassing the performance of current models. The silhouette metric highlighted the effectiveness of the feature selection strategy, which yielded a distinct improvement in classification accuracy. In addition to improving classification accuracy, the method enhances interpretability, one of the first steps toward understanding the molecular mechanisms that govern cancer subtypes. This work therefore provides a foundation for further biological studies, contributing to the generation of new data and to advances in cancer research.
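The evaluation measures named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names `macro_scores` and `silhouette` are hypothetical, macro-averaging across classes is an assumption (the abstract does not state which multiclass averaging was used), and Euclidean distance is assumed for the silhouette coefficient.

```python
import math
from collections import defaultdict


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def silhouette(points, cluster_ids):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        clusters[c].append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, c in enumerate(cluster_ids):
        own = [j for j in clusters[c] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != c
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels and points, invented purely for illustration.
    print(macro_scores([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]))
    print(silhouette([(0, 0), (0, 1), (10, 0), (10, 1)], [0, 0, 1, 1]))
```

A silhouette value close to 1 indicates compact, well-separated clusters, while values near 0 or below suggest overlapping or misassigned points. That is why the metric is a natural check on whether a reduced feature set still separates the subtypes.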
License
Copyright (c) 2024 Journal of Soft Computing and Data Mining

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.









