Maximal Covariance Complexity-Based Penalized Likelihood Method in High Dimensional Data
Isah Aliyu Kargi1, Norazlina Bint Ismail2, Ismail Bin Mohamad3

1Isah Aliyu Kargi*, Department of Mathematics, Faculty of Science, Universiti Teknologi Malaysia 81310 UTM Skudai, Johor, Malaysia; and Department of Mathematics and Statistics, Nuhu Bamalli Polytechnic, P.M.B. 1061, Zaria.
2Norazlina Bint Ismail, Department of Mathematics, Faculty of Science, Universiti Teknologi Malaysia 81310 UTM Skudai, Johor, Malaysia.
3Ismail Bin Mohamad, Department of Mathematics, Faculty of Science, Universiti Teknologi Malaysia 81310 UTM Skudai, Johor, Malaysia.
Manuscript received on June 02, 2020. | Revised Manuscript received on June 17, 2020. | Manuscript published on June 30, 2020. | PP: 24-32 | Volume-4, Issue-10, June 2020. | Retrieval Number: I0904054920/2020©BEIESP | DOI: 10.35940/ijmh.I0904.0641020
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Classification of cancer and selection of genes are among the most important applications of DNA microarray data. Because microarray data are high dimensional, classification and gene selection techniques are frequently employed to support professional systems in diagnosing cancer with higher classification precision. The least absolute shrinkage and selection operator (LASSO) is one of the most popular methods for cancer classification and gene selection in high dimensional data. However, LASSO has two limitations in gene selection and classification of high dimensional microarray data: its estimates are biased, and it cannot select more variables than the sample size (n). To address these problems, LASSO-C1F was proposed, using the scale-invariant measure of maximal information complexity of the covariance matrix, denoted C1F, with weight modifications as a data-adaptive alternative to the fairly arbitrary choice of the regularization term in LASSO. The results indicate that the proposed LASSO-C1F is more effective than the classical LASSO: on the evaluation criteria, it performs better in terms of AUC and the number of genes selected.
Keywords: Lasso, Maximal Complexity, Information Measure, Theoretic Measure, Penalized Likelihood Method, Scale-Invariant Complexity.
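
For orientation, the classical LASSO baseline that the abstract compares against can be sketched as an L1-penalized logistic regression. This is a minimal illustration only, not the authors' LASSO-C1F: the C1F-based weighting is not reproduced here. The function names (soft_threshold, lasso_logistic, auc), the proximal gradient solver, the omission of an intercept, the penalty value lam = 0.1, and the synthetic data with 40 samples and 200 "genes" are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=500):
    """L1-penalized logistic regression (no intercept) via proximal gradient descent."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))   # predicted class-1 probabilities
        grad = X.T @ (prob - y) / n                # gradient of the mean logistic loss
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Hypothetical high dimensional setting: far more variables (p) than samples (n).
rng = np.random.default_rng(0)
n, p = 40, 200
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:5] = 2.0                                # only the first 5 "genes" carry signal
y = (X @ true_beta + 0.5 * rng.standard_normal(n) > 0).astype(float)

beta_hat = lasso_logistic(X, y, lam=0.1)
selected = np.flatnonzero(beta_hat)                # indices of genes with nonzero coefficients
print("genes selected:", len(selected))
print("training AUC:", auc(X @ beta_hat, y))
```

The two printed quantities correspond to the evaluation criteria named in the abstract, AUC and number of genes selected, although here the AUC is computed on the training data only. Raising lam selects fewer genes, and choosing that value is the fairly arbitrary step for which the abstract proposes a data-adaptive alternative.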