Revamped Grey Wolf Optimized Feature Selection for Genomic Predictive Pattern Analytics

Marrynal S Eastaff and Dr. V. Saravanan

Big data analytics refers to extracting valuable information from large datasets. Data volumes in bioinformatics are growing significantly, which makes it difficult to extract useful data. Feature selection is therefore an essential step: it extracts the valuable information by selecting the relevant features from a big dataset for genomic predictive pattern analytics. Several techniques have been introduced for handling big gene expression data, but they have high complexity. In order to minimize the complexity while handling a large volume of data, a Revamped Meta-Heuristic Grey Wolf Optimization-based Feature Selection (RMHGWO-FS) technique is developed. The objective of the RMHGWO-FS technique is to select the optimal features from the gene expression dataset with high accuracy and in less time. The positions of 'n' wolves (i.e., features) are randomly initialized. The fitness value of each wolf is calculated from a similarity measure to identify the position of each individual grey wolf. The Jaccard index is used in the RMHGWO-FS technique to measure the similarity between the features and the genomic patterns. Based on the fitness values, the leadership order of the wolves is specified as alpha, beta and delta. The remaining candidate solutions are taken as omega. After the leadership order is found, the positions of the other wolves are updated with respect to the positions of the best three wolves (alpha, beta and delta). Then, a new fitness value is calculated from the new positions of the wolves. This process is repeated until a termination condition is met. In this way, the wolf with the highest fitness value is selected as the optimal solution, with higher accuracy and lower time complexity. An experimental evaluation of the proposed RMHGWO-FS technique and existing methods is carried out using a gene expression dataset. The results show that the proposed RMHGWO-FS technique obtains better results in terms of feature selection accuracy, false positive rate and time complexity. Based on these observations, the RMHGWO-FS technique is more efficient in predictive pattern analytics than the other methods.
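The loop described above (random initialization, Jaccard-based fitness, alpha/beta/delta ranking, position update, repeat) can be illustrated with a minimal Python sketch. It is not the authors' RMHGWO-FS implementation, and the abstract does not give its "revamped" details. The sketch uses the standard grey wolf update equations and makes several assumptions of its own: each wolf is a vector with one coordinate per feature, a feature counts as selected when its coordinate exceeds 0.5, each feature is represented as the set of sample indices in which it is "on", and fitness is the mean Jaccard index between the selected features and a target pattern. The names jaccard, fitness and gwo_select and all parameter values are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def fitness(position, feature_sets, target):
    """Mean Jaccard similarity between the selected features and the target.

    Assumption: a feature is selected when its coordinate exceeds 0.5.
    """
    chosen = [s for x, s in zip(position, feature_sets) if x > 0.5]
    if not chosen:
        return 0.0
    return sum(jaccard(s, target) for s in chosen) / len(chosen)


def gwo_select(feature_sets, target, n_wolves=8, n_iter=30, seed=0):
    """Standard grey wolf optimizer used as a feature selector (sketch)."""
    rng = random.Random(seed)
    dim = len(feature_sets)
    # Randomly initialize wolf positions in [0, 1]^dim.
    wolves = [[rng.random() for _ in range(dim)] for _ in range(n_wolves)]
    best, best_fit = None, -1.0
    for t in range(n_iter + 1):
        # Rank wolves by fitness; the top three are alpha, beta and delta.
        ranked = sorted(
            wolves,
            key=lambda w: fitness(w, feature_sets, target),
            reverse=True,
        )
        top_fit = fitness(ranked[0], feature_sets, target)
        if top_fit > best_fit:
            best, best_fit = list(ranked[0]), top_fit
        if t == n_iter:  # termination condition: iteration budget spent
            break
        leaders = ranked[:3]
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 toward 0
        updated = []
        for w in wolves:
            new_pos = []
            for j in range(dim):
                total = 0.0
                for lead in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * lead[j] - w[j])
                    total += lead[j] - A * D
                # Average the three leader-guided estimates, clamp to [0, 1].
                new_pos.append(min(1.0, max(0.0, total / len(leaders))))
            updated.append(new_pos)
        wolves = updated
    selected = [j for j, x in enumerate(best) if x > 0.5]
    return selected, best_fit


if __name__ == "__main__":
    # Toy data (hypothetical): samples 0..5, target pattern = samples {0, 1, 2}.
    features = [{0, 1, 2}, {3, 4, 5}, {0, 1}, {2, 3}]
    print(gwo_select(features, {0, 1, 2}))
```

Because this fitness is a mean over the selected features, it favours small subsets of highly similar features; a practical variant would likely add a term that rewards coverage or penalizes redundancy, which the abstract does not specify.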

Volume 11 | 10-Special Issue

Pages: 524-535

DOI: 10.5373/JARDCS/V11SP10/20192839