Feature selection aims to reduce the dimensionality of patterns for classificatory analysis by selecting the most informative features instead of irrelevant and/or redundant ones. In this study, two novel information-theoretic measures for feature ranking are presented: one is an improved formula to estimate the conditional mutual information between the candidate feature f_i and the target class C given the subset of selected features S, i.e., I(C; f_i | S), under the assumption that the information of features is distributed uniformly; the other is a mutual information (MI) based constructive criterion that is able to capture both irrelevant and redundant input features under arbitrary distributions of feature information. With these two measures, two new feature selection algorithms, called the quadratic MI-based feature selection (QMIFS) approach and the MI-based constructive criterion (MICC) approach, respectively, are proposed, in which no parameter like β in Battiti's MIFS and Kwak and Choi's MIFS-U methods needs to be preset. Thus, the intractable problem of how to choose an appropriate value for β to trade off the relevance to the target classes against the redundancy with the already-selected features is avoided completely. Experimental results demonstrate the good performance of QMIFS and MICC on both synthetic and benchmark data sets.
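For context on the β tradeoff that QMIFS and MICC are designed to eliminate, Battiti's MIFS baseline greedily scores each candidate feature by its relevance to the class minus a β-weighted penalty for redundancy with already-selected features, J(f_i) = I(C; f_i) - β Σ_{s∈S} I(f_i; s). Below is a minimal sketch of that baseline criterion (not of QMIFS or MICC themselves), assuming discrete-valued features; the function name mifs_rank is illustrative and not from the paper:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mifs_rank(X, y, k, beta=0.5):
    """Greedy MIFS ranking (Battiti): at each step select the feature
    maximizing I(C; f_i) - beta * sum_{s in S} I(f_i; s).

    X: (n_samples, n_features) array of discrete feature values
    y: (n_samples,) array of class labels
    k: number of features to select
    """
    n_features = X.shape[1]
    selected, remaining = [], list(range(n_features))
    # Relevance of each feature to the target class, I(C; f_i).
    relevance = np.array(
        [mutual_info_score(y, X[:, i]) for i in range(n_features)]
    )
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in remaining:
            # Redundancy penalty: MI with every already-selected feature.
            redundancy = sum(
                mutual_info_score(X[:, i], X[:, s]) for s in selected
            )
            score = relevance[i] - beta * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

The sketch makes the abstract's point concrete: the quality of the selected subset hinges on the preset β, since a small β admits redundant features while a large β discards relevant ones, which is precisely the tuning problem the proposed parameter-free criteria avoid.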