Data discretization contributes substantially to the induction of classification rules and trees by machine learning methods. Rough set theory is an effective tool for discretizing continuous information systems. This paper proposes a method that improves typical rough-set-based heuristic discretization algorithms in two ways: it uses decision information to reduce the size of the candidate cut set, and it measures cut significance more reasonably through a new concept, cut selection probability. Simulations show that, compared with other typical rough-set-based discretization algorithms, the proposed method discretizes continuous information systems more effectively. It improves the predictive accuracy of information systems while still conceptually preserving their consistency.
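The idea of using decision information to shrink the candidate cut set can be illustrated with a small sketch. This is not the paper's algorithm, whose details the abstract does not give. It is a common boundary-point filter, assumed here for illustration: a midpoint between two adjacent attribute values is kept only if the objects on either side do not all share one decision class. The function names `candidate_cuts` and `boundary_cuts` and the sample data are hypothetical, and the cut selection probability is not implemented because the abstract does not define it.

```python
from collections import defaultdict


def candidate_cuts(values):
    """Return every midpoint between consecutive distinct attribute values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def boundary_cuts(values, decisions):
    """Return only the midpoints that can separate different decision classes.

    Assumption (not taken from the paper): a cut between adjacent values
    a < b is dropped when every object with value a or b carries the
    same single decision class, since such a cut discerns nothing.
    """
    classes = defaultdict(set)
    for value, decision in zip(values, decisions):
        classes[value].add(decision)

    distinct = sorted(classes)
    cuts = []
    for a, b in zip(distinct, distinct[1:]):
        if len(classes[a] | classes[b]) > 1:
            cuts.append((a + b) / 2)
    return cuts


# Hypothetical one-attribute information system with a binary decision.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
decisions = ["no", "no", "no", "yes", "yes", "no"]

print(candidate_cuts(values))            # [1.5, 2.5, 3.5, 4.5, 5.5]
print(boundary_cuts(values, decisions))  # [3.5, 5.5]
```

In this toy example the filter reduces five candidate cuts to two, while the remaining cuts still separate every pair of objects with different decisions, so the consistency of the system is unaffected.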