User churn prediction is widely applied in banking, finance, telecommunications, and other fields. Effective prediction and analysis of user behavior helps companies stay competitive and understand rapidly changing market patterns. This paper studies the user churn problem with three hybrid data mining models, with the aim of building an accurate and efficient churn prediction model. The three models operate in two data mining stages: a clustering stage and a predictive analysis stage. In the first stage, the user data is filtered; in the second stage, user behavior is predicted. The first model combines the bisecting k-means algorithm for data filtering with a multilayer perceptron artificial neural network (MLP-ANN) for prediction. The second model combines hierarchical clustering with MLP-ANN for prediction. The third model uses Self-Organizing Maps (SOM) with MLP-ANN for prediction. The predictive analysis of all three models is based on real data; the churn rate computed by the hybrid models is compared with the actual values. The results show that the hybrid data mining models predict more accurately than common single models.
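The clustering stage of the first model can be illustrated with a minimal sketch of bisecting k-means used as a data filter. The abstract does not say how the clusters are used to filter records, so the rule below (dropping records that fall into very small clusters) is an assumption, as are the function names, the `min_fraction` threshold, the deterministic seeding, and the toy data. The MLP-ANN prediction stage is not shown; under this reading, the retained records would be passed on to train it.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    c = mean(points)
    return sum(dist(p, c) ** 2 for p in points)


def two_means(points, iters=20):
    """Split one cluster in two with plain 2-means.

    Seeding is deterministic (an illustrative choice): the point farthest
    from the mean, and the point farthest from that first seed.
    """
    c = mean(points)
    a = max(points, key=lambda p: dist(p, c))
    b = max(points, key=lambda p: dist(p, a))
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if dist(p, a) <= dist(p, b)]
        right = [p for p in points if dist(p, a) > dist(p, b)]
        if not left or not right:
            break
        new_a, new_b = mean(left), mean(right)
        if (new_a, new_b) == (a, b):  # assignments have stabilised
            break
        a, b = new_a, new_b
    return left, right


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        i = max(range(len(clusters)), key=lambda j: sse(clusters[j]))
        left, right = two_means(clusters[i])
        if not left or not right:  # cluster cannot be split further
            break
        clusters[i:i + 1] = [left, right]
    return clusters


def filter_small_clusters(clusters, min_fraction=0.2):
    """Assumed filtering rule: keep records only from clusters that hold
    at least min_fraction of all records."""
    total = sum(len(c) for c in clusters)
    return [p for c in clusters if len(c) >= min_fraction * total for p in c]


# Hypothetical user records (two made-up numeric features per user).
data = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
    (20.0, 20.0),  # an isolated record
]
clusters = bisecting_kmeans(data, k=3)
kept = filter_small_clusters(clusters)
print(sorted(len(c) for c in clusters), len(kept))
```

On this toy data the isolated record ends up in a cluster of its own and is removed by the filter, while the two dense groups are kept. In practice a library implementation of bisecting k-means and a real MLP classifier would replace these hand-written helpers.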