Studying cost-sensitive learning for multi-class imbalance in Internet traffic classification

Source: The Journal of China Universities of Posts and Telecommunica
Cost-sensitive learning has been applied to resolve the multi-class imbalance problem in Internet traffic classification, and it has achieved considerable results. However, the classification performance on minority classes with few bytes is still unsatisfactory, because existing research focuses only on the classes with a large amount of bytes. Therefore, the class-dependent misclassification cost is studied. First, the flow rate based cost matrix (FCM) is investigated. Second, a new cost matrix named weighted cost matrix (WCM) is proposed, which calculates a reasonable weight for each cost of FCM by considering the data imbalance degree and classification accuracy of each class. It is able to further improve the classification performance on the difficult minority class (the class with more flows but worse classification accuracy). Experimental results on twelve real traffic datasets show that FCM and WCM obtain more than 92% flow g-mean and 80% byte g-mean on average; on the test set collected one year later, WCM outperforms FCM in terms of stability.
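The abstract reports flow g-mean and byte g-mean and relies on class-dependent misclassification costs, but it does not define either. The following is a minimal sketch under common assumptions, not the paper's method: g-mean is taken as the geometric mean of per-class recall, the byte variant weights each flow by its byte count, and a prediction is made by choosing the class with the lowest expected cost. The function names, the toy data, and the 2x2 cost matrix are hypothetical; in particular, the cost matrix is neither FCM nor WCM, whose formulas are not given here.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred, weights=None):
    """Geometric mean of per-class recall (assumed definition of g-mean).

    With weights=None every flow counts once (flow g-mean). Passing the
    per-flow byte counts as weights gives a byte-weighted recall per class
    (byte g-mean).
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    total = defaultdict(float)    # weight of all flows whose true class is c
    correct = defaultdict(float)  # weight of correctly classified flows of c
    for t, p, w in zip(y_true, y_pred, weights):
        total[t] += w
        if t == p:
            correct[t] += w
    recalls = [correct[c] / total[c] for c in total]
    return math.prod(recalls) ** (1.0 / len(recalls))


def min_cost_class(probs, cost):
    """Return the index of the class with the lowest expected cost.

    probs[i] is the estimated probability of class i; cost[i][j] is the
    cost of predicting class j when the true class is i.
    """
    n = len(probs)
    expected = [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]
    return min(range(n), key=expected.__getitem__)


# Hypothetical toy data: four small "web" flows and two "p2p" flows.
y_true = ["web", "web", "web", "web", "p2p", "p2p"]
y_pred = ["web", "web", "web", "p2p", "p2p", "web"]
flow_bytes = [100, 100, 100, 100, 9000, 1000]

flow_g = g_mean(y_true, y_pred)              # recalls 0.75 and 0.5
byte_g = g_mean(y_true, y_pred, flow_bytes)  # recalls 0.75 and 0.9

# Hypothetical cost matrix: missing class 1 costs five times more than
# missing class 0, so class 1 is chosen even at probability 0.3.
chosen = min_cost_class([0.7, 0.3], [[0, 1], [5, 0]])
```

In this toy example the flow g-mean and byte g-mean differ because the misclassified "p2p" flow carries far fewer bytes than the correctly classified one, which illustrates why a classifier can look good on one measure and poor on the other.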