On Density-Based Data Streams Clustering Algorithms: A Survey

Source: Journal of Computer Science & Technology
Clustering data streams has drawn considerable attention in recent years because of their ever-growing presence. Data streams pose additional challenges for clustering, such as limited time and memory and the need for one-pass processing. Furthermore, discovering clusters of arbitrary shape is very important in data stream applications. Data streams are infinite and evolve over time, and the number of clusters is not known in advance. In a data stream environment, noise also appears occasionally owing to various factors. Density-based methods are a notable class of data stream clustering methods: they can discover clusters of arbitrary shape, detect noise, and do not require the number of clusters in advance. Because of these data stream characteristics, traditional density-based clustering is not directly applicable. Recently, many density-based clustering algorithms have been extended to data streams. The main idea of these algorithms is to use density-based methods in the clustering process while overcoming the constraints imposed by the nature of data streams. The purpose of this paper is to shed light on some of the algorithms in the literature on density-based clustering over data streams. We not only summarize the main density-based clustering algorithms for data streams and discuss their uniqueness and limitations, but also explain how they address the challenges of clustering data streams. Moreover, we investigate the evaluation metrics used to validate cluster quality and measure the algorithms' performance. It is hoped that this survey will serve as a stepping stone for researchers studying data stream clustering, particularly density-based algorithms.
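To make the general idea concrete, the following is a minimal illustrative sketch of one-pass, density-based clustering with fading, written in Python. It is not any specific algorithm from the survey: the grid summary, the class name `DecayingGridClusterer`, and the parameters `cell_size`, `decay`, and `dense_threshold` are all assumptions chosen for illustration. Each point is seen once and only per-cell densities are stored; adjacent dense cells are merged into clusters of arbitrary shape, and points in sparse cells are treated as noise, so the number of clusters is never specified.

```python
import math
from collections import deque


class DecayingGridClusterer:
    """Illustrative sketch only; all names and parameters are hypothetical."""

    def __init__(self, cell_size=1.0, decay=0.99, dense_threshold=3.0):
        self.cell_size = cell_size              # side length of a grid cell
        self.decay = decay                      # per-step fading factor in (0, 1]
        self.dense_threshold = dense_threshold  # minimum density of a dense cell
        self.cells = {}                         # cell -> (density, last update time)
        self.t = 0                              # number of points seen so far

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(math.floor(x / self.cell_size) for x in point)

    def insert(self, point):
        # One-pass update: fade the old density, then add the new point.
        self.t += 1
        cell = self._cell(point)
        density, last = self.cells.get(cell, (0.0, self.t))
        density = density * self.decay ** (self.t - last) + 1.0
        self.cells[cell] = (density, self.t)

    def _density_now(self, cell):
        density, last = self.cells[cell]
        return density * self.decay ** (self.t - last)

    @staticmethod
    def _neighbors(cell):
        # Cells that differ by one step along exactly one axis.
        for i in range(len(cell)):
            for delta in (-1, 1):
                yield cell[:i] + (cell[i] + delta,) + cell[i + 1:]

    def clusters(self):
        # Label connected groups of dense cells; sparse cells stay unlabeled (noise).
        dense = {c for c in self.cells
                 if self._density_now(c) >= self.dense_threshold}
        labels = {}
        next_label = 0
        for start in dense:
            if start in labels:
                continue
            labels[start] = next_label
            queue = deque([start])
            while queue:
                cell = queue.popleft()
                for nb in self._neighbors(cell):
                    if nb in dense and nb not in labels:
                        labels[nb] = next_label
                        queue.append(nb)
            next_label += 1
        return labels
```

Calling `insert` for each arriving point and `clusters` on demand returns a mapping from dense cells to cluster labels. A practical algorithm would also prune cells whose faded density has become negligible in order to bound memory; that step is omitted here for brevity.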