Editor's note: With the rapid growth in the number of Internet users and the increasing variety of network terminals, the content and scope of network services keep expanding, and users' reliance on the Internet has risen sharply. These changes have also greatly increased the volume of data the Internet generates. User network-behavior data is being generated and accumulated day by day, growing so fast that it is difficult to manage with existing database management tools. The volumes are so large that they are no longer measured in the familiar gigabytes (G) and terabytes (T), but in P (1,000 T), E (one million T), or Z (one billion T). The Internet is no longer a medium through which users only receive information: massive amounts of UGC (user-generated content) are producing data on an unprecedented scale. More data was generated in the past three years than in the preceding 12 years, and there is no longer any doubt that the era of big data has arrived.