Drive-by-download attacks hidden in obfuscated JavaScript code are difficult to detect. To address this problem, this paper analyzes in depth the static and dynamic behavioral features of obfuscated JavaScript code and of drive-by-download attacks, and designs and implements a prototype anomaly detection system that combines static and dynamic analysis and requires only normal behavior data for training. First, static analysis uses the degree of code obfuscation as its feature and applies three algorithms, principal component analysis (PCA), nearest neighbor (K-NN), and one-class support vector machine (SVM), to detect obfuscated JavaScript code. Second, dynamic analysis obtains the initial and final values of variables from the JavaScript code and extracts nine features from these values as the dynamic behavioral features for detecting drive-by-download attacks in obfuscated code. A total of 7.0463 × 10^4 (70,463) normal and obfuscated malicious JavaScript samples were collected from a real-world environment. Experimental results show that, when the PCA algorithm is used, the system achieves a 99.0% detection rate for obfuscated drive-by-download attacks at a false positive rate of 0.1%.
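The idea of training on normal data only can be illustrated with a minimal sketch of a nearest-neighbor one-class detector, one of the three algorithm families named above. This is not the paper's implementation: the function names, the two-dimensional feature vectors, and the way the threshold is chosen (a quantile of leave-one-out scores on the normal samples) are all assumptions made for illustration. The paper's actual obfuscation features and nine dynamic features are not specified in this excerpt.

```python
import math
import random


def kth_neighbor_distance(point, references, k):
    """Euclidean distance from `point` to its k-th nearest neighbor in `references`."""
    distances = sorted(math.dist(point, ref) for ref in references)
    return distances[k - 1]


def fit_knn_detector(normal_samples, k, false_alarm_rate):
    """Learn an anomaly threshold from normal samples only (no attack data needed)."""
    scores = []
    for i, sample in enumerate(normal_samples):
        # Leave-one-out: score each normal sample against all the others.
        others = normal_samples[:i] + normal_samples[i + 1:]
        scores.append(kth_neighbor_distance(sample, others, k))
    scores.sort()
    # Choose the threshold so that roughly `false_alarm_rate` of the normal
    # training samples would score above it.
    index = math.ceil((1.0 - false_alarm_rate) * len(scores)) - 1
    index = max(0, min(len(scores) - 1, index))
    return {"samples": normal_samples, "k": k, "threshold": scores[index]}


def is_anomalous(model, point):
    """Flag a sample whose k-th neighbor distance exceeds the learned threshold."""
    score = kth_neighbor_distance(point, model["samples"], model["k"])
    return score > model["threshold"]


# Hypothetical 2-D feature vectors standing in for obfuscation-degree features.
rng = random.Random(0)
normal = [(rng.gauss(0.2, 0.05), rng.gauss(0.1, 0.05)) for _ in range(200)]
model = fit_knn_detector(normal, k=3, false_alarm_rate=0.05)
print(is_anomalous(model, (0.2, 0.1)))  # a sample near the normal cluster
print(is_anomalous(model, (0.9, 0.8)))  # a sample far from the normal cluster
```

Lowering `false_alarm_rate` raises the threshold, which is the same trade-off behind reporting a detection rate at a fixed false positive rate: fewer normal scripts are flagged, at the cost of possibly missing anomalies that sit close to normal behavior.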