Traditional speech analysis rests on the assumption of short-time stationarity and uses a fixed-window Fourier transform to obtain time-frequency localized information about the speech signal, which does not fully match the non-stationary nature of speech. To address this, a novel and robust wavelet transform method based on spectrogram fusion is proposed. In the image feature extraction stage, the spectrograms of two-character Chinese word utterances are analyzed. First, a two-dimensional discrete db4 wavelet basis is used to perform a 6-level wavelet packet decomposition of the wideband and narrowband spectrograms separately, and the horizontal, vertical, and diagonal detail energy values are computed at each level. Next, the horizontal, vertical, and diagonal detail energy values extracted from the narrowband spectrogram are taken as the first, second, and third feature sets, respectively. The horizontal detail energy values extracted from the wideband spectrogram are then taken as the fourth feature set. These four feature sets form the feature vector used for recognition, and a support vector machine serves as the classifier for speaker-dependent, whole-word recognition of two-character Chinese words. Simulation experiments on 1000 speech samples show that the algorithm, which recognizes speech word by word from the overall features of the spectrogram, brings out the overall time-frequency characteristics of the speech signal and reaches a correct recognition rate of up to 98%. By exploiting the properties of the spectrogram and the characteristics of the Chinese language, each spoken command is treated as a single image for word-level study, which preserves the integrity of the utterance while helping to improve the recognition rate and robustness.
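The feature-extraction step described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the Haar wavelet stands in for db4 so that the example needs no external library, a plain multi-level 2D wavelet decomposition is used instead of a full wavelet packet tree, the energy of a subband is taken to be the sum of its squared coefficients, and each spectrogram is a list of lists whose sides are divisible by 2 raised to the number of levels. The function names `haar_dwt2`, `detail_energies`, and `feature_vector` are hypothetical and are not the authors' code.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT (assumed stand-in for db4).

    Returns (approximation, horizontal, vertical, diagonal) subbands,
    each half the size of the input along both axes.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        top, bottom = image[r], image[r + 1]
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = top[c], top[c + 1]
            s, t = bottom[c], bottom[c + 1]
            a_row.append((p + q + s + t) / 2.0)  # low-pass in both directions
            h_row.append((p + q - s - t) / 2.0)  # responds to horizontal edges
            v_row.append((p - q + s - t) / 2.0)  # responds to vertical edges
            d_row.append((p - q - s + t) / 2.0)  # responds to diagonal structure
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def subband_energy(band):
    """Energy of a subband, assumed here to be the sum of squared coefficients."""
    return sum(x * x for row in band for x in row)


def detail_energies(image, levels=6):
    """Per-level horizontal, vertical and diagonal detail energies."""
    energies = {"H": [], "V": [], "D": []}
    approx = image
    for _ in range(levels):
        approx, h, v, d = haar_dwt2(approx)
        energies["H"].append(subband_energy(h))
        energies["V"].append(subband_energy(v))
        energies["D"].append(subband_energy(d))
    return energies


def feature_vector(narrowband, wideband, levels=6):
    """Concatenate the four feature sets named in the abstract:
    narrowband H, V, D energies, then wideband H energies."""
    n = detail_energies(narrowband, levels)
    w = detail_energies(wideband, levels)
    return n["H"] + n["V"] + n["D"] + w["H"]
```

With 6 levels, `feature_vector` returns 24 values per utterance (four sets of six energies), which would then be passed to a classifier such as a support vector machine. A real implementation would more likely call a wavelet library with the db4 basis on spectrogram arrays; the hand-written Haar transform here only keeps the sketch dependency-free.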