This paper explores a new method for automatically extracting pseudo-syllables from the speech stream, which can be used for automatic language identification (ALI). The whole process is divided into three stages: feature extraction, model building, and recognition testing. To extract pseudo-syllables automatically, an adjacent consonant segment and vowel segment are combined into one pseudo-syllable, called a CV syllable. An algorithm for automatically extracting CV syllables is proposed, with which a feature vector can be obtained for each CV syllable. The language identification system is built with Gaussian mixture models (GMM) and language models (LM). Experiments on Mandarin Chinese and six minority languages show that the proposed method identifies languages effectively, trains quickly, and is robust to noise.
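The pairing step behind a CV syllable can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes the speech stream has already been split into segments labeled "C" (consonant) or "V" (vowel) with start and end times. The names `Segment`, `CVSyllable`, and `extract_cv_syllables` are hypothetical, and skipping segments that do not form a C-then-V pair is likewise an assumption.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """A labeled stretch of speech (hypothetical input format)."""
    label: str    # "C" for a consonant segment, "V" for a vowel segment
    start: float  # start time in seconds
    end: float    # end time in seconds


class CVSyllable(NamedTuple):
    """A pseudo-syllable: one consonant segment followed by one vowel segment."""
    start: float     # start of the consonant segment
    boundary: float  # consonant/vowel boundary (start of the vowel segment)
    end: float       # end of the vowel segment


def extract_cv_syllables(segments: List[Segment]) -> List[CVSyllable]:
    """Pair each consonant segment with the vowel segment right after it.

    Assumption: segments that are not part of a C-then-V pair
    (a leading vowel, or a consonant followed by another consonant)
    are skipped.
    """
    syllables: List[CVSyllable] = []
    i = 0
    while i < len(segments) - 1:
        cur, nxt = segments[i], segments[i + 1]
        if cur.label == "C" and nxt.label == "V":
            syllables.append(CVSyllable(cur.start, nxt.start, nxt.end))
            i += 2  # both segments are consumed by this syllable
        else:
            i += 1  # no pair starts here; move on
    return syllables
```

In a full system, a feature vector would then be computed over each returned time span and passed to the GMM and LM stages. Those stages are outside the scope of this sketch.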