Mismatched feature detection with finer granularity for emotional speaker recognition

Source: Journal of Zhejiang University-Science C (Computers & Electro
The shape of a speaker's vocal organs changes with emotional state. As a result, the acoustic space of short-time features extracted from emotional speech deviates from the neutral acoustic space, and speaker recognition performance degrades. Features that deviate greatly from the neutral acoustic space are regarded as mismatched features, and they harm speaker recognition systems. Because emotion variation deforms the features of different phonemes in different ways, it is reasonable to build a finer-grained model that detects mismatched features separately for each phoneme. Phoneme recognition is itself difficult, however, so three kinds of acoustic class recognition (phoneme classes, a Gaussian mixture model (GMM) tokenizer, and a probabilistic GMM tokenizer) are proposed to replace it. We propose feature pruning and feature regulation methods that process the mismatched features to improve speaker recognition performance. For feature regulation, the transformation matrix that regulates the mismatched features is trained by maximizing the between-class distance while minimizing the within-class distance. Experiments on the Mandarin affective speech corpus (MASC) show that feature pruning and feature regulation increase the identification rate (IR) by 3.64% and 6.77%, respectively, compared with the baseline GMM-UBM (universal background model) algorithm. When applied to the state-of-the-art i-vector algorithm, the corresponding IR increases are 2.09% and 3.32%.
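The abstract gives neither the mismatch detection rule nor the training details, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes a diagonal-covariance GMM trained on neutral speech, a hard tokenizer that assigns each frame to its most likely component, a pruning rule that drops frames lying too far from their component mean (the threshold `max_dist` is hypothetical), and a standard Fisher discriminant as one common way to maximize between-class distance while minimizing within-class distance. All function and parameter names are invented for the example.

```python
import numpy as np

def tokenize(frames, means, variances, weights):
    """Hard GMM tokenizer: most likely diagonal Gaussian index per frame.

    frames: (T, D); means, variances: (K, D); weights: (K,)
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.argmax(log_prob + np.log(weights), axis=1)

def prune_mismatched(frames, means, variances, weights, max_dist):
    """Assumed pruning rule: keep frames close to their own token's mean.

    Distance is the root mean squared deviation, normalized per dimension
    by the component variance. Returns the kept frames and a boolean mask.
    """
    tokens = tokenize(frames, means, variances, weights)
    diff = frames - means[tokens]
    dist = np.sqrt(np.mean(diff ** 2 / variances[tokens], axis=1))
    keep = dist <= max_dist
    return frames[keep], keep

def fisher_transform(x, labels, n_dims):
    """Fisher discriminant: projection maximizing between-class scatter
    relative to within-class scatter. Returns a (D, n_dims) matrix."""
    d = x.shape[1]
    mean_all = x.mean(axis=0)
    s_w = np.zeros((d, d))   # within-class scatter
    s_b = np.zeros((d, d))   # between-class scatter
    for c in np.unique(labels):
        xc = x[labels == c]
        mc = xc.mean(axis=0)
        s_w += (xc - mc).T @ (xc - mc)
        s_b += len(xc) * np.outer(mc - mean_all, mc - mean_all)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(s_w) @ s_b)
    order = np.argsort(eigvals.real)[::-1][:n_dims]
    return eigvecs[:, order].real
```

In this sketch, pruning and regulation are independent: `prune_mismatched` discards suspect frames before scoring, whereas `fisher_transform` returns a matrix that would be applied to frames (`frames @ w`) instead of discarding them. What the class labels should be for the regulation step is not stated in the abstract and is left to the caller here.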