Joint CTC-Attention End-to-End Speech Recognition with a Triangle Recurrent Neural Network Encoder

Source: Journal of Shanghai Jiaotong University (English Edition)
The traditional speech recognition model, based on a deep neural network (DNN) and a hidden Markov model (HMM), is a complex multi-module system. As a result, the optimization goals of its modules may differ. In addition, it requires extra language resources, such as a pronunciation dictionary and a language model. To eliminate these drawbacks, we propose an end-to-end speech recognition method in which connectionist temporal classification (CTC) and attention are integrated for decoding. In our model, the complex modules are replaced by a single deep network consisting mainly of an encoder and a decoder. The encoder is built from bidirectional long short-term memory (BLSTM) layers arranged in a triangular structure for feature extraction. The decoder, based on CTC-attention decoding, uses the high-level features extracted by the shared encoder for both training and decoding. Experimental results on the VoxForge dataset indicate that the proposed end-to-end method is superior to both basic CTC decoding and attention-based encoder-decoder decoding: the character error rate (CER) is reduced to 12.9% without using any language model.
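The abstract does not spell out how the triangular encoder shrinks its input or how the CTC and attention objectives are combined, so the following is only a minimal sketch under two assumptions. First, the "triangular structure" is read here as pyramidal subsampling, where each layer halves the time resolution by concatenating adjacent frames. Second, the joint objective is taken to be a weighted sum of the two losses, a common formulation in joint CTC-attention systems. The names `pyramid_reduce`, `joint_loss`, and `weight` are hypothetical, and the BLSTM layers themselves are omitted.

```python
from typing import List

Frame = List[float]


def pyramid_reduce(frames: List[Frame]) -> List[Frame]:
    """Halve the time resolution by concatenating each pair of adjacent frames.

    Assumption: a trailing unpaired frame is dropped. In a full encoder a
    BLSTM layer would process the output of each reduction step.
    """
    return [frames[i] + frames[i + 1] for i in range(0, len(frames) - 1, 2)]


def joint_loss(ctc_loss: float, att_loss: float, weight: float = 0.5) -> float:
    """Interpolate the CTC and attention losses computed on the shared encoder.

    Assumption: weight is the share given to CTC and must lie in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * ctc_loss + (1.0 - weight) * att_loss


if __name__ == "__main__":
    # Five one-dimensional frames become two two-dimensional frames.
    frames = [[1.0], [2.0], [3.0], [4.0], [5.0]]
    print(pyramid_reduce(frames))
    # Illustrative loss values only, not results from the paper.
    print(joint_loss(ctc_loss=2.0, att_loss=4.0, weight=0.25))
```

Stacking several such reductions gives the narrowing, triangle-like shape: each layer sees a shorter but wider sequence, which also shortens the sequence that attention has to align over.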