I. INTRODUCTION

Speech recognition is a technology that enables machines to convert speech signals into text or commands through recognition and understanding. It draws on a variety of fields, including physiology, psychology, linguistics, computer science, and signal processing. In recent years, speech recognition has found many applications in the video domain, such as transcription, fixed audio retrieval, language recognition, audio feature extraction, and keyword search. Applying automatic speech recognition technology can greatly improve efficiency and significantly reduce costs. As an interdisciplinary field, speech recognition has advanced tremendously through years of accumulated research. In the past 20 years in particular, speech recognition technology has made significant progress