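Language Resources and Intelligence Forum, Lecture 11 - Advanced Innovation Center for Language Resources, Beijing Language and Culture University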

Language Resources and Intelligence Forum, Lecture 11

July 6, 2017

Title: Bayesian Adaptation Deep Acoustic Models with Applications to Robust Automatic Speech Recognition

Speaker: Chin-Hui Lee (李锦辉), School of ECE, Georgia Tech

Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he accumulated 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA. He has published over 500 papers and patents, with more than 25,000 citations and an h-index of 75 on Google Scholar. He has received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the IEEE Signal Processing Society's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he gave an ICASSP plenary talk on the future of automatic speech recognition. In the same year he was awarded the ISCA Medal for Scientific Achievement for "pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition".

Abstract: The discriminative nature of deep neural networks (DNNs) makes it challenging to adapt a large number of DNN parameters using only a small amount of data. This is also known as the catastrophic forgetting problem in DNN-based transfer learning. In this talk, we show that a Bayesian formulation is effective in addressing this problem while maintaining its satisfactory theoretical properties. Building on the successes of Bayesian adaptation in GMM-HMM, we propose two completely different Bayesian adaptation frameworks for DNN-HMM, called direct and indirect DNN adaptation. The former adds a prior term to any DNN-based learning objective function, and the latter uses a bottleneck layer to learn a GMM for each shared tied state, or senone, at the outputs of a DNN. In tests on the WSJ and Switchboard tasks, we found that both MAP and structural MAP (SMAP) speaker adaptation improve performance over the already-good speaker-independent systems.
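For readers unfamiliar with the idea of adding a prior term to a learning objective, the following is a minimal sketch and not the speaker's actual formulation. It assumes an isotropic Gaussian prior centered on the speaker-independent weights, in which case the prior term reduces to a squared-distance penalty that pulls the adapted weights back toward their starting point. The function names and the parameters `prior_mean`, `prior_var`, and `lr` are hypothetical and chosen only for illustration.

```python
def map_penalty(weights, prior_mean, prior_var):
    """Negative log of an isotropic Gaussian prior, up to an additive constant.

    Assumption: prior_mean holds the speaker-independent weights and
    prior_var is a single shared variance.
    """
    sq_dist = sum((w - m) ** 2 for w, m in zip(weights, prior_mean))
    return sq_dist / (2.0 * prior_var)


def map_objective(data_loss, weights, prior_mean, prior_var):
    """Data loss on the adaptation set (e.g. cross-entropy) plus the prior term."""
    return data_loss + map_penalty(weights, prior_mean, prior_var)


def map_gradient_step(weights, data_grad, prior_mean, prior_var, lr):
    """One gradient-descent step on the MAP objective.

    The gradient of the penalty with respect to each weight is
    (w - m) / prior_var, which is added to the data gradient.
    """
    return [
        w - lr * (g + (w - m) / prior_var)
        for w, g, m in zip(weights, data_grad, prior_mean)
    ]
```

A smaller `prior_var` expresses more trust in the speaker-independent model and keeps the adapted weights closer to it, which is one simple way to limit forgetting when the adaptation data is scarce.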

Time: Tuesday, July 11, 2017, 10:00-11:30

Venue: Room 1211, Comprehensive Building (综合楼), Beijing Language and Culture University