Keynote 1: Steve Young - Still Talking to Machines (Cognitively Speaking)
Time:

Still Talking to Machines (Cognitively Speaking)
Steve Young (Cambridge University Engineering Department)

This overview article reviews the structure of a fully statistical spoken dialogue system (SDS), using as illustration, various systems and components built at Cambridge over the last few years. Most of the components in an SDS are essentially classifiers which can be trained using supervised learning. However, the dialogue management component must track the state of the dialogue and optimise a reward accumulated over time. This requires techniques for statistical inference and policy optimisation using reinforcement learning. The potential advantages of a fully statistical SDS are the ability to train from data without hand-crafting, increased robustness to environmental noise and user uncertainty, and the ability to adapt and learn on-line.

Index Terms: spoken dialogue systems, reinforcement learning, speech understanding, speech synthesis, natural language generation

A Discriminative Splitting Criterion for Phonetic Decision Trees
Markus Nußbaum-Thom (RWTH Aachen University)

Phonetic decision trees are a key concept in acoustic modeling for large vocabulary continuous speech recognition. Although discriminative training has become a major line of research in speech recognition and all state-of-the-art acoustic models are trained discriminatively, the conventional phonetic decision tree approach still relies on the maximum likelihood principle. In this paper we develop a splitting criterion based on the minimization of the classification error. An improvement of more than 10% relative over a discriminatively trained baseline system on the Wall Street Journal corpus suggests that the proposed approach is promising.

Canonical State Models for Automatic Speech Recognition
Mark Gales (Cambridge University Engineering Department)
Kai Yu (Cambridge University Engineering Department)

Current speech recognition systems are often based on HMMs with state-clustered Gaussian Mixture Models (GMMs) to represent the context dependent output distributions. Though highly successful, the standard form of model does not exploit any relationships between the states, they each have separate model parameters. This paper describes a general class of model where the context-dependent state parameters are a transformed version of one, or more, canonical states. A number of published models sit within this framework, including, semi-continuous HMMs, subspace GMMs and the HMM error model.