Speech Understanding and Dialogue Systems



 

Introduction

Now that continuous speech recognition is available commercially (and cheaply), there is an explosion of interest in applications that can make use of it. A particularly straightforward idea is to have a system capable of answering questions and taking instructions from a user, usually someone calling on the telephone.

However, there is a lot more to this than simply recognizing the words in the user's speech. Even leaving aside the complicated and still evolving matter of how users will react when they suspect or know that they are in conversation with a machine, a speech dialogue system has to:

  • recognize the intent of the user's speech, which may be expressed in pieces of dialogue rather than a single sentence, and indeed in what is not said,

  • construct a suitable answer (or supplementary question) out of the always limited set of data and strategies that the system commands,

  • select an appropriate intonation pattern for the answer.
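The first two requirements above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any particular product: the travel-enquiry task, the slot names in `REQUIRED_SLOTS`, and the names `DialogueState` and `next_prompt` are all hypothetical. The sketch shows the user's intent building up over several turns, with the system asking a supplementary question whenever something is still missing.

```python
from dataclasses import dataclass, field

# Hypothetical slots for an illustrative travel-enquiry task.
REQUIRED_SLOTS = ("origin", "destination", "day")


@dataclass
class DialogueState:
    """Accumulates what the user has said across several turns."""
    slots: dict = field(default_factory=dict)

    def update(self, fragment: dict) -> None:
        # Merge the slots recognized in one user turn into the state.
        # A value of None means "not mentioned in this turn".
        for name, value in fragment.items():
            if value is not None:
                self.slots[name] = value

    def missing(self) -> list:
        # Slots still unfilled, in the order the system would ask for them.
        return [name for name in REQUIRED_SLOTS if name not in self.slots]


def next_prompt(state: DialogueState) -> str:
    """Return a supplementary question, or an answer once all slots are known."""
    missing = state.missing()
    if missing:
        return f"What is the {missing[0]}?"
    s = state.slots
    return f"Looking up trains from {s['origin']} to {s['destination']} on {s['day']}."
```

In use, a caller who says only "to Bath" fills one slot, and the system asks for the origin next; once all three slots are filled over later turns, `next_prompt` returns the answer instead of a question.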

 

Where the Progress is Being Made

To meet these challenges fully is a vast task, which is being undertaken in universities and research labs all over the world. There are notable centres in the USA, in Japan, in Europe at large and here in the United Kingdom.

See for example:

British Telecommunications

Defence Evaluation and Research Agency

and the massive reference:

Speechlinks

 

Sources for Products

Details of the major speech recognition products can be found at comp.speech.

In a few cases, developers have progressed all the way to products that are now in use. Try them out for yourself here.

 

Things to Watch Out for

  • A lot of the action is expected to lie in special-purpose products for specific markets. Some companies are therefore working to provide dialogue components that will allow those familiar with a market to set up a useful system. There may be scope for many layers of middle-men here!

  • One of the most useful, and trickiest, things in building these systems is to measure how certain the system can be that what it understood is in fact what the user said. Only when you know that can you cut out one of the most annoying things for a user: endless queries to confirm what he or she said.

  • It is vital for these systems to be able to cope with interruptions: both to hear what the user said, and (almost as important) to know what the system itself was saying when the user cut in.
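The point above about certainty is often handled with a pair of thresholds on the recognizer's confidence score. The following is a minimal sketch, assuming the recognizer reports a confidence between 0 and 1; the function name `confirmation_action` and the threshold values are hypothetical and would need tuning for a real application.

```python
def confirmation_action(confidence: float,
                        accept_at: float = 0.85,
                        reject_at: float = 0.40) -> str:
    """Decide how to respond to a recognized utterance.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    if confidence >= accept_at:
        # Sure enough to act without bothering the user.
        return "accept"
    if confidence < reject_at:
        # Too unsure even to guess: ask the user to say it again.
        return "reprompt"
    # In between: ask the user to confirm what was understood.
    return "confirm"
```

With these values, a score of 0.9 is accepted silently, 0.6 triggers a confirmation query, and 0.2 causes a re-prompt. Raising `accept_at` makes the system safer but more tiresome, which is the trade-off described above.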

 

If you'd like to learn more about the potential of this technology, from an experienced but completely impartial source, it's time you got in touch with Linguacubun Ltd itself.



Linguacubun Ltd., Batheaston Villa, Bailbrook Lane, Bath BA1 7AA, UK. Tel: +44 (0)1225 852865 Fax: +44 (0)1225 859258