Advanced Simulation Technology inc.
Automatic Speech Recognition and Speech Synthesis

ASR Overview

Automatic speech recognition (ASR) technology has evolved rapidly over the last 10 years, from a lab curiosity and home PC amusement to something encountered increasingly often in day-to-day situations. Where it was once hard to conceive of a professional training application that relied on such a capability, the introduction of ASR into aircraft systems (JSF and Eurofighter, for example) and its general application in mobile devices of all kinds have led to a maturing of such systems, making such use more than just a possibility. ASTi, while still skeptical of the “throw an application on a PC and hope for the best” approach (the classic Windows® way of doing things), has embraced the intelligent integration of ASR into the Telestra 4 product line, and will begin rolling out applications requiring this technology through 2008.
To be useful in the training and simulation marketplace, an ASR system must present a certain capability set and meet certain performance criteria. The ASTi ASR solution was required to meet the following:
  • Speaker independent (no training required)
  • Continuous recognition (phrases and sentences will be understood)
  • Background noise robustness
  • Dynamic grammar switching
  • Multi-channel support
  • Recognition success rates in the high 90s (percent)
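As an illustration of what dynamic grammar switching means in practice, the following is a minimal sketch in Python. It is not the ASTi implementation or any Telestra 4 API: the grammar names, phrases, and the `GrammarSwitcher` class are hypothetical, and a text hypothesis stands in for the output of a real recognizer. The idea shown is simply that only the phrases in the currently active grammar are candidates for a match, and that the active grammar can be changed at run time as the training scenario changes.

```python
from difflib import get_close_matches

# Hypothetical grammars: each context permits only a small set of phrases.
GRAMMARS = {
    "ground": ["request taxi", "request pushback", "ready for departure"],
    "tower": ["request takeoff clearance", "request landing clearance", "going around"],
}


class GrammarSwitcher:
    """Constrains matching to the phrases of one active grammar at a time."""

    def __init__(self, grammars, initial):
        if initial not in grammars:
            raise KeyError(f"unknown grammar: {initial}")
        self.grammars = grammars
        self.active = initial

    def switch(self, name):
        """Make another grammar active, e.g. when the scenario context changes."""
        if name not in self.grammars:
            raise KeyError(f"unknown grammar: {name}")
        self.active = name

    def match(self, hypothesis, cutoff=0.6):
        """Return the closest phrase in the active grammar, or None if nothing is close."""
        phrases = self.grammars[self.active]
        best = get_close_matches(hypothesis.lower(), phrases, n=1, cutoff=cutoff)
        return best[0] if best else None
```

With the "ground" grammar active, a slightly garbled hypothesis such as "request taxy" resolves to "request taxi", while "going around" is rejected because it is not part of that grammar. After `switch("tower")`, the same phrase is accepted. Keeping the active phrase set small in this way is one reason a constrained grammar can help recognition accuracy.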
The ASTi ASR solution is an integrated component of the Telestra 4 (T4) product suite, which results in a significant reduction in system complexity. Compare this with a solution that mixes equipment from different vendors to provide the core communications and ASR capability: the latter inevitably leads to a spider's web of analog cable runs.
ASTi ASR Solution

ASR Features

ASTi can also leverage the T4 ACENet audio distribution architecture to support installations requiring high ASR seat counts. Incoming audio is passed over the ACENet network in the digital domain to a T4 Target running as an ASR server.
The core ASR capability is an enabling technology, and ASTi has identified the following applications as potential areas of product development:
  • Instructor workstation voice control
  • Cockpit systems voice control
  • Automated Air Traffic Control and radio environment simulation
  • Talk-on-target training systems

Speech Synthesis Overview

As a complement to the ASR capability, the ability to generate speech automatically is invaluable where a simulated environment must represent a number of external agencies or operators. Common examples are the use of synthesized speech to represent ATIS or GCA stations.
The voice quality of synthesized speech systems had, until recently, left something to be desired. However, significant improvements in the underlying algorithms used to create the speech, coupled with ongoing increases in available computing power, have resulted in more natural-sounding speech.
ASTi introduced off-line speech message generation as a package option to the T4 product suite early in 2007, allowing the generation of synthetic message libraries for inclusion in run-time models. This is ideally suited to the previously mentioned ATIS and GCA applications, where it is often desirable to add another airport or a specific NOTAM condition to an existing library.
One problem with earlier generations of synthesized speech was that the number of available voices was extremely limited, in many cases to a single voice. Recent advances make this limitation a thing of the past, with options for 12 or more voices and accents increasing the realism by enabling alternate voices for different locations and roles.
The tight integration of ASR capabilities and synthetic speech now allows ASTi to look toward the development of exciting new capabilities such as Automated Air Traffic Control systems. With such a system, synthetic ATC controllers can direct live man-in-the-loop pilots sitting in the cockpit of a simulator through an airspace crowded with CGF-generated traffic, while the pilots listen to context-relevant radio chatter.
Ultimately, the ability to tightly integrate these newly matured technologies into the ASTi Telestra 4 framework, coupled with the already significant modeling and simulation functionality of the core architecture, allows ASTi to offer innovative new capabilities to an industry that is always looking for new angles.