Speech Recognition and Text-to-Speech
Phone Password Manager supports both speech-to-text (speech recognition) and text-to-speech (TTS). Speech recognition converts spoken words to text, and TTS plays back text as spoken words.
To support these functions, Phone Password Manager requires a speech engine to be installed on the system. Phone Password Manager uses the Microsoft Speech API (SAPI) as its programming interface and supports SAPI versions 5.1 and later, so any SAPI-compliant speech engine can be used.
Speech recognition is provided by the Speech Service, which is installed during a complete installation or if selected during a custom installation.
The Speech Service can only be installed once on a Phone Password Manager server. If you install a second instance of Phone Password Manager on the same server, then the Speech Service will be unable to run on the new instance.
When speech recognition is enabled, users can speak their profile IDs, new password values, and key recovery strings instead of entering them on the numeric keypad.
To set up speech recognition:
1. On the Phone Password Manager server, copy psynch.speech.psl and speech.psl from the samples\ directory to the \<instance>\script\ directory.
2. Modify the idtel.cfg file, located in the <instance>\service\ directory, by changing ScriptName as follows:
     ScriptName = "psynch.speech.psl"
3. Configure the Speech Service. Modify idtel.cfg by changing the SpeechService Dll line as follows:
   For a local Speech Service, specify speechapi.dll:
     SpeechService "" = {
       Dll = "speechapi.dll"
       //Server = <server>
       //Port = <port>
       //Timeout = <timeout>
     }
   For a remote Speech Service, specify speechapix.dll:
     SpeechService "" = {
       Dll = "speechapix.dll"
       Server = <speech service server name or IP address>
       Port = <speech service port>
     }
4. Restart the Phone Password Manager service.
Speech recognition is now configured and ready to use. To test speech recognition, place a phone call to the IVR server and try to use speech instead of the numeric keypad to enter your details.
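The local and remote SpeechService blocks described in the steps above differ only in the DLL name and in whether Server and Port are set. The following Python sketch generates either form as text that could be pasted into idtel.cfg. The helper name render_speech_service is hypothetical and not part of the product, the host name and port in the usage lines are placeholders, and the sketch assumes that Server and Port values are written unquoted, as in the sample above.

```python
from typing import Optional


def render_speech_service(server: Optional[str] = None,
                          port: Optional[int] = None) -> str:
    """Return a SpeechService block for idtel.cfg (hypothetical helper).

    With no server, the block names the local DLL (speechapi.dll).
    With a server and a port, it names the remote DLL (speechapix.dll).
    """
    if server is None:
        if port is not None:
            raise ValueError("port given without a server")
        lines = ['  Dll = "speechapi.dll"']
    else:
        if port is None:
            raise ValueError("a remote Speech Service needs both server and port")
        # Assumption: Server and Port are written unquoted, as in the sample.
        lines = [
            '  Dll = "speechapix.dll"',
            f"  Server = {server}",
            f"  Port = {port}",
        ]
    return 'SpeechService "" = {\n' + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    # Local Speech Service on the same server.
    print(render_speech_service())
    # Remote Speech Service; host name and port are placeholders.
    print(render_speech_service("speech01.example.com", 5000))
```

Review the generated text before copying it into idtel.cfg, and restart the Phone Password Manager service afterwards, as in the last step of the procedure.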
Configuring the Speech Service
The Speech Service can be configured using the following options, which are located in the idtel.cfg file:
Option | Description |
---|---|
VoiceActivityDetectThreshold | Controls the sensitivity of the input threshold for the Speech Service. The range of possible values for this option is between -54 and +3; the default value is -40. Lowering the numeric value lowers the input threshold, which increases the sensitivity of the Speech Service. Raising the numeric value raises the input threshold, which decreases the sensitivity of the Speech Service. For example, a value of -54 recognizes even the quietest sounds, whereas a value of +3 only recognizes louder sounds. |
SpeechRecognitionMode | Controls which speech recognition mode is used. Possible values: 0 – enables "File based mode", which creates a file in the temp directory before processing the audio file for speech recognition. 1 – enables "Stream mode", which does not create a file, but simply analyzes the stream of audio for speech recognition. This was the only mode available in releases before Bravura Security Fabric version 8.0. By default, stream mode is enabled. |
KeepIntermediateSpeechFiles | Controls whether or not to save the audio files created when SpeechRecognitionMode is set to "File based mode." Possible values: 0 – files are not saved; they are deleted after speech recognition is complete. 1 – files are saved in the temp directory: C:\Documents and Settings\psadmin\Local Settings\temp |
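The value ranges in the table above can be checked before editing idtel.cfg. The following Python sketch is one way to do that. The function name validate_speech_options is hypothetical, and the sketch takes already-parsed values in a dictionary rather than reading idtel.cfg itself. It assumes the threshold is a whole number and that KeepIntermediateSpeechFiles defaults to 0, neither of which the table states; the defaults of -40 and stream mode come from the table.

```python
def validate_speech_options(options: dict) -> list:
    """Return a list of problems found in Speech Service option values.

    Missing options fall back to defaults: -40 and stream mode (1) are
    taken from the table; 0 for KeepIntermediateSpeechFiles is an assumption.
    """
    problems = []

    threshold = options.get("VoiceActivityDetectThreshold", -40)
    # Assumption: the threshold is an integer in the documented range.
    if not isinstance(threshold, int) or not -54 <= threshold <= 3:
        problems.append(
            "VoiceActivityDetectThreshold must be between -54 and +3"
        )

    mode = options.get("SpeechRecognitionMode", 1)
    if mode not in (0, 1):
        problems.append("SpeechRecognitionMode must be 0 (file) or 1 (stream)")

    keep = options.get("KeepIntermediateSpeechFiles", 0)
    if keep not in (0, 1):
        problems.append("KeepIntermediateSpeechFiles must be 0 or 1")
    elif keep == 1 and mode == 1:
        # Stream mode creates no intermediate files, so there is nothing to keep.
        problems.append(
            "KeepIntermediateSpeechFiles = 1 has no effect in stream mode"
        )

    return problems


if __name__ == "__main__":
    for problem in validate_speech_options(
        {"VoiceActivityDetectThreshold": 10, "SpeechRecognitionMode": 0}
    ):
        print(problem)
```

An empty list means the values are consistent with the table. The stream-mode warning is inferred from the option descriptions and is not a documented error.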
Building .wav files using SAPI
Use the voicebuild program to create audio .wav files from a vocal script .txt file using SAPI.