SpeechKit

Integrate Speech Technology for Hands-free Operation

Efficiently manage speech recognition and synthesis with SpeechKit

You don't have to sit in front of a computer with a mouse and keyboard to use information technology. Your applications can be enhanced to speak and listen to you wherever you need them.

Speech recognition is the process of converting an acoustic signal (i.e. audio data), captured by a microphone or a telephone, to a set of words. These words can be used for controlling computer functions, data entry, and application processing.
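
SpeechKit's own class names are not shown on this page, but the underlying step is the same whichever recognizer is used: audio goes in, text comes out. As a rough illustration only, the Swift sketch below performs that step directly with Apple's Speech framework, one of the recognizers SpeechKit manages; the file name and en-AU locale are placeholder assumptions.

    import Speech

    // Minimal sketch: convert recorded audio (a file URL) to text with Apple's
    // Speech framework, one of the recognizers SpeechKit can manage.
    // "recording.m4a" and the en-AU locale are placeholder assumptions.
    // (A real app also needs the speech-recognition usage description in its Info.plist.)
    func transcribe(fileAt url: URL) {
        SFSpeechRecognizer.requestAuthorization { status in
            guard status == .authorized,
                  let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-AU")),
                  recognizer.isAvailable else { return }

            let request = SFSpeechURLRecognitionRequest(url: url)
            recognizer.recognitionTask(with: request) { result, error in
                if let result = result, result.isFinal {
                    print("Recognised text: \(result.bestTranscription.formattedString)")
                } else if let error = error {
                    print("Recognition failed: \(error.localizedDescription)")
                }
            }
        }
    }

    transcribe(fileAt: URL(fileURLWithPath: "recording.m4a"))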

Speech synthesis is the process of converting words to phonetic and prosodic symbols and generating synthetic speech audio data. Synthesized speech can be used for answering questions, event notification, and reading documents aloud.
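
The mirror-image step, text in and audio out, can be sketched the same way. The snippet below is an illustration using Apple's AVFoundation synthesizer directly, one of the synthesizers SpeechKit manages, not SpeechKit's own API; the prompt text and voice are assumptions.

    import AVFoundation

    // Minimal sketch: synthesize a prompt with Apple's AVFoundation
    // text-to-speech, one of the synthesizers SpeechKit can manage.
    let synthesizer = AVSpeechSynthesizer()

    let utterance = AVSpeechUtterance(string: "Your order has been confirmed.")
    utterance.voice = AVSpeechSynthesisVoice(language: "en-AU")   // assumed locale
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate

    // Speech is produced asynchronously; a real app keeps the synthesizer
    // alive until speaking has finished.
    synthesizer.speak(utterance)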

What is Speech Management?

Speech management enables you to:

  • control application functions by speaking rather than having to use a mouse or keyboard,
  • capture data by speaking rather than typing, and
  • prompt and confirm data capture with spoken or audio acknowledgement.

Application benefits include:

  • enhanced speed and accuracy of data capture,
  • added flexibility of running applications in a variety of environments, and
  • expanded operating scenarios for hands-free computing.

What is SpeechKit?

Chant SpeechKit handles the complexities of speech recognition and speech synthesis to minimise the programming necessary to develop software that speaks and listens.

It simplifies the process of managing Apple Speech, Google android.speech, Microsoft SAPI 5, Microsoft Speech Platform, Microsoft WindowsMedia (UWP and WinRT), and Nuance Dragon NaturallySpeaking recognizers, and managing Acapela, Apple AVFoundation, Cepstral, CereProc, Google android.speech.tts, Microsoft SAPI 5, Microsoft Speech Platform, and Microsoft WindowsMedia (UWP and WinRT) synthesizers.

SpeechKit includes Android, C++, C++Builder, Delphi, Java, .NET Framework, Objective-C (iOS and macOS), and Swift (iOS and macOS) class libraries to support your programming language of choice, along with sample projects for popular IDEs such as the latest Visual Studio from Microsoft, RAD Studio from Embarcadero, Android Studio from Google, and Xcode from Apple.

The class libraries can be integrated with 32-bit and 64-bit applications for Android, iOS, macOS, and Windows platforms.

Speech Recognition and Synthesis Architecture

SpeechKit provides a productive way to develop software that listens. Applications set properties and invoke methods through a speech recognition management class. This class handles the low-level interaction with speech recognition engines (i.e., recognizers).

Applications establish a session with a recognizer through which spoken language, captured live via a microphone or from recorded audio, is processed and converted to text. Applications use SpeechKit to manage the activities for speech recognition on their behalf. SpeechKit manages the resources and interacts directly with the speech application program interface (API); the speech APIs supported for recognition are listed in the Feature Summary below.

Applications receive recognised speech as text and notification of other processing states through event callbacks.

SpeechKit Architecture for Speech Recognition

SpeechKit encapsulates all of the technologies necessary to make the process of recognizing speech simple and efficient for applications.
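
The pattern just described, establish a session, feed it live audio, and receive text and state changes through event callbacks, can be sketched without SpeechKit's class names. The hedged Swift sketch below reproduces that callback-driven flow directly against Apple's Speech framework and AVAudioEngine; in a SpeechKit application the equivalent work is done by the speech recognition management class. The LiveRecognition class name and en-AU locale are assumptions, and microphone and speech-recognition permissions are assumed to be granted already.

    import Speech
    import AVFoundation

    // Sketch of a live (microphone) recognition session with callback-style
    // results, illustrating the flow described above.
    final class LiveRecognition {
        private let audioEngine = AVAudioEngine()
        private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-AU"))
        private var request: SFSpeechAudioBufferRecognitionRequest?
        private var task: SFSpeechRecognitionTask?

        func start() throws {
            let request = SFSpeechAudioBufferRecognitionRequest()
            request.shouldReportPartialResults = true
            self.request = request

            // Feed microphone buffers into the recognition request.
            let input = audioEngine.inputNode
            let format = input.outputFormat(forBus: 0)
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                request.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()

            // Results and errors arrive through this callback, analogous to
            // the event callbacks described above.
            task = recognizer?.recognitionTask(with: request) { result, error in
                if let result = result {
                    print("Heard: \(result.bestTranscription.formattedString)")
                }
                if error != nil || (result?.isFinal ?? false) {
                    self.stop()
                }
            }
        }

        func stop() {
            audioEngine.stop()
            audioEngine.inputNode.removeTap(onBus: 0)
            request?.endAudio()
            task?.cancel()
        }
    }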

Speech Synthesis Management

SpeechKit provides a productive way to develop software that speaks. Applications set properties and invoke methods through the speech synthesis management class. This class handles the low-level interaction with text-to-speech engines (i.e., synthesizers or voices).

Applications establish a session with a synthesizer through which speech is synthesized from text. Applications use SpeechKit to manage the synthesizer resources on their behalf. SpeechKit manages the resources and interacts directly with a speech application program interface (API); the speech APIs supported for synthesis are listed in the Feature Summary below.

Applications receive notification of processing states through event callbacks.

SpeechKit Architecture for Speech Synthesis

The ChantTTS class encapsulates all of the technologies necessary to make the process of synthesizing speech simple for your application. Optionally, it can save the session properties for your application to ensure they persist across application invocations.

SpeechKit simplifies the process of synthesizing speech by handling the low-level activities directly with a synthesizer.

Instantiate SpeechKit to synthesize speech within the application and destroy SpeechKit to release its resources when speech synthesis is no longer needed.
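
The ChantTTS API itself is not documented on this page, so the following Swift sketch illustrates the lifecycle just described, create the object, receive start and finish events, and release it when synthesis is no longer needed, using Apple's AVSpeechSynthesizer and its delegate in place of SpeechKit's event callbacks. The Speaker class name and the spoken text are assumptions for illustration.

    import AVFoundation

    // Sketch of the create / speak / receive-events / release lifecycle
    // described above, using Apple's AVSpeechSynthesizer delegate in place
    // of SpeechKit's event callbacks.
    final class Speaker: NSObject, AVSpeechSynthesizerDelegate {
        private var synthesizer: AVSpeechSynthesizer? = AVSpeechSynthesizer()

        override init() {
            super.init()
            synthesizer?.delegate = self
        }

        func say(_ text: String) {
            let utterance = AVSpeechUtterance(string: text)
            utterance.voice = AVSpeechSynthesisVoice(language: "en-AU")  // assumed locale
            synthesizer?.speak(utterance)
        }

        // Event callbacks: notification of processing states.
        func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                               didStart utterance: AVSpeechUtterance) {
            print("Started speaking")
        }

        func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                               didFinish utterance: AVSpeechUtterance) {
            print("Finished speaking")
            releaseSynthesizer()   // release resources once synthesis is no longer needed
        }

        private func releaseSynthesizer() {
            synthesizer?.delegate = nil
            synthesizer = nil
        }
    }

    let speaker = Speaker()
    speaker.say("Document saved.")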

Feature Summary

Chant SpeechKit handles the complexities of speech recognition and speech synthesis. The classes minimise the programming effort necessary to construct software that speaks and listens.

A SpeechKit application can:

  • Control application functions by speaking rather than having to use a mouse or keyboard;
  • Prompt users for applicable data capture;
  • Capture data by speaking rather than typing;
  • Confirm data capture with spoken or audio acknowledgement;
  • Transcribe audio buffers, files, and streams to text; and
  • Synthesize speech to audio buffers, files, and streams (a sketch of this follows the list).
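
The last item above covers buffer- and file-level output. As a hedged illustration, not SpeechKit's own API, the Swift sketch below captures synthesized speech as audio buffers and writes them to a file using Apple's AVSpeechSynthesizer; the output path is a placeholder, and the write(_:toBufferCallback:) method is only available on recent Apple platforms.

    import AVFoundation

    // Sketch: capture synthesized speech as audio buffers and write them to
    // a file instead of playing them. "output.caf" is a placeholder path.
    let synthesizer = AVSpeechSynthesizer()
    let utterance = AVSpeechUtterance(string: "This sentence is being written to a file.")
    let outputURL = URL(fileURLWithPath: "output.caf")
    var outputFile: AVAudioFile?

    // Keep the synthesizer alive until the final (zero-length) buffer arrives.
    synthesizer.write(utterance) { buffer in
        guard let pcm = buffer as? AVAudioPCMBuffer, pcm.frameLength > 0 else {
            return   // a zero-length buffer signals the end of synthesis
        }
        do {
            if outputFile == nil {
                outputFile = try AVAudioFile(forWriting: outputURL,
                                             settings: pcm.format.settings)
            }
            try outputFile?.write(from: pcm)
        } catch {
            print("Could not write audio: \(error)")
        }
    }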

Recognizers provide proprietary programming interfaces (i.e., APIs). SpeechKit supports the following recognizers and their APIs:

Speech API                        Platforms
Apple Speech                      ARM, x64, x86
Google android.speech             ARM
Microsoft SAPI 5                  x64, x86
Microsoft Speech Platform         x64, x86
Microsoft .NET System.Speech      x64, x86
Microsoft .NET Microsoft.Speech   x64, x86
Microsoft WindowsMedia (UWP)      ARM, x64, x86
Microsoft WindowsMedia (WinRT)    x64, x86
Nuance Dragon NaturallySpeaking   x64, x86

Synthesizers are accessed via proprietary application programming interfaces (APIs). SpeechKit supports the following speech APIs for speech synthesis:

Speech API                        Platforms
Acapela TTS                       x64, x86
Apple AVFoundation                ARM, x64, x86
Cepstral Swift                    x64, x86
CereProc CereVoice                x64, x86
Google android.speech.tts         ARM
Microsoft SAPI 5                  x64, x86
Microsoft Speech Platform         x64, x86
Microsoft .NET System.Speech      x64, x86
Microsoft .NET Microsoft.Speech   x64, x86
Microsoft WindowsMedia (UWP)      ARM, x64, x86
Microsoft WindowsMedia (WinRT)    x64, x86

Within Chant Developer Workbench, you can:

  • Enumerate speech engines to test recognizer- and synthesizer-specific features;
  • Trace recognition and synthesis events;
  • Activate and test grammars; and
  • Play back TTS markup.

Recognizer Management: Enumerate and test recognizers with command, dictation, and grammar vocabularies.

Synthesizer Management: Enumerate and test synthesizers. Use the Speech Synthesis window to synthesize text. Trace synthesis events in the Events window.
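
TTS markup here refers to markup such as SSML that a synthesizer can interpret for pronunciation, pauses, and emphasis. As a hedged example, the Swift snippet below plays back a small SSML fragment through Apple's synthesizer; the ssmlRepresentation initializer is only available on recent Apple platforms (iOS 16 / macOS 13 and later), and other engines such as Microsoft SAPI 5 accept similar markup through their own interfaces.

    import AVFoundation

    // Hedged sketch: play back a fragment of SSML-style TTS markup.
    let ssml = """
    <speak>
      Your balance is <say-as interpret-as="cardinal">42</say-as> dollars.
      <break time="500ms"/>
      <emphasis level="strong">Thank you.</emphasis>
    </speak>
    """

    let synthesizer = AVSpeechSynthesizer()
    if let utterance = AVSpeechUtterance(ssmlRepresentation: ssml) {
        synthesizer.speak(utterance)
    }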

 


Call MicroWay on 1300 553 313 or email for more information.

 


For more information, please contact the MicroWay sales team:
Head Office
MicroWay Pty Ltd
PO Box 84,
Braeside, Victoria, 3195, Australia
Ph: 1300 553 313
Fax: 1300 132 709
email: sales@microway.com.au
ABN: 56 129 024 825
Sydney Sales Office
MicroWay Pty Ltd
PO Box 1733,
Crows Nest, NSW 1585, Australia
Tel: 1300 553 313
Fax: 1300 132 709
email: sales@microway.com.au
ABN: 56 129 024 825
New Zealand Sales Office
MicroWay Pty Ltd (NZ)
PO Box 912026
Victoria Street West
Auckland 1142, New Zealand
Tel: 0800 450 168
email: sales@microway.co.nz

International: call +61 3 9580 1333, fax +61 3 9580 8995

 
© 1995-2023 MicroWay Pty Ltd. All Rights Reserved.