Web Speech API

Orate supports the native Web Speech API for speech-to-text.

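The Web Speech API is built into most modern browsers and lets you incorporate voice data into web apps. It has two parts: SpeechSynthesis (text-to-speech) and SpeechRecognition (speech-to-text). Orate uses the latter for transcription. Browser support for SpeechRecognition is uneven, and some browsers expose it only under a vendor prefix, so check support in the browsers you target.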
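Because SpeechRecognition is not available everywhere, it can help to check for it before relying on this provider. The following is a minimal sketch, not part of Orate: `getSpeechRecognitionCtor` and `supportsSpeechRecognition` are hypothetical helper names, and the scope object is passed in explicitly so the check also runs outside a browser.

```typescript
// Hypothetical helpers (not part of Orate) for detecting SpeechRecognition.
// The scope is passed in so the check works in any environment.
type GlobalLike = Record<string, unknown>;

// Returns the standard constructor if present, else the webkit-prefixed
// one that some browsers expose, else null.
function getSpeechRecognitionCtor(scope: GlobalLike): unknown {
  return scope['SpeechRecognition'] ?? scope['webkitSpeechRecognition'] ?? null;
}

// True only when one of the two names resolves to a constructor.
function supportsSpeechRecognition(scope: GlobalLike): boolean {
  return typeof getSpeechRecognitionCtor(scope) === 'function';
}
```

In a browser you would likely call `supportsSpeechRecognition(globalThis as unknown as GlobalLike)` and fall back to another provider when it returns false.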

Setup

The Web Speech API provider is available by default in Orate. To import it, you can use the following code:

import { native } from 'orate/native';

Configuration

The Native provider does not require any configuration.

Usage

The Native provider exposes the Web Speech API's functionality through a single interface.

Speech to Text

The Native provider exposes an stt function that creates a speech-to-text transcription function using the Web Speech API.

import { transcribe } from 'orate';
import { native } from 'orate/native';
 
const text = await transcribe({
  model: native.stt(),
  audio: someArrayBuffer,
});

You can specify a language to use by passing it as an option to the stt function. If not specified, this defaults to the HTML lang attribute value, or the user agent's language setting if that isn't set either.

const text = await transcribe({
  model: native.stt({ lang: 'en-US' }),
  audio: someArrayBuffer,
});
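The fallback order described above can be sketched as a small function. This is an illustration only, not Orate's implementation: `resolveRecognitionLang` is a hypothetical name, and the HTML lang value and user agent language are passed in as plain arguments rather than read from the page.

```typescript
// Hypothetical helper (not part of Orate) illustrating the fallback order:
// explicit option, then the HTML lang attribute, then the user agent language.
function resolveRecognitionLang(
  explicit: string | undefined,
  htmlLang: string | undefined,
  userAgentLang: string,
): string {
  // An explicit, non-empty option always wins.
  if (explicit !== undefined && explicit.trim() !== '') return explicit;
  // Otherwise use the document's lang attribute when it is set.
  if (htmlLang !== undefined && htmlLang.trim() !== '') return htmlLang;
  // Finally fall back to the user agent's language setting.
  return userAgentLang;
}
```

In a browser, the last two arguments would typically be `document.documentElement.lang` and `navigator.language`.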
