Transcriptions - Google STT

To add real-time audio transcriptions to a Dyte meeting, you can use Google Cloud Speech-to-Text.

These Google services are paid, so a Google Cloud account is required to proceed.

Note: This integration is web-only at the moment.

Integration Steps

You must have a service account with GCP (Google Cloud Platform) to use Google transcriptions. Create a project in that account with the Google Media Translation API and the Google Translation API enabled.

Once done, download the keys for the service account.

Set up a server to forward the audio data from the client to Google Cloud. Your GCP credentials should never be exposed on the client side, so you need a server that forwards the audio data to Google Cloud on the client's behalf.

For this, we have provided a Node.js sample at dyte-io/google-transcription (https://github.com/dyte-io/google-transcription/tree/main/server). Currently, we only have Node.js samples; if you're working with a different backend, feel free to port this code or connect with us to help you port it.

To use this sample, clone the repository using the following command.

git clone git@github.com:dyte-io/google-transcription.git



cp .env.example .env



Edit the .env file to match your GCP service account credentials and save it.

Note: PRIVATE_KEY should be on a single line. Try copying the value from the service account key's JSON file as is.
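The single-line requirement has a side effect: in the JSON key file, line breaks in the private key are written as the literal two-character sequence `\n`, so a value copied as is needs those sequences turned back into real newlines before it can be used as a PEM key. The following is a minimal sketch of that step, not the sample server's actual code. `loadGcpCredentials`, `normalizePrivateKey`, and the `CLIENT_EMAIL` variable name are hypothetical; only `PRIVATE_KEY` comes from this guide.

```typescript
// Hypothetical credential shape used only for this sketch.
interface GcpCredentials {
  clientEmail: string;
  privateKey: string;
}

type Env = Record<string, string | undefined>;

// Turn literal "\n" sequences (backslash followed by n) into real newlines.
function normalizePrivateKey(raw: string): string {
  return raw.replace(/\\n/g, '\n');
}

// Read credentials from an env-like object and fail fast if any are missing.
// CLIENT_EMAIL is an assumed variable name; check .env.example for the real ones.
function loadGcpCredentials(env: Env): GcpCredentials {
  const clientEmail = env['CLIENT_EMAIL'];
  const rawKey = env['PRIVATE_KEY'];
  if (!clientEmail || !rawKey) {
    throw new Error('Missing CLIENT_EMAIL or PRIVATE_KEY in environment');
  }
  return { clientEmail, privateKey: normalizePrivateKey(rawKey) };
}
```

On a Node.js server this could be called as `loadGcpCredentials(process.env)` at startup, so that a misconfigured .env file is reported immediately rather than on the first transcription request.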

npm install



This automatically installs @google-cloud/media-translation, @google-cloud/speech, and @google-cloud/translate.

npm run dev



The HTTP endpoint where this server is accessible will be referred to as backend_url for the remainder of this guide.

npm install @dytesdk/google-transcription



Source available at dyte-io/google-transcription (https://github.com/dyte-io/google-transcription/tree/main/client).

The second step is to find the place in your codebase where you initialize a Dyte meeting.

Once you have found it and have access to the meeting object, add the following code to that file to import the SDK.

import DyteGoogleSpeechRecognition from '@dytesdk/google-transcription';



Add the following code just after the point where you have access to the meeting object.

const speech = new DyteGoogleSpeechRecognition({
  meeting,
  target: 'hi',
  source: 'en-US',
  baseUrl: '<backend-url>', // replace with the backend_url of your server
});

speech.on('transcription', async (data) => {
  // Handle each new transcription here
});

speech.transcribe();



Here you are setting up DyteGoogleSpeechRecognition with the values that the current user would prefer, listening for every new transcription using speech.on('transcription', aJsCallbackFunction), and then activating the recognition using speech.transcribe().
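Since source and target are meant to reflect what the current user prefers, one way to choose the source value is to match the user's locale against the language codes you want to offer. The sketch below is only an illustration: `pickSourceLanguage` and `EXAMPLE_SOURCES` are hypothetical names, and the list is a placeholder rather than the set of languages actually supported.

```typescript
// Illustrative list only; replace with the codes your deployment supports.
const EXAMPLE_SOURCES: string[] = ['en-US', 'en-GB', 'hi-IN'];

// Pick the best source language for a preferred locale:
// 1. exact match (case-insensitive), 2. same primary language, 3. fallback.
function pickSourceLanguage(
  preferred: string | undefined,
  supported: string[],
  fallback: string,
): string {
  if (!preferred) {
    return fallback;
  }
  const wanted = preferred.toLowerCase();
  for (const code of supported) {
    if (code.toLowerCase() === wanted) {
      return code;
    }
  }
  const primary = wanted.split('-')[0];
  for (const code of supported) {
    if (code.toLowerCase().split('-')[0] === primary) {
      return code;
    }
  }
  return fallback;
}
```

In a browser, the preferred locale could come from `navigator.language`, for example `pickSourceLanguage(navigator.language, EXAMPLE_SOURCES, 'en-US')`, with the result passed as the source option.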

To see the supported languages, please refer to

With this, you will now be able to receive live transcriptions. Feel free to display them in your UI as needed.
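One simple way to show transcriptions in the UI is a rolling caption area that keeps only the most recent lines. The sketch below assumes each transcription can be reduced to a speaker name and a text string. The `CaptionLine` shape and the `CaptionBuffer` class are hypothetical and are not part of the SDK, so inspect the data object passed to your 'transcription' callback to see which fields it actually carries.

```typescript
// Hypothetical shape; map the real event payload onto it in your callback.
interface CaptionLine {
  speaker: string;
  text: string;
}

// Keeps the most recent caption lines and renders them as plain text.
class CaptionBuffer {
  private lines: CaptionLine[] = [];
  private readonly maxLines: number;

  constructor(maxLines: number) {
    this.maxLines = maxLines;
  }

  // Add a line, ignoring blank text and dropping the oldest lines when full.
  push(line: CaptionLine): void {
    if (line.text.trim() === '') {
      return;
    }
    this.lines.push(line);
    if (this.lines.length > this.maxLines) {
      this.lines = this.lines.slice(this.lines.length - this.maxLines);
    }
  }

  // One "speaker: text" entry per line, oldest first.
  render(): string {
    return this.lines.map((l) => `${l.speaker}: ${l.text}`).join('\n');
  }
}
```

Inside the 'transcription' callback, you would push a line built from the event data and then write `render()` into a caption element, for example through its `textContent`.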

If you need a complete example of this integration, please refer to https://github.com/dyte-io/google-transcription/blob/main/client/demo/index.ts