Transcriptions - Google STT
Date: 2024-01-27
To add real-time audio transcriptions to a Dyte meeting, you can use Google Cloud Speech-to-Text.
These Google services are paid, and a Google Cloud account is required to proceed.
This integration is Web only at the moment.
Integration Steps
1. Set Up Google Cloud Credentials
You must have a service account with GCP (Google Cloud Platform) to use Google transcriptions. Please create a project in that account with the Google Media Translation and Google Translation APIs enabled.
Once done, download the keys for the service account.
2. Set Up a Server
Set up a server to forward audio data from the client to Google Cloud. Your GCP credentials should not be exposed on the client side, so a server is needed to relay the audio data to Google Cloud.
We provide a NodeJS sample for this: [dyte-io/google-transcription](https://github.com/dyte-io/google-transcription/tree/main/server). Currently, we only have NodeJS samples; if you're working with a different backend, feel free to port this code or connect with us for help porting it.
To use this sample, clone the repository using the following command.
git clone git@github.com:dyte-io/google-transcription.git
2.1 Environment Setup
cp .env.example .env
Edit the .env file with your GCP service account credentials and save it.
Note: PRIVATE_KEY must be on a single line. Copy the value from the service account key's JSON file as is.
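A service account key file stores the private key as a JSON string in which line breaks appear as literal \n sequences, which is why copying the value as is gives a single line. The sketch below shows one way a server could turn such a value back into a multi-line PEM string before passing it to a Google client. normalizePrivateKey is a hypothetical helper introduced for illustration; whether the sample server already performs this conversion is an assumption to check against its source.

```typescript
// Hypothetical helper, not part of the Dyte sample server.
// Converts a single-line PRIVATE_KEY value (with literal "\n" sequences)
// into a PEM string with real line breaks.
function normalizePrivateKey(raw: string): string {
  let value = raw.trim();
  // Strip surrounding double quotes that may be left over from copy-pasting.
  if (
    value.length >= 2 &&
    value.charAt(0) === '"' &&
    value.charAt(value.length - 1) === '"'
  ) {
    value = value.slice(1, -1);
  }
  // Replace each literal backslash + "n" pair with an actual newline.
  return value.replace(/\\n/g, "\n");
}

// Example with a dummy key body (not a real key).
const singleLine =
  "-----BEGIN PRIVATE KEY-----\\nABC123\\n-----END PRIVATE KEY-----\\n";
const pem = normalizePrivateKey(singleLine);
```

If the value in .env already contains real line breaks, the helper leaves it unchanged.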
2.2 Run the server
npm install
This automatically installs @google-cloud/media-translation, @google-cloud/speech, and @google-cloud/translate.
npm run dev
The HTTP endpoint where this server is accessible is referred to as backend_url in the rest of this guide.
3. Frontend Setup
3.1 Installation
npm install @dytesdk/google-transcription
Source available at [dyte-io/google-transcription](https://github.com/dyte-io/google-transcription/tree/main/client).
3.2 Integrate
The next step is to find the place in your codebase where you initialize a Dyte meeting.
Once you have found it and have access to the meeting object, add the following import to that file.
import DyteGoogleSpeechRecognition from '@dytesdk/google-transcription';
Add the following code just after the point where you have access to the meeting object.
const speech = new DyteGoogleSpeechRecognition({
  meeting, // Dyte meeting object from DyteClient.init
  target: 'hi', // Language that the current user wants to see
  source: 'en-US', // Language that the current user would speak in
  baseUrl: '<backend-url>', // Replace with the backend_url from step 2.2
});
speech.on('transcription', async (data) => {
  // ... do something with transcription
});
speech.transcribe();
Here you set up DyteGoogleSpeechRecognition with the current user's preferred values, register a listener for every new transcription using speech.on('transcription', aJsCallbackFunction), and then start recognition using speech.transcribe().
To see the supported languages, please refer to:
- https://cloud.google.com/speech-to-text/docs/speech-to-text-supported-languages
- https://cloud.google.com/translate/docs/languages
With this, you can now receive live transcriptions. Feel free to display them in your UI as needed.
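As one possible way to show transcriptions in a UI, the sketch below keeps only the most recent caption lines and renders them as text. This guide does not document the fields of the transcription payload, so the TranscriptionLike shape (name, transcript) and the CaptionBuffer class are hypothetical and should be adjusted to the data the SDK actually emits.

```typescript
// Hypothetical payload shape: `name` and `transcript` are assumptions
// for illustration, not a documented SDK type.
interface TranscriptionLike {
  name?: string;
  transcript?: string;
}

// Keeps the last `maxLines` caption lines for display.
class CaptionBuffer {
  private lines: string[] = [];

  constructor(private readonly maxLines: number = 3) {}

  push(data: TranscriptionLike): void {
    const text = (data.transcript || "").trim();
    if (text === "") return; // ignore empty updates
    const speaker = data.name || "Unknown";
    this.lines.push(speaker + ": " + text);
    // Drop the oldest lines once the buffer exceeds its limit.
    if (this.lines.length > this.maxLines) {
      this.lines = this.lines.slice(this.lines.length - this.maxLines);
    }
  }

  render(): string {
    return this.lines.join("\n");
  }
}

const captions = new CaptionBuffer(2);
captions.push({ name: "Asha", transcript: "Hello" });
captions.push({ name: "Ben", transcript: "   " }); // skipped: empty text
captions.push({ name: "Ben", transcript: "Hi there" });
captions.push({ transcript: "Third line" });
const rendered = captions.render();
```

Inside the speech.on('transcription', ...) callback, you could call captions.push(data) and write captions.render() into a caption element of your choice.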
If you need a sample of this guide, please refer to https://github.com/dyte-io/google-transcription/blob/main/client/demo/index.ts