
Meeting Transcription

beta

The meeting transcription feature is currently in beta, which means it is still being tested and evaluated, and may undergo some change. This feature is not accessible to the public at the moment and will be activated solely upon request, subject to our team's assessment of your usage and needs. If you wish to have this feature enabled for your organization, please get in touch with us.

Dyte's meeting transcription allows you to transcribe your Dyte meetings in real-time, making it easy to capture important discussions and refer back to them later. This guide will walk you through how to use this feature effectively.

Control transcriptions for participants using presets

You can control whether a participant's audio is transcribed with the transcription_enabled flag in the participant's preset. All participants with the transcription_enabled flag turned on in their preset will be able to generate transcripts in real-time in a Dyte meeting.

You can create a new preset on our Developer Portal, or using our REST API.

Modify Transcription Behavior with AI Config

When creating a meeting using the REST API, you can pass an AI configuration that controls how transcription behaves. The configuration supports the options described below.

Supported Languages

You can specify the language for transcription to ensure accurate and relevant results. The following languages are supported:

  • English (United States): en-US
  • English (India): en-IN
  • German: de
  • Hindi: hi
  • Swedish: sv
  • Russian: ru
  • Polish: pl
  • Greek: el
  • French: fr
  • Dutch: nl

"language": "en-US"

Keywords

Keywords can be added to help the transcription engine accurately detect and transcribe specific terms, such as names, technical jargon, or other context-specific words. This is particularly useful in meetings where certain terms are frequently used and need to be recognized correctly.

"keywords": ["Dyte", "Mary", "Sue"]

Profanity Filter

You can enable or disable the profanity filter based on your needs. When the filter is enabled, offensive language is excluded from the transcriptions; when it is disabled, offensive language is included.

"profanity_filter": false

Together, these options let you customize the transcription experience to better suit your needs.

Example Configuration

Here is an example of how to pass the AI configuration in the meeting creation API:

{
  "title": "Meeting Transcriptions",
  "ai_config": {
    "transcription": {
      "keywords": ["Dyte"],
      "language": "en-US",
      "profanity_filter": false
    }
  }
}
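
As a minimal sketch, the request body above could be assembled and checked in JavaScript before it is sent. The helper name buildMeetingPayload, its defaults, and its validation rules are illustrative assumptions and not part of the Dyte API; only the field names and language codes come from this guide.

```javascript
// Language codes listed in the "Supported Languages" section of this guide.
const SUPPORTED_LANGUAGES = ['en-US', 'en-IN', 'de', 'hi', 'sv', 'ru', 'pl', 'el', 'fr', 'nl'];

// Hypothetical helper: builds the body for the meeting creation request.
// The defaults below are choices made for this sketch, not documented API defaults.
function buildMeetingPayload(title, { language = 'en-US', keywords = [], profanityFilter = false } = {}) {
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    throw new Error(`Unsupported transcription language: ${language}`);
  }
  if (!Array.isArray(keywords) || keywords.some((k) => typeof k !== 'string')) {
    throw new TypeError('keywords must be an array of strings');
  }
  return {
    title,
    ai_config: {
      transcription: {
        keywords,
        language,
        profanity_filter: Boolean(profanityFilter),
      },
    },
  };
}

const payload = buildMeetingPayload('Meeting Transcriptions', { keywords: ['Dyte'] });
console.log(JSON.stringify(payload, null, 2));
```

The resulting object can be serialized with JSON.stringify and used as the body of the meeting creation request.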

Consuming transcripts

There are three ways to consume these transcripts:

  1. Client core SDK: The transcripts can be consumed on the client-side using the Dyte SDK that's suitable for your platform. These transcripts are generated on the server in real-time.
  2. Webhooks: The meeting transcript can be consumed via a webhook after the meeting ends.
  3. REST API: The meeting transcript can also be fetched via the REST API.

Consuming transcripts in real-time

For consuming transcripts in real-time on the client SDK of your choice, you just need to ensure that the transcription_enabled flag is enabled in the preset. Transcripts for all participants who have this flag set will be broadcast in the meeting.

You can use the meeting.ai object to access the transcripts.

console.log(meeting.ai.transcripts);

The transcripts are also emitted by the meeting.ai object, so a listener can be attached to it.

meeting.ai.on('transcript', (transcriptData) => {
  console.log('Transcript:', transcriptData);
});

As participants speak during the meeting, you'll receive partial transcripts, giving you real-time feedback even before they finish their sentences. The isPartialTranscript flag in the transcript data shows whether the transcript is partial or final.

{
  "id": "1a2b3c4d-5678-90ab-cdef-1234567890ab",
  "name": "Alice",
  "peerId": "4f5g6h7i-8j9k-0lmn-opqr-1234567890st",
  "userId": "uvwxyz-1234-5678-90ab-cdefghijklmn",
  "customParticipantId": "abc123xyz",
  "transcript": "Hello?",
  "isPartialTranscript": true,
  "date": "Wed Aug 07 2024 10:15:30 GMT+0530 (India Standard Time)"
}

In the example above, isPartialTranscript is true, indicating the transcript is still in progress. Once the participant finishes speaking, the final transcript is sent with isPartialTranscript set to false. This lets you distinguish between ongoing speech and completed transcriptions.
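
One possible way to handle partial and final transcripts on the client is sketched below. The TranscriptBuffer class is a hypothetical helper, not part of the Dyte SDK. It assumes that each new partial transcript from a peer supersedes that peer's previous partial, and that the final transcript replaces it; only the field names peerId, name, transcript, and isPartialTranscript come from the example above.

```javascript
// Hypothetical helper: keeps the latest partial transcript per peer and
// moves it to a list of finalized lines once isPartialTranscript is false.
class TranscriptBuffer {
  constructor() {
    this.pending = new Map(); // peerId -> latest partial transcript data
    this.finalized = [];      // completed utterances, in arrival order
  }

  handle(transcriptData) {
    if (transcriptData.isPartialTranscript) {
      // Assumption: a newer partial replaces the previous one for the same peer.
      this.pending.set(transcriptData.peerId, transcriptData);
    } else {
      this.pending.delete(transcriptData.peerId);
      this.finalized.push(transcriptData);
    }
  }

  // Finalized lines first, then whatever is still being spoken (marked with "...").
  lines() {
    const format = (t) => `${t.name}: ${t.transcript}`;
    return [
      ...this.finalized.map(format),
      ...[...this.pending.values()].map((t) => `${format(t)} ...`),
    ];
  }
}

// With the SDK event shown above (requires a joined meeting):
// meeting.ai.on('transcript', (data) => buffer.handle(data));
const buffer = new TranscriptBuffer();
buffer.handle({ peerId: 'p1', name: 'Alice', transcript: 'Hello?', isPartialTranscript: true });
buffer.handle({ peerId: 'p1', name: 'Alice', transcript: 'Hello? Can you hear me?', isPartialTranscript: false });
console.log(buffer.lines());
```

Keying the pending entries by peerId keeps the display stable while someone is mid-sentence; if your application needs per-utterance tracking instead, a different key may be more appropriate.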

Consume transcript via a post-meeting webhook

You can configure a webhook with the meeting.transcript event enabled to receive the meeting transcript after the meeting has ended. You can do this either on our Developer Portal, or using our REST API.

You can see the webhook format here.

Fetch the meeting transcript

You do not need to rely on the webhook to get the transcript for a meeting. Dyte provides a REST API that returns the transcript for a particular session, so you can fetch the transcript at a later time. Dyte stores the transcript of a meeting for 7 days from the start of the meeting.

The transcript is returned as a CSV file with the following columns:

Timestamp, Participant ID, User ID, Custom Participant ID, Participant Name, Transcript

The following is a description of all the fields specified in the above CSV.

  1. Timestamp: An ISO 8601 format string indicating the time of utterance (or the time of speech).
  2. Participant ID: An identifier for individual peers in the meeting. For instance, if the participant joins the meeting twice, both the "peers" will have the same User ID but different Participant IDs.
  3. User ID: An identifier for a participant in the meeting, as returned by the add participant API call.
  4. Custom Participant ID: An identifier that you can specify to identify a user. This can be sent in the request body of the add participant API call.
  5. Participant Name: The display name of the user.
  6. Transcript: The transcribed utterance.
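
The CSV described above could be turned into objects with a small parser such as the following sketch. The function names and object keys are hypothetical, the sample row is made up for illustration, and the code assumes standard CSV quoting (fields containing commas are wrapped in double quotes) and that a header row, if present, starts with "Timestamp".

```javascript
// Column order taken from the CSV format described above; key names are illustrative.
const COLUMNS = ['timestamp', 'participantId', 'userId', 'customParticipantId', 'participantName', 'transcript'];

// Minimal CSV parser: handles quoted fields, escaped quotes ("") and
// commas or newlines inside quoted fields.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { field += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(field); field = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') i++; // treat CRLF as one line break
      row.push(field); field = '';
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  if (field !== '' || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}

function parseTranscriptCsv(text) {
  return parseCsv(text)
    .filter((cells) => cells.some((c) => c.trim() !== ''))  // drop blank lines
    .filter((cells) => cells[0].trim() !== 'Timestamp')     // drop header row, if any
    .map((cells) => Object.fromEntries(
      COLUMNS.map((name, i) => [name, (cells[i] ?? '').trim()])
    ));
}

// Made-up sample data for illustration only.
const sample = [
  'Timestamp, Participant ID, User ID, Custom Participant ID, Participant Name, Transcript',
  '2024-08-07T04:45:30.000Z,peer-1,user-1,abc123xyz,Alice,"Hello, can you hear me?"',
].join('\n');
console.log(parseTranscriptCsv(sample));
```

For production use, an established CSV library is likely a safer choice than a hand-written parser.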

Testing transcription

Once you have configured a preset and a webhook according to the instructions above, you can test whether meeting transcription is working for your organization by performing the following steps.

  1. Create a meeting.
  2. Add a participant to the meeting. Make sure that the preset you use was configured according to this guide.
  3. Join the meeting with the authToken you just obtained. As you unmute and speak, your speech should be transcribed in real-time for all the participants in the meeting.
  4. Once the meeting ends, you will receive a webhook with the event meeting.transcript. The body of this webhook will contain the entire meeting transcript.