Real-time Transcriptions for Video


Legal notice

Real-time Transcriptions uses artificial intelligence or machine learning technologies. If you enable or use any of the features or functionalities within Twilio Video that Twilio identifies as using artificial intelligence or machine learning technology, you acknowledge and agree that your use of these features or functionalities is subject to the terms of the Predictive and Generative AI/ML Features Addendum.

Real-time Transcriptions converts speech from any participant in a Video Room into text and sends that text to the Video Client SDKs (JavaScript, iOS, and Android). Your application can render the text in any style and format. Twilio supports multiple speech models, so you can choose the model that best fits your use case.

Your app can implement transcriptions in two ways:

  • Start when your app creates a Video Room.
  • Start, stop, or restart on demand while the Room is active.

You turn on Real-time Transcriptions at the Video Room level, so every participant's speech gets transcribed. You configure the spoken language and speech model, and the settings remain in effect until the Room ends.

While transcription is active, Twilio delivers the transcribed text, along with the Participant SID, to every participant in the Room.

If you turn on partial results, the transcription engine delivers interim results so that your app can refresh its interface in near real-time.

You can also set a default configuration in the Twilio Console.


REST APIs

Transcriptions subresource

Transcriptions exists as a subresource of the Room resource and represents a Room's transcript.

Transcriptions resource URI

```
/v1/Rooms/{RoomNameOrSid}/Transcriptions/{TranscriptionTtid}
```
| Property | Type | Description |
| --- | --- | --- |
| `ttid` | ttid | The Twilio Type ID of the Transcriptions resource. It is assigned at creation time using the following format: `video_transcriptions_{uuidv7-special-encoded}`. |
| `room_sid` | `SID<RM>` | The ID of the Room instance that is the parent of the Transcriptions resource. |
| `account_sid` | `SID<AC>` | The account ID that owns the Transcriptions resource. |
| `status` | `enum<string>` | The status of the Transcriptions resource. It can be: `started`, `stopped`, or `failed`. The resource is created with status `started` by default. |
| `configuration` | `object<string-map>` | Key-value map with configuration parameters applied to audio track transcriptions. To review the list of properties, see the Properties for the configuration object section. |
| `date_created` | `string<date-time>` | The date and time in UTC when the resource was created, specified in ISO 8601 format. |
| `date_updated` | `string<date-time>` | The date and time in UTC when the resource was last updated, specified in ISO 8601 format. Null if the resource hasn't been updated since creation. |
| `start_time` | `string<date-time>` | The date and time in UTC when the resource was in state `started` and the first participant joined the Room, specified in ISO 8601 format. |
| `end_time` | `string<date-time>` | The date and time in UTC when transcription processing paused in the Room, specified in ISO 8601 format. This happens when the resource is set to `stopped` or the last participant has left the Room. |
| `duration` | integer | The cumulative time in seconds that the Transcriptions resource has been in state `started` with at least one participant in the Room. This is independent of whether audio tracks are published or muted. |
| `url` | `string<uri>` | The absolute URL of the resource. |

If Twilio detects an internal error that prevents transcription generation, the Transcriptions resource changes its status to failed and the Twilio Console receives a debug event with the details of the failure. After a failure, you can't restart that resource.
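These status rules can be summarized in a small guard. The following JavaScript sketch is illustrative only (the function name is ours, not part of any Twilio SDK); it encodes the documented behavior that a failed resource can't be restarted and that repeating the current status causes no transition:

```javascript
// Illustrative guard (not a Twilio API): decide whether a requested
// Status update on a Transcriptions resource is a meaningful transition.
function canTransition(currentStatus, requestedStatus) {
  // Once a resource has failed, it can't be restarted.
  if (currentStatus === 'failed') return false;
  // Setting the same status again causes no state transition.
  if (currentStatus === requestedStatus) return false;
  // Otherwise the resource toggles between started and stopped.
  return requestedStatus === 'started' || requestedStatus === 'stopped';
}

console.log(canTransition('stopped', 'started')); // true: restart allowed
console.log(canTransition('failed', 'started'));  // false: can't restart after failure
```

Client code might use such a check before issuing the POST requests described in the following sections.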

Properties for the configuration object

| Name | Type | Necessity | Default | Description |
| --- | --- | --- | --- | --- |
| `transcriptionEngine` | string | Optional | `"google"` | The transcription engine Twilio uses. To learn about the possible values, see the transcription engine table. |
| `speechModel` | string | Optional | `"telephony"` | The provider-supported recognition model that the transcription engine uses. To learn about the possible values, see the speech model table. |
| `languageCode` | string | Optional | `"en-US"` | The language code that the transcription engine uses, specified in BCP-47 format. This attribute ensures that the transcription engine understands and processes the spoken language. |
| `partialResults` | Boolean | Optional | `false` | Indicates whether to send partial results. When `true`, the transcription engine sends interim results as the transcription progresses, providing more immediate feedback before the final result arrives. |
| `profanityFilter` | Boolean | Optional | `true` | Indicates whether the server tries to filter profanities, replacing all but the initial character in each filtered word with asterisks. Google provides this feature. |
| `hints` | string | Optional | None | A list of words or phrases that the transcription provider can expect to encounter during a transcription. Hints can improve the provider's recognition of words or phrases expected during the video call. You can provide up to 500 words or phrases, each entry separated with a comma. Each word or phrase may be up to 100 characters. Separate the words in a phrase with spaces. |
| `enableAutomaticPunctuation` | Boolean | Optional | `true` | Indicates whether the provider adds punctuation to the transcribed text. When enabled, the transcription engine inserts punctuation marks such as periods, commas, and question marks, improving readability. |
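To illustrate the hints limits above, here is a JavaScript sketch (buildConfiguration is our own helper, not a Twilio API) that assembles a configuration object, joining hint phrases with commas and enforcing the documented limits of 500 entries and 100 characters each:

```javascript
// Illustrative helper (not a Twilio API): build the JSON string for a
// transcription configuration, validating the documented hints limits.
function buildConfiguration({ hints = [], ...options }) {
  if (hints.length > 500) {
    throw new Error('At most 500 hint words or phrases are allowed');
  }
  for (const hint of hints) {
    if (hint.length > 100) {
      throw new Error(`Hint exceeds 100 characters: ${hint}`);
    }
  }
  const config = { ...options };
  if (hints.length > 0) {
    // Entries are comma-separated; words within a phrase are space-separated.
    config.hints = hints.join(',');
  }
  return JSON.stringify(config);
}

const body = buildConfiguration({
  languageCode: 'en-US',
  speechModel: 'long',
  partialResults: true,
  hints: ['Twilio', 'Video Room'],
});
console.log(body);
// {"languageCode":"en-US","speechModel":"long","partialResults":true,"hints":"Twilio,Video Room"}
```

The resulting string is the kind of value passed as the Configuration or TranscriptionsConfiguration parameter in the REST requests shown later on this page.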

Transcription engines and speech models

The following table lists the possible values for the transcriptionEngine property and the associated speechModel values.

| Transcription engine | Speech model | Use case | Example |
| --- | --- | --- | --- |
| google | `telephony` | Use this model for telephone call audio. | |
| google | `medical_conversation` | Use this model for conversations between a medical provider and a patient. | |
| google | `long` | Use this model for any type of long-form content. | Media, spontaneous speech, and conversations |
| google | `short` | Use this model for short utterances that are a few seconds in length. Consider using this model instead of the command and search model. | Commands or other single, short, directed speech |
| google | `telephony_short` | Use this model for short or even single-word utterances in audio that originated from a phone call. | Customer service, teleconferencing, and kiosk applications |
| google | `medical_dictation` | Use this model to transcribe notes dictated by a medical professional. | |
| google | `chirp_telephony` | Use this model for telephone call audio in multiple languages. It uses the Google Universal large Speech Model (USM). | |
| google | `chirp` | Use this model for audio content in multiple languages. It uses the Google Universal large Speech Model (USM). | |
| deepgram | `nova-3` | Recommended for meetings, captioning, and noisy or far-field audio. | |
| deepgram | `nova-2` | Recommended for languages that `nova-3` doesn't support. | |
Create a Room with transcriptions enabled

To create a Video Room with Real-time Transcriptions enabled, add the following two parameters to the Room POST request:

| Parameter | Type | Description |
| --- | --- | --- |
| `TranscribeParticipantsOnConnect` | Boolean | Whether to start Real-time Transcriptions when Participants connect. Default is `false`. |
| `TranscriptionsConfiguration` | object | Key-value configuration settings for the transcription engine. To learn more, see Properties for the configuration object. |

To turn on transcriptions for the Video Room, set the TranscribeParticipantsOnConnect parameter to true.

Create a Room with transcriptions enabled example request

```shell
curl -X POST "https://video.twilio.com/v1/Rooms" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode 'TranscriptionsConfiguration={"languageCode": "en-US", "partialResults": true}' \
  --data-urlencode "TranscribeParticipantsOnConnect=true"
```

Create a Room with transcriptions enabled example response

```json
{
  "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "audio_only": false,
  "date_created": "2025-06-12T15:42:32Z",
  "date_updated": "2025-06-12T15:42:32Z",
  "duration": null,
  "empty_room_timeout": 5,
  "enable_turn": true,
  "end_time": null,
  "large_room": false,
  "links": {
    "participants": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Participants",
    "recording_rules": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/RecordingRules",
    "recordings": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Recordings",
    "transcriptions": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions"
  },
  "max_concurrent_published_tracks": 170,
  "max_participant_duration": 14400,
  "max_participants": 50,
  "media_region": "us1",
  "record_participants_on_connect": false,
  "sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "status": "in-progress",
  "status_callback": null,
  "status_callback_method": "POST",
  "type": "group",
  "unique_name": "test",
  "unused_room_timeout": 5,
  "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "video_codecs": [
    "VP8",
    "H264"
  ]
}
```

Start transcriptions on an existing Room

To create a Transcriptions resource, send a POST request to the following URI:

Create Transcriptions resource URI

```
/v1/Rooms/{RoomNameOrSid}/Transcriptions
```

URI parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `RoomSid` | `SID<RM>` | The ID of the parent Room in which to create the Transcriptions resource. |

Request body parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `Configuration` | `object<string-map>` | Object with key-value configurations. |

Start Transcriptions for an existing Room example request

```shell
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode 'Configuration={"languageCode": "en-US", "partialResults": true, "profanityFilter": true, "speechModel": "long"}'
```

Start Transcriptions for an existing Room example response

```json
{
  "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "configuration": {
    "languageCode": "en-US",
    "partialResults": "true",
    "profanityFilter": "true",
    "speechModel": "long"
  },
  "date_created": "2025-07-22T14:14:35Z",
  "date_updated": null,
  "duration": null,
  "end_time": null,
  "room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "start_time": null,
  "status": "started",
  "ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
}
```

Stop transcription on a Room

To stop a Transcriptions resource, send a POST request to the following resource instance URI:

Stop Transcriptions resource URI

```
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
```

To stop transcriptions on a Room, set the Status parameter to stopped.

URI parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `ttid` | ttid | The TTID of the Transcriptions resource being updated. |
| `RoomSid` | `SID<RM>` | The ID of the parent Room that contains the Transcriptions resource. |

Request body parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `Status` | `enum<string>` | New status of the Transcriptions resource. Can be: `started` or `stopped`. There is no state transition if the resource's status already has the same value or if the parameter is missing. |

Stop a Transcription in an existing Room example request

```shell
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode "Status=stopped"
```

Stop a Transcription in an existing Room example response

```json
{
  "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "configuration": {
    "languageCode": "en-US",
    "partialResults": "true"
  },
  "date_created": "2025-07-22T12:55:30Z",
  "date_updated": "2025-07-22T12:56:02Z",
  "duration": null,
  "end_time": null,
  "room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "start_time": null,
  "status": "stopped",
  "ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
}
```

Restart Transcriptions on a Room

To restart a stopped Transcriptions resource in a Room, send a POST request to the following resource instance URI:

Restart Transcriptions resource URI

```
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
```

To restart transcriptions, set the Status parameter to started.

URI parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `ttid` | ttid | The TTID of the Transcriptions resource being updated. The current implementation supports a single Transcriptions resource, but this might change in future implementations. |
| `RoomSid` | `SID<RM>` | The ID of the parent Room that contains the Transcriptions resource. |

Request body parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `Status` | `enum<string>` | The status of the Transcriptions resource. To restart transcriptions, set to `started`. There is no state transition if the resource's status already has the same value or if the parameter is missing. |

Restart transcription example request

```shell
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode "Status=started"
```

Restart transcription example response

```json
{
  "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "configuration": {
    "languageCode": "en-US",
    "partialResults": "true"
  },
  "date_created": "2025-07-22T12:57:24Z",
  "date_updated": null,
  "duration": null,
  "end_time": null,
  "room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "start_time": null,
  "status": "started",
  "ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
}
```

Fetch a list of the Transcriptions resources for a Room

To fetch a list of Transcriptions resources in a Room, send a GET request to the following resource URI:

Fetch a list of Transcriptions resource URI

```
/v1/Rooms/{RoomSid}/Transcriptions
```

Real-time Transcriptions supports only a single instance of the Transcriptions resource per Room, so the list contains a single item.

| Parameter | Type | Description |
| --- | --- | --- |
| `RoomSid` | `SID<RM>` | The ID of the parent Room that has the Transcriptions resources. |

Retrieve a transcription example request

```shell
curl -X GET "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
```

Retrieve a transcription example response

```json
{
  "meta": {
    "first_page_url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions?PageSize=50&Page=0",
    "key": "transcriptions",
    "next_page_url": null,
    "page": 0,
    "page_size": 50,
    "previous_page_url": null,
    "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions?PageSize=50&Page=0"
  },
  "transcriptions": [
    {
      "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "configuration": {},
      "date_created": "2025-07-22T11:05:41Z",
      "date_updated": null,
      "duration": null,
      "end_time": null,
      "room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "start_time": null,
      "status": "started",
      "ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
      "url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
    }
  ]
}
```

Fetch a single Transcriptions resource

To fetch a Transcriptions resource in a Room, send a GET request to the following resource instance URI:

Fetch one specific Transcriptions resource URI

```
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
```

| Parameter | Type | Description |
| --- | --- | --- |
| `ttid` | ttid | The TTID of the Transcriptions resource being requested. |
| `RoomSid` | `SID<RM>` | The ID of the parent Room from which you fetch the Transcriptions resource. |

Fetch one transcription example request

```shell
curl -X GET "https://video.twilio.com/v1/Rooms/{room_sid}/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
```

Fetch one transcription example response

```json
{
  "account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "configuration": {
    "languageCode": "en-US",
    "profanityFilter": "true"
  },
  "date_created": null,
  "date_updated": null,
  "duration": null,
  "end_time": null,
  "links": {
    "transcriptions": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
  },
  "room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  "start_time": null,
  "status": "started",
  "ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
}
```

Transcribed text delivery

Twilio delivers transcribed text to the client SDKs through callback events.

The schema of the JSON delivery format contains a version number. Each event contains the transcription of a single utterance and details of the participant who generated the audio.

```yaml
properties:
  type:
    const: extension_transcriptions

  version:
    description: |
      Version of the transcriptions protocol used by this message.
      It is semver compliant.

  track:
    $ref: /Server/State/RemoteTrack
    description: |
      Audio track from which the transcription has been generated.

  participant:
    $ref: /Server/State/Participant
    description: |
      The participant who published the audio track from which the
      transcription has been generated.

  sequence_number:
    type: integer
    description: |
      Sequence number. Starts with one and increments monotonically.
      A sequence counter is defined for each track to allow the
      receiver to identify missing messages.

  timestamp:
    type: string
    description: |
      Absolute time from the real-time transcription. It is
      conformant with UTC ISO 8601.

  partial_results:
    type: boolean
    description: |
      Whether the transcription is a final or a partial result.

  stability:
    type: double
    description: |
      Indicates how likely it is that this partial result transcript
      won't be updated again. The range is from 0.0 (unstable) to
      1.0 (stable). This field is only provided when partialResults
      is true.

  language_code:
    type: string
    description: |
      Language code of the transcribed text. It is conformant with BCP-47.

  transcription:
    type: string
    description: |
      Utterance transcription.
```

Transcription example response

```json
{
  "version": "1.0",
  "language_code": "en-US",
  "partial_results": false,
  "participant": "PA00000000000000000000000000000000",
  "sequence_number": 3,
  "timestamp": "2025-01-01T12:00:00.000000000Z",
  "track": "MT00000000000000000000000000000000",
  "transcription": "This is a test",
  "type": "extension_transcriptions"
}
```
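Since sequence_number starts at one and increments monotonically per track, a receiver can spot dropped messages with a per-track counter. The following JavaScript sketch is illustrative only; the names are ours, not part of the Video SDKs:

```javascript
// Illustrative sketch: remember the last sequence_number seen for each
// audio track and report how many messages were missed in between.
const lastSeen = new Map();

function missedMessages(event) {
  const previous = lastSeen.get(event.track) ?? 0;
  lastSeen.set(event.track, event.sequence_number);
  // sequence_number increments by one per message on a track, so any
  // larger jump means messages went missing.
  return Math.max(0, event.sequence_number - previous - 1);
}

console.log(missedMessages({ track: "MT00000000000000000000000000000000", sequence_number: 1 })); // 0
console.log(missedMessages({ track: "MT00000000000000000000000000000000", sequence_number: 2 })); // 0
console.log(missedMessages({ track: "MT00000000000000000000000000000000", sequence_number: 5 })); // 2
```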

Partial results and stability

When you set the partialResults parameter to true, the transcription engine provides a series of partial results as it determines the text corresponding to the spoken utterance.

The stability property indicates the probability that a partial result will change before the delivery of the final result. This value ranges from 0.0 (unstable) to 1.0 (stable). In general, consider partial results with stability less than 0.9 as preliminary and temporary. When building an app element that displays the transcribed text as captions or subtitles, filter out partial results with a stability value less than 0.9. This avoids text flickering as the app receives partial results.
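When rendering captions, the 0.9 threshold can be applied with a small predicate. This JavaScript sketch is illustrative (the function name is ours); the event fields follow the delivery schema:

```javascript
// Illustrative sketch: render final results always, and partial results
// only once their stability reaches the suggested 0.9 threshold.
function shouldRenderCaption(event, minStability = 0.9) {
  if (!event.partial_results) return true; // final results always render
  return (event.stability ?? 0) >= minStability;
}

console.log(shouldRenderCaption({ partial_results: false }));                  // true
console.log(shouldRenderCaption({ partial_results: true, stability: 0.95 })); // true
console.log(shouldRenderCaption({ partial_results: true, stability: 0.4 }));  // false
```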

Start and stop transcript events with the JavaScript SDK

To turn on the flow of transcript events, set the receiveTranscriptions parameter in connectOptions to true. This parameter defaults to false. With Real-time Transcriptions enabled for the Room and receiveTranscriptions set to true, callback events containing the transcribed text start to flow.

JavaScript transcription event example

```javascript
import { connect } from 'twilio-video';

const room = await connect(token, {
  name: 'my-room',
  receiveTranscriptions: true
});

room.on('transcription', (transcriptionEvent) => {
  console.log(`${transcriptionEvent.participant}: ${transcriptionEvent.transcription}`);
});
```

Start and stop transcript events with the iOS SDK

To receive transcription events, set the receiveTranscriptions parameter in TVIConnectOptions to true. This parameter defaults to false. To fetch this value, use the isReceiveTranscriptionsEnabled getter.

With Real-time Transcriptions enabled for the Room and receiveTranscriptions set to true, the transcriptionReceived(room:transcription:) method in the RoomDelegate protocol delivers callback events containing the transcribed text.

Swift transcription event example

```swift
let options = ConnectOptions(token: accessToken, block: { (builder) in
    builder.roomName = "test"
    builder.isReceiveTranscriptionsEnabled = true
})
```

Start and stop transcript events with the Android SDK

To receive transcription events, set the receiveTranscriptions parameter in ConnectOptions to true. This parameter defaults to false. To check the setting, call isReceiveTranscriptionsEnabled().

With Real-time Transcriptions enabled for the Room and receiveTranscriptions set to true, the onTranscription(@NonNull Room room, @NonNull JSONObject json) method of the Room.Listener interface delivers callback events containing the transcribed text.

Java transcription event example

```java
ConnectOptions connectOptions = new ConnectOptions.Builder(accessToken)
    .receiveTranscriptions(true)
    .build();

Video.connect(context, connectOptions, roomListener);
```

Twilio Console configuration

To enable and configure Real-time Transcriptions in the Twilio Console, complete the following steps.

  1. Log in to the Twilio Console.
  2. Go to Video > Manage > Room Settings.
  3. Scroll to Realtime Transcriptions.
  4. Click Accept for the Predictive and Generative AI/ML Features Addendum.
  5. Click Enabled for the Automatically turn on Realtime Transcriptions by default in Rooms setting.
  6. Click Save.

AI Nutrition Facts

Real-time Transcriptions for Video uses third-party artificial intelligence and machine learning technologies.

To improve your understanding of how AI handles your data, Twilio's AI Nutrition Facts provide an overview of the AI feature you're using. The following Speech to Text Transcriptions Nutrition Facts label outlines the AI qualities of Real-time Transcriptions for Video.

AI Nutrition Facts

Speech to Text Transcriptions - Programmable Voice, Twilio Video, and Conversational Intelligence

Description
Generate speech to text voice transcriptions (real-time and post-call) in Programmable Voice, Twilio Video, and Conversational Intelligence.
Privacy Ladder Level
N/A
Feature is Optional
Yes
Model Type
Generative and Predictive - Automatic Speech Recognition
Base Model
Deepgram Speech-to-Text, Google Speech-to-Text, Amazon Transcribe

Trust Ingredients

Base Model Trained with Customer Data
No

Conversational Intelligence, Programmable Voice, and Twilio Video only use the default Base Model provided by the Model Vendor. The Base Model is not trained using customer data.

Customer Data is Shared with Model Vendor
No

Conversational Intelligence, Programmable Voice, and Twilio Video only use the default Base Model provided by the Model Vendor. The Base Model is not trained using customer data.

Training Data Anonymized
N/A

Base Model is not trained using any customer data.

Data Deletion
Yes

Transcriptions are deleted by the customer using the Conversational Intelligence API or when a customer account is deprovisioned.

Human in the Loop
Yes

The customer views output in the Conversational Intelligence API or Transcript Viewer.

Data Retention
Until the customer deletes

Compliance

Logging & Auditing
Yes

The customer can listen to the input (recording) and view the output (transcript).

Guardrails
Yes

The customer can listen to the input (recording) and view the output (transcript).

Input/Output Consistency
Yes

The customer is responsible for human review.

Other Resources
https://www.twilio.com/docs/conversational-intelligence

  • To use the Google medical_conversation model, set enableAutomaticPunctuation to true.
  • When a Room reaches the MaxParticipantDuration time limit, Transcriptions stop. As a workaround, set the MaxParticipantDuration parameter of the Room to a value that exceeds the expected lifetime of the Room. This parameter defaults to four hours.