Real-time Transcriptions for Video
Legal notice
Real-time Transcriptions uses artificial intelligence or machine learning technologies. If you enable or use any of the features or functionalities within Twilio Video that Twilio identifies as using artificial intelligence or machine learning technology, you acknowledge and agree that your use of these features or functionalities is subject to the terms of the Predictive and Generative AI/ML Features Addendum.
Real-time Transcriptions converts speech from any participant in a Video Room into text and sends that text to the Video Client SDKs (JavaScript, iOS, and Android). Your application can render the text in any style and format. Twilio supports multiple speech models, so you can choose the one that best fits your use case.
Your app can implement transcriptions in two ways:
- Start when your app creates a Video Room.
- Start, stop, or restart on demand while the Room is active.
You turn on Real-time Transcriptions at the Video Room level, so Twilio transcribes every participant's speech. You configure the spoken language and speech model, and those settings remain in effect until the Room ends.
While active, Twilio delivers the transcribed text, along with the Participant SID, to every participant in the Room.
If you turn on partial results, the transcription engine delivers interim results so that your app can refresh its interface in near real-time.
You can also set a default configuration in the Twilio Console.
The Transcriptions resource is a subresource of the Room resource and represents a Room's transcript.
/v1/Rooms/{RoomNameOrSid}/Transcriptions/{TranscriptionTtid}
| Property | Type | Description |
|---|---|---|
| `ttid` | TTID | The Twilio Type ID of the Transcriptions resource. It is assigned at creation time using the following format: `video_transcriptions_{uuidv7-special-encoded}`. |
| `room_sid` | SID&lt;RM&gt; | The SID of the Room instance that is the parent of the Transcriptions resource. |
| `account_sid` | SID&lt;AC&gt; | The SID of the Account that owns the Transcriptions resource. |
| `status` | enum&lt;string&gt; | The status of the Transcriptions resource. It can be: `started`, `stopped`, or `failed`. The resource is created with status `started` by default. |
| `configuration` | object&lt;string-map&gt; | Key-value map with the configuration parameters applied to audio track transcriptions. For the list of properties, see the Parameters for the configuration object section. |
| `date_created` | string&lt;date-time&gt; | The date and time in UTC when the resource was created, specified in ISO 8601 format. |
| `date_updated` | string&lt;date-time&gt; | The date and time in UTC when the resource was last updated, specified in ISO 8601 format. Null if the resource hasn't been updated since creation. |
| `start_time` | string&lt;date-time&gt; | The date and time in UTC when the resource entered the `started` state with the first participant in the Room, specified in ISO 8601 format. |
| `end_time` | string&lt;date-time&gt; | The date and time in UTC when transcription processing paused in the Room, specified in ISO 8601 format. This happens when the resource is set to `stopped` or the last participant leaves the Room. |
| `duration` | integer | The cumulative time in seconds that the Transcriptions resource has been in the `started` state with at least one participant in the Room. This is independent of whether audio tracks are published or muted. |
| `url` | string&lt;uri&gt; | The absolute URL of the resource. |
If Twilio detects an internal error that prevents it from generating transcriptions, the Transcriptions resource changes its status to `failed`, and the Twilio Console receives a debug event with the details of the failure. After a failure, you can't restart that resource.
Parameters for the configuration object
| Name | Type | Necessity | Default | Description |
|---|---|---|---|---|
| `transcriptionEngine` | string | Optional | `"google"` | The transcription engine that Twilio uses. For the possible values, see the transcription engine table. |
| `speechModel` | string | Optional | `"telephony"` | The provider-supported recognition model that the transcription engine uses. For the possible values, see the speech model table. |
| `languageCode` | string | Optional | `"en-US"` | The language code that the transcription engine uses, specified in BCP-47 format. This attribute ensures that the transcription engine understands and processes the spoken language. |
| `partialResults` | Boolean | Optional | `false` | Whether to send partial results. When `true`, the transcription engine sends interim results as the transcription progresses, providing more immediate feedback before the final result is available. |
| `profanityFilter` | Boolean | Optional | `true` | Whether the server tries to filter profanities, replacing all but the initial character in each filtered word with asterisks. Google provides this feature. |
| `hints` | string | Optional | None | A list of words or phrases that the transcription provider can expect to encounter during a transcription. Using the `hints` attribute can improve the provider's recognition of words or phrases expected during the video call. You can provide up to 500 words or phrases, each entry separated by a comma and each up to 100 characters. Separate the words in a phrase with spaces. |
| `enableAutomaticPunctuation` | Boolean | Optional | `true` | Whether the provider adds punctuation to the transcribed text. When enabled, the transcription engine inserts punctuation marks such as periods, commas, and question marks, improving the readability of the transcribed text. |
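For illustration, here's a sketch of a configuration object for long-form English conversations with partial results and hints. The keys are the documented parameters above; the values are example choices, not recommendations, and the REST API expects the object serialized as a JSON string:

```javascript
// Illustrative transcription configuration (example values, not defaults).
const transcriptionsConfiguration = {
  transcriptionEngine: 'google',
  speechModel: 'long',              // long-form conversational audio
  languageCode: 'en-US',
  partialResults: true,             // stream interim results for live captions
  profanityFilter: true,
  enableAutomaticPunctuation: true,
  // Comma-separated list: up to 500 entries, each up to 100 characters.
  hints: 'Twilio, Video Room, WebRTC'
};

// Pass the object as a JSON string, for example in the
// TranscriptionsConfiguration or Configuration REST parameter.
console.log(JSON.stringify(transcriptionsConfiguration));
```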
The following table lists the possible values for the transcriptionEngine and the associated speechModel properties.
| Transcription engine | Speech model | Use case | Example |
|---|---|---|---|
| `google` | `telephony` | Use this model for telephone call audio. | |
| `google` | `medical_conversation` | Use this model for conversations between a medical provider and a patient. | |
| `google` | `long` | Use this model for any type of long-form content. | Media, spontaneous speech, and conversations |
| `google` | `short` | Use this model for short utterances that are a few seconds in length. Consider using this model instead of the command and search model. | Commands or other single, short, directed speech |
| `google` | `telephony_short` | Use this model for short or even single-word utterances in audio that originated from a phone call. | Customer service, teleconferencing, and kiosk applications |
| `google` | `medical_dictation` | Use this model to transcribe notes dictated by a medical professional. | |
| `google` | `chirp_telephony` | Use this model for telephone call audio in multiple languages. It uses the Google Universal Speech Model (USM). | |
| `google` | `chirp` | Use this model for audio content in multiple languages. It uses the Google Universal Speech Model (USM). | |
| `deepgram` | `nova-3` | Recommended for meetings, captioning, and noisy or far-field audio. | |
| `deepgram` | `nova-2` | Recommended for languages that `nova-3` doesn't support. | |
Info
- The `google` transcription engine corresponds to the Google Speech-to-Text V2 API.
- Speech models support a limited range of languages. For valid combinations, see the provider documentation.
To create a Video Room with Real-time Transcriptions enabled, add the following two parameters to the Room POST request:
| Parameter | Type | Description |
|---|---|---|
| `TranscribeParticipantsOnConnect` | Boolean | Whether to start Real-time Transcriptions when Participants connect. Default is `false`. |
| `TranscriptionsConfiguration` | object | Key-value configuration settings for the transcription engine. To learn more, see Parameters for the configuration object. |
To turn on transcriptions for the Video Room, set the `TranscribeParticipantsOnConnect` parameter to `true`.
```bash
curl -X POST "https://video.twilio.com/v1/Rooms" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode 'TranscriptionsConfiguration={"languageCode": "en-US", "partialResults": true}' \
  --data-urlencode "TranscribeParticipantsOnConnect=true" \
  --data-urlencode "UniqueName=test"
```
1{2"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",3"audio_only": false,4"date_created": "2025-06-12T15:42:32Z",5"date_updated": "2025-06-12T15:42:32Z",6"duration": null,7"empty_room_timeout": 5,8"enable_turn": true,9"end_time": null,10"large_room": false,11"links": {12"participants": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Participants",13"recording_rules": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/RecordingRules",14"recordings": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Recordings",15"transcriptions": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions"16},17"max_concurrent_published_tracks": 170,18"max_participant_duration": 14400,19"max_participants": 50,20"media_region": "us1",21"record_participants_on_connect": false,22"sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",23"status": "in-progress",24"status_callback": null,25"status_callback_method": "POST",26"type": "group",27"unique_name": "test",28"unused_room_timeout": 5,29"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",30"video_codecs": [31"VP8",32"H264"33]34}
To create a Transcription, send a POST request to the following URI:
/v1/Rooms/{RoomNameOrSid}/Transcriptions
| Parameter | Type | Description |
|---|---|---|
| `RoomSid` | SID&lt;RM&gt; | The SID of the parent Room in which you create the Transcriptions resource. |
| Parameter | Type | Description |
|---|---|---|
| `Configuration` | object&lt;string-map&gt; | Object with key-value configurations. |
```bash
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode 'Configuration={"languageCode": "en-US", "partialResults": true, "profanityFilter": true, "speechModel": "long"}'
```
1{2"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",3"configuration": {4"languageCode": "EN-us",5"partialResults": "true",6"profanityFilter": "true",7"speechModel": "long"8},9"date_created": "2025-07-22T14:14:35Z",10"date_updated": null,11"duration": null,12"end_time": null,13"room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",14"start_time": null,15"status": "started",16"ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",17"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"18}
To stop a Transcriptions resource, send a POST request to the following resource instance URI:
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
To stop transcriptions on a Room, set the Status parameter to stopped.
| Parameter | Type | Description |
|---|---|---|
| `ttid` | TTID | The TTID of the Transcriptions resource to update. |
| `RoomSid` | SID&lt;RM&gt; | The SID of the parent Room of the Transcriptions resource you update. |
| Parameter | Type | Description |
|---|---|---|
| `Status` | enum&lt;string&gt; | The new status of the Transcriptions resource. Can be: `started` or `stopped`. There is no state transition if the resource's status already has the same value or if the parameter is missing. |
```bash
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode "Status=stopped"
```
1{2"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",3"configuration": {4"languageCode": "EN-us",5"partialResults": "true"6},7"date_created": "2025-07-22T12:55:30Z",8"date_updated": "2025-07-22T12:56:02Z",9"duration": null,10"end_time": null,11"room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",12"start_time": null,13"status": "stopped",14"ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",15"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"16}
To restart a stopped Transcriptions resource in a Room, send a POST request to the following resource instance URI:
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
To restart transcription, set the Status to started.
| Parameter | Type | Description |
|---|---|---|
| `ttid` | TTID | The TTID of the Transcriptions resource to update. The current implementation supports a single Transcriptions resource, but this might change in future implementations. |
| `RoomSid` | SID&lt;RM&gt; | The SID of the parent Room of the Transcriptions resource you update. |
| Parameter | Type | Description |
|---|---|---|
| `Status` | enum&lt;string&gt; | The status of the Transcriptions resource. To restart transcriptions, set to `started`. If the parameter has the same value as the current status or is missing, no state transition occurs. |
```bash
curl -X POST "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN \
  --data-urlencode "Status=started"
```
1{2"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",3"configuration": {4"languageCode": "EN-us",5"partialResults": "true"6},7"date_created": "2025-07-22T12:57:24Z",8"date_updated": null,9"duration": null,10"end_time": null,11"room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",12"start_time": null,13"status": "started",14"ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",15"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"16}
To fetch a list of Transcriptions resources in a Room, send a GET request to the following resource URI:
/v1/Rooms/{RoomSid}/Transcriptions
Real-time Transcriptions supports only a single instance of the Transcriptions resource per Room, so the list only has a single item.
| Parameter | Type | Description |
|---|---|---|
| `RoomSid` | SID&lt;RM&gt; | The SID of the parent Room that contains the Transcriptions resources. |
```bash
curl -X GET "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
```
1{2"meta": {3"first_page_url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions?PageSize=50&Page=0",4"key": "transcriptions",5"next_page_url": null,6"page": 0,7"page_size": 50,8"previous_page_url": null,9"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions?PageSize=50&Page=0"10},11"transcriptions": [12{13"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",14"configuration": {},15"date_created": "2025-07-22T11:05:41Z",16"date_updated": null,17"duration": null,18"end_time": null,19"room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",20"start_time": null,21"status": "started",22"ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",23"url": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"24}25]26}
To fetch a Transcriptions resource in a Room, send a GET request to the following resource instance URI:
/v1/Rooms/{RoomSid}/Transcriptions/{ttid}
| Parameter | Type | Description |
|---|---|---|
| `ttid` | TTID | The TTID of the Transcriptions resource being requested. |
| `RoomSid` | SID&lt;RM&gt; | The SID of the parent Room from which you fetch the Transcriptions resource. |
```bash
curl -X GET "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
  -u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
```
1{2"account_sid": "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",3"configuration": {4"LanguageCode": "EN-us",5"ProfanityFilter": "true"6},7"date_created": null,8"date_updated": null,9"duration": null,10"end_time": null,11"links": {12"transcriptions": "https://video.twilio.com/v1/Rooms/RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Transcriptions/video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"13},14"room_sid": "RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",15"start_time": null,16"status": "created",17"ttid": "video_extension_XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"18}
Twilio delivers transcribed text to the client SDKs through callback events.
The schema of the JSON delivery format contains a version number. Each event contains the transcription of a single utterance and details of the participant who generated the audio.
```yaml
properties:
  type:
    const: extension_transcriptions

  version:
    description: |
      Version of the transcriptions protocol used by this message. It is semver compliant.

  track:
    $ref: /Server/State/RemoteTrack
    description: |
      Audio track from which the transcription has been generated.

  participant:
    $ref: /Server/State/Participant
    description: |
      The participant who published the audio track from which the
      transcription has been generated.

  sequence_number:
    type: integer
    description: |
      Sequence number. Starts with one and increments monotonically. A sequence
      counter is defined for each track to allow the receiver to identify
      missing messages.

  timestamp:
    type: string
    description: |
      Absolute time from the real-time transcription. It is
      conformant with UTC ISO 8601.

  partial_results:
    type: boolean
    description: |
      Whether the transcription is a final or a partial result.

  stability:
    type: double
    description: |
      Indicates how likely it is that this partial result transcript won't be
      updated again. The range is from `0.0` (unstable) to `1.0` (stable).
      This field is only provided when `partialResults` is `true`.

  language_code:
    type: string
    description: |
      Language code of the transcribed text. It is conformant with BCP-47.

  transcription:
    type: string
    description: |
      Utterance transcription.
```
1{2"version": "1.0",3"language_code": "en-US",4"partial_results": false,5"participant": "PA00000000000000000000000000000000",6"sequence_number": 3,7"timestamp": "2025-01-01T12:00:00.000000000Z",8"track": "MT00000000000000000000000000000000",9"transcription": "This is a test",10"type": "extension_transcriptions"11}
When you set the `partialResults` parameter to `true`, the transcription engine provides a series of partial results as it determines the text corresponding to the spoken utterance.
The `stability` property indicates the probability that a partial result changes before the delivery of the final result. This value ranges from `0.0` (unstable) to `1.0` (stable). In general, treat partial results with stability less than `0.9` as preliminary and temporary. When building an app element that displays the transcribed text as captions or subtitles, filter out partial results with a stability value less than `0.9`. This avoids text flickering as the app receives partial results.
JavaScript
To turn on the flow of transcript events, set the `receiveTranscriptions` parameter in `ConnectOptions` to `true`. This parameter defaults to `false`. With Real-time Transcriptions enabled for the Room and `receiveTranscriptions` set to `true`, callback events containing the transcribed text start to flow.
```javascript
import { connect } from 'twilio-video';

const room = await connect(token, {
  name: 'my-room',
  receiveTranscriptions: true
});

room.on('transcription', (transcriptionEvent) => {
  console.log(`${transcriptionEvent.participant}: ${transcriptionEvent.transcription}`);
});
```
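Building on that example, the following sketch applies the stability filtering described in the partial results section above. The `partial_results`, `stability`, and `transcription` fields come from the delivery schema; `captionsEl` is a hypothetical DOM element, not part of the SDK:

```javascript
// Minimal sketch: show captions only for final or high-stability results.
// Assumes `room` was connected with receiveTranscriptions: true.
const captionsEl = document.getElementById('captions'); // hypothetical element

room.on('transcription', (event) => {
  // Skip interim results that are still likely to change, to avoid flicker.
  if (event.partial_results && event.stability < 0.9) {
    return;
  }
  captionsEl.textContent = event.transcription;
});
```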
iOS
To receive transcription events, set the `receiveTranscriptions` parameter in `TVIConnectOptions` to `true`. This parameter defaults to `false`. To fetch this value, use the `isReceiveTranscriptionsEnabled` getter.
With Real-time Transcriptions enabled for the Room and `receiveTranscriptions` set to `true`, the `transcriptionReceived(room:transcription:)` method in the `RoomDelegate` protocol delivers callback events containing the transcribed text.
```swift
let options = ConnectOptions(token: accessToken, block: { (builder) in
    builder.roomName = "test"
    builder.isReceiveTranscriptionsEnabled = true
})
```
Android
To receive transcription events, set the `receiveTranscriptions` parameter in `ConnectOptions` to `true`. This parameter defaults to `false`. To check the setting, call `isReceiveTranscriptionsEnabled()`.
With Real-time Transcriptions enabled for the Room and `receiveTranscriptions` set to `true`, the `onTranscription(@NonNull Room room, @NonNull JSONObject json)` method of the `Room.Listener` interface delivers callback events containing the transcribed text.
```java
ConnectOptions connectOptions = new ConnectOptions.Builder(accessToken)
    .receiveTranscriptions(true)
    .build();

Video.connect(context, connectOptions, roomListener);
```
To enable and configure Real-time Transcriptions in the Twilio Console, complete the following steps.
- Log in to the Twilio Console.
- Go to Video > Manage > Room Settings.
- Scroll to Realtime Transcriptions.
- Click Accept for the Predictive and Generative AI/ML Features Addendum.
- Click Enabled for the Automatically turn on Realtime Transcriptions by default in Rooms setting.
- Click Save.
AI Nutrition Facts
Real-time Transcriptions for Video uses third-party artificial intelligence and machine learning technologies.
To improve your understanding of how AI handles your data, Twilio's AI Nutrition Facts provide an overview of the AI feature you're using. The following Speech to Text Transcriptions Nutrition Facts label outlines the AI qualities of Real-time Transcriptions for Video.
AI Nutrition Facts
Speech to Text Transcriptions - Programmable Voice, Twilio Video, and Conversational Intelligence
- Description
- Generate speech to text voice transcriptions (real-time and post-call) in Programmable Voice, Twilio Video, and Conversational Intelligence.
- Privacy Ladder Level
- N/A
- Feature is Optional
- Yes
- Model Type
- Generative and Predictive - Automatic Speech Recognition
- Base Model
- Deepgram Speech-to-Text, Google Speech-to-Text, Amazon Transcribe
- Base Model Trained with Customer Data
- No
- Customer Data is Shared with Model Vendor
- No
- Training Data Anonymized
- N/A
- Data Deletion
- Yes
- Human in the Loop
- Yes
- Data Retention
- Until the customer deletes
- Logging & Auditing
- Yes
- Guardrails
- Yes
- Input/Output Consistency
- Yes
- Other Resources
- https://www.twilio.com/docs/conversational-intelligence
Trust Ingredients
Conversational Intelligence, Programmable Voice, and Twilio Video only use the default Base Model provided by the Model Vendor. The Base Model is not trained using customer data.
Base Model is not trained using any customer data.
Transcriptions are deleted by the customer using the Conversational Intelligence API or when a customer account is deprovisioned.
The customer views output in the Conversational Intelligence API or Transcript Viewer.
Compliance
The customer can listen to the input (recording) and view the output (transcript).
The customer is responsible for human review.
Learn more about this label at nutrition-facts.ai
- To use the Google `medical_conversation` model, set `enableAutomaticPunctuation` to `true`.
- When a Room reaches the `MaxParticipantDuration` time limit, Transcriptions stop. As a workaround, set the Room's `MaxParticipantDuration` parameter to a value that exceeds the expected lifetime of the Room; this value defaults to four hours. See the sketch after this list.
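Here's a minimal sketch of that workaround, assuming Node.js 18+ (for global `fetch`) and an ES module context (for top-level `await`); the Room name and the eight-hour value are illustrative:

```javascript
// Create a Room whose participant time limit exceeds its expected lifetime
// so that Real-time Transcriptions don't stop mid-session.
const auth = Buffer.from(
  `${process.env.TWILIO_ACCOUNT_SID}:${process.env.TWILIO_AUTH_TOKEN}`
).toString('base64');

const response = await fetch('https://video.twilio.com/v1/Rooms', {
  method: 'POST',
  headers: {
    'Authorization': `Basic ${auth}`,
    'Content-Type': 'application/x-www-form-urlencoded'
  },
  body: new URLSearchParams({
    UniqueName: 'all-day-workshop',           // illustrative Room name
    TranscribeParticipantsOnConnect: 'true',
    MaxParticipantDuration: '28800'           // 8 hours, instead of the 4-hour default
  })
});

console.log(await response.json());
```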