A Voice Intelligence Transcript resource represents a transcribed voice conversation. To transcribe a specific recording's audio, call the Create a new Voice Intelligence Transcript endpoint. You can transcribe recordings created by Twilio or recordings that are created or stored externally.
If automatic transcription is enabled, Twilio creates a Voice Intelligence Transcript resource whenever a Call within your Account has been recorded.
A Transcript resource contains links to the associated subresources:
Voice Intelligence supports various audio formats, each suited for different needs:
We recommend using dual-channel recordings to improve transcription accuracy, especially in scenarios requiring speaker differentiation.
The unique SID identifier of the Account.
^AC[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
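The SID patterns listed above can be checked on the client side before a request is sent. The following is a minimal sketch; `SID_PATTERNS` and `isValidSid` are hypothetical names for illustration and are not part of the Twilio helper library.

```javascript
// Patterns copied from the property descriptions above.
const SID_PATTERNS = {
  account: /^AC[0-9a-fA-F]{32}$/,
  service: /^GA[0-9a-fA-F]{32}$/,
  transcript: /^GT[0-9a-fA-F]{32}$/,
};

// Hypothetical helper: returns true when `value` matches the pattern
// documented for the given kind of SID.
function isValidSid(kind, value) {
  const pattern = SID_PATTERNS[kind];
  if (!pattern) {
    throw new Error(`Unknown SID kind: ${kind}`);
  }
  return typeof value === "string" && pattern.test(value);
}

console.log(isValidSid("transcript", "GT" + "a".repeat(32))); // true
console.log(isValidSid("service", "GT" + "a".repeat(32))); // false
```

A check like this only confirms the format of a SID. It can't tell whether the resource exists in your Account.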
The date that this Transcript was created, given in ISO 8601 format.
The date that this Transcript was updated, given in ISO 8601 format.
The Status of this Transcript. One of queued, in-progress, completed, failed, or canceled.
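Application code often needs to know whether a Transcript can still change state. The sketch below groups the documented status values. Treating completed, failed, and canceled as final is an assumption made for this example, and `isTerminalStatus` is a hypothetical helper name.

```javascript
// Status values as documented for the Transcript resource.
const TRANSCRIPT_STATUSES = ["queued", "in-progress", "completed", "failed", "canceled"];

// Assumption: these statuses are final and won't change again.
const TERMINAL_STATUSES = new Set(["completed", "failed", "canceled"]);

// Hypothetical helper: true when no further status change is expected.
function isTerminalStatus(status) {
  if (!TRANSCRIPT_STATUSES.includes(status)) {
    throw new Error(`Unknown Transcript status: ${status}`);
  }
  return TERMINAL_STATUSES.has(status);
}

console.log(isTerminalStatus("queued")); // false
console.log(isTerminalStatus("completed")); // true
```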
Data logging allows Twilio to improve the quality of its speech recognition and language understanding services by using customer data to refine, fine-tune, and evaluate machine learning models. Note: Data logging can't be activated via the API, only via www.twilio.com, because it requires additional consent.
The date that this Transcript's media was started, given in ISO 8601 format.
If the transcript has been redacted, a redacted alternative of the transcript will be available.
POST https://intelligence.twilio.com/v2/Transcripts
When you use automatic transcription, you don't need this API request to create new Voice Intelligence Transcripts.
application/x-www-form-urlencoded
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
Used to store client-provided metadata. Maximum of 64 double-byte UTF-8 characters.
The date that this Transcript's media was started, given in ISO 8601 format.
The Channel parameter object contains information about the recording. When creating a new Transcript resource, specify the recording you want to transcribe via the Channel parameter's media_properties object. You can also customize participant/channel labeling via the Channel object's participants array. See the "Specify participant information" section below for more information.
The table below describes the properties of the Channel parameter object, including the media_properties and participants fields.
Object representing the media channel. It has information about the source of the media and the participants information.
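For callers that don't use a helper library, the request body can be assembled by hand. The sketch below assumes that the Channel object is sent as a JSON string in the application/x-www-form-urlencoded body alongside ServiceSid. `buildCreateBody` is a hypothetical helper name, and the example URL is a placeholder.

```javascript
// Hypothetical helper: encodes the create-Transcript parameters as a
// form-urlencoded string. Assumption: Channel is serialized as JSON.
function buildCreateBody({ serviceSid, channel }) {
  const params = new URLSearchParams();
  params.set("ServiceSid", serviceSid);
  params.set("Channel", JSON.stringify(channel));
  return params.toString();
}

const body = buildCreateBody({
  serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  channel: {
    media_properties: { media_url: "https://example.com/your-recording.wav" },
  },
});

console.log(body);
```

The resulting string would be sent as the body of the POST request, authenticated with your Account SID and Auth Token.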
You can optionally provide a CustomerKey parameter to map a Transcript to an internal identifier known within your system. This unique identifier helps track the Transcript, and it's included in the webhook callback when the results for Transcripts and Operators are available. Note that CustomerKey doesn't replace the Transcript SID in Voice Intelligence API calls.
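The sketch below shows one way to attach a CustomerKey when building the parameters for a create request. It assumes that the Node.js helper library accepts the key as a `customerKey` option and that the 64-character limit listed in the request parameters applies to it. `buildCreateParams` and `recordingSid` are hypothetical names.

```javascript
// Assumption: the 64-character maximum from the request parameters
// applies to CustomerKey.
const MAX_CUSTOMER_KEY_LENGTH = 64;

// Hypothetical helper: builds the options object for transcripts.create().
function buildCreateParams({ serviceSid, recordingSid, customerKey }) {
  const params = {
    serviceSid,
    channel: { media_properties: { source_sid: recordingSid } },
  };
  if (customerKey !== undefined) {
    // String length counts UTF-16 code units (double-byte characters).
    if (customerKey.length > MAX_CUSTOMER_KEY_LENGTH) {
      throw new RangeError(
        `customerKey must be at most ${MAX_CUSTOMER_KEY_LENGTH} characters`
      );
    }
    params.customerKey = customerKey;
  }
  return params;
}

const params = buildCreateParams({
  serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  recordingSid: "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
  customerKey: "order-1234",
});

console.log(params.customerKey); // order-1234
```

The returned object could then be passed to `client.intelligence.v2.transcripts.create(params)`.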
To transcribe Recordings made via Twilio and stored within Twilio's infrastructure, provide the Recording SID in the Channel object's media_properties.source_sid property as shown below. REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX represents a Recording SID.
In this scenario, the Channel information appears as follows:
{
  "media_properties": {
    "source_sid": "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  }
}
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        source_sid: "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      },
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
To transcribe Twilio Recordings that are stored externally, use the MediaUrl parameter. The SourceSid parameter isn't supported for externally stored Twilio Recordings.
To transcribe a recording stored externally, for example, a recording stored in your own S3 bucket, provide the recording's URL in the Channel object's media_properties.media_url property.
The following limitations apply when transcribing an external recording (specified by a MediaUrl):
Authentication for MediaUrls isn't supported for external recordings. If you store the recordings on S3, use a presigned URL. If you store them on Azure Blob Storage, use a Shared Access Signature (SAS).
MediaUrls that respond with a non-200 HTTP status code result in a failed request.
Before Voice Intelligence can transcribe the audio of a Twilio Video recording, the recording needs additional processing to become compatible.
First, create a dual-channel audio recording by transcoding a separate audio-only composition for each participant in the Video Room.
curl -X POST "https://video.twilio.com/v1/Compositions" \
--data-urlencode "AudioSources=PAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
--data-urlencode "StatusCallback=https://www.example.com/callbacks" \
--data-urlencode "Format=mp4" \
--data-urlencode "RoomSid=RMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
-u $TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN
Next, download the media from these compositions and merge them into a single stereo audio file.
ffmpeg -i speaker1.mp4 -i speaker2.mp4 -filter_complex "[0:a][1:a]amerge=inputs=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac
If the recording duration differs between participants, the merged audio tracks can fall out of alignment. To avoid this, use ffmpeg to create a single stereo audio track with a delay that covers the difference in track length. For example, if one audio track lasts 63 seconds and the other 67 seconds, use ffmpeg to create a stereo file with four seconds of delay so that the track lengths match.
ffmpeg -i speaker1.wav -i speaker2.wav -filter_complex "aevalsrc=0:d=${second_to_delay}[s1];[s1][1:a]concat=n=2:v=0:a=1[ac2];[0:a]apad[ac1];[ac1][ac2]amerge=2[a]" -map "[a]" -f flac -bits_per_raw_sample 16 -ar 44100 output.flac
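The delay value in the command above is the difference between the two track lengths. The sketch below computes that difference and substitutes it into the same filter expression. `buildPaddedMergeFilter` is a hypothetical helper name, and which input should receive the delay depends on your recording, so treat the result as a starting point.

```javascript
// Hypothetical helper: computes the delay in seconds between two tracks
// and fills it into the filter_complex expression from the command above.
function buildPaddedMergeFilter(firstTrackSeconds, secondTrackSeconds) {
  for (const seconds of [firstTrackSeconds, secondTrackSeconds]) {
    if (!Number.isFinite(seconds) || seconds < 0) {
      throw new RangeError("Track durations must be non-negative numbers");
    }
  }
  const delaySeconds = Math.abs(firstTrackSeconds - secondTrackSeconds);
  const filter =
    `aevalsrc=0:d=${delaySeconds}[s1];` +
    "[s1][1:a]concat=n=2:v=0:a=1[ac2];" +
    "[0:a]apad[ac1];" +
    "[ac1][ac2]amerge=2[a]";
  return { delaySeconds, filter };
}

const merge = buildPaddedMergeFilter(63, 67);
console.log(merge.delaySeconds); // 4
console.log(merge.filter);
```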
Finally, use the Create a new Voice Intelligence Transcript endpoint with the Channel parameter's media_properties.media_url property set to a publicly accessible URL of the audio file.
Recordings must be publicly accessible during transcription. You can host them at a public URL or share them through a time-limited pre-signed URL. To share a recording on an existing AWS S3 bucket, read the "Sharing objects with pre-signed URLs" guide from AWS.
Twilio attempts to download an external recording for up to 10 minutes. After 10 minutes, the transcription fails.
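Because a MediaUrl that doesn't return HTTP 200 leads to a failed request, it can help to verify the URL before creating the Transcript. The sketch below is one possible pre-flight check, assuming Node.js 18 or later for the global `fetch`. `mediaUrlRespondsOk` is a hypothetical helper, and the optional `fetchImpl` argument exists only so the function can be exercised without network access.

```javascript
// Hypothetical pre-flight check: resolves to true when the media URL
// answers a GET request with HTTP 200.
async function mediaUrlRespondsOk(mediaUrl, fetchImpl = fetch) {
  const response = await fetchImpl(mediaUrl, { method: "GET" });
  // Only the status matters here, so stop downloading the body if possible.
  if (response.body && typeof response.body.cancel === "function") {
    await response.body.cancel();
  }
  return response.status === 200;
}
```

A GET request is used rather than HEAD because a pre-signed URL is typically valid only for the HTTP method it was signed for. A call might look like `await mediaUrlRespondsOk("https://example.com/your-recording.wav")`.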
You can't transcribe encrypted recordings.
Voice Intelligence doesn't perform speaker diarization on recordings, meaning it doesn't differentiate between different speakers. Additionally, using mono recordings can lead to reduced transcription accuracy. For improved transcription accuracy and participant differentiation, use dual-channel recordings.
Voice Intelligence supports both mono and stereo audio formats for the following media formats:
The following limits apply to the media files:
In this scenario, the Channel information appears as follows:
{
  "media_properties": {
    "media_url": "http://www.example.com/recording/call.wav"
  }
}
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        media_url: "https://example.com/your-recording.wav",
      },
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
If you include both MediaUrl and SourceSid in the Transcript creation request, Twilio uses the MediaUrl.
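The precedence rule above can be mirrored in client code, for example to log which source a request will effectively use. This is a minimal sketch; `effectiveMediaSource` is a hypothetical helper, and the property names follow the media_properties object used in this document.

```javascript
// Hypothetical helper: reports which media source applies, following the
// documented rule that media_url wins when both values are present.
function effectiveMediaSource(mediaProperties) {
  const { media_url: mediaUrl, source_sid: sourceSid } = mediaProperties;
  if (mediaUrl) {
    return { type: "media_url", value: mediaUrl, ignoredSourceSid: Boolean(sourceSid) };
  }
  if (sourceSid) {
    return { type: "source_sid", value: sourceSid, ignoredSourceSid: false };
  }
  throw new Error("Provide either media_url or source_sid");
}

const source = effectiveMediaSource({
  media_url: "https://example.com/your-recording.wav",
  source_sid: "REXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
});

console.log(source.type); // media_url
```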
By default, Voice Intelligence labels the left channel (channel one) as Agent and the right channel (channel two) as Customer. Depending on your call flow and the recorded call leg, this may not accurately reflect the participant/channel relationships on your recording. If needed, specify which participant is on a given channel via the Channel parameter's participants array.
If the default behavior doesn't align with your application's recording implementation, you can do one of the following:
Only two participants can be overridden in the Channel object of the Transcript resource.
The code sample below demonstrates an example request that overrides the default Voice Intelligence labels.
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function createTranscript() {
  const transcript = await client.intelligence.v2.transcripts.create({
    channel: {
      media_properties: {
        media_url: "https://example.com/your-recording",
      },
      participants: [
        {
          user_id: "id1",
          channel_participant: 1,
          media_participant_id: "+1555959545",
          email: "veronica.meyer@example.com",
          full_name: "Veronica Meyer",
          image_url:
            "https://images.unsplash.com/photo-1438761681033-6461ffad8d80",
          role: "Customer",
        },
        {
          user_id: "id2",
          channel_participant: 2,
          media_participant_id: "+1555959505",
          email: "lauryn.trujillo@example.com",
          full_name: "Lauryn Trujillo",
          image_url:
            "https://images.unsplash.com/photo-1554384645-13eab165c24b",
          role: "Agent",
        },
      ],
    },
    serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  });

  console.log(transcript.accountSid);
}

createTranscript();
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {
    "media_properties": {
      "media_url": "http://foobar.test/ClusterTests/call1.wav"
    }
  },
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": "aaaaaaaa",
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
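Since only two participants can be overridden, a small validation step before sending the request can catch mistakes early. The sketch below assumes that channel_participant takes the values 1 and 2, as in the request sample above. `buildParticipants` is a hypothetical helper and not part of the Twilio helper library.

```javascript
// Hypothetical helper: validates a participants array before it's placed
// in the Channel object. Assumption: channel_participant is 1 or 2.
function buildParticipants(participants) {
  if (participants.length > 2) {
    throw new RangeError("Only two participants can be overridden");
  }
  const seenChannels = new Set();
  return participants.map((participant) => {
    const channel = participant.channel_participant;
    if (channel !== 1 && channel !== 2) {
      throw new RangeError("channel_participant must be 1 or 2");
    }
    if (seenChannels.has(channel)) {
      throw new Error(`Channel ${channel} is assigned more than once`);
    }
    seenChannels.add(channel);
    return { ...participant }; // Copy so the caller's object isn't shared.
  });
}

const participants = buildParticipants([
  { channel_participant: 1, full_name: "Veronica Meyer", role: "Customer" },
  { channel_participant: 2, full_name: "Lauryn Trujillo", role: "Agent" },
]);

console.log(participants.map((p) => p.role)); // [ 'Customer', 'Agent' ]
```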
GET https://intelligence.twilio.com/v2/Transcripts/{Sid}
Use the webhook callback to know when a Create a new Voice Intelligence Transcript request has completed and when the results are available. This is preferable to polling the Fetch a Voice Intelligence Transcript endpoint.
The webhook callback URL can be configured on the Voice Intelligence Service's settings.
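Where a webhook isn't practical, for example in a short-lived script, polling remains a fallback. The sketch below polls until the Transcript reaches a final status. It assumes that completed, failed, and canceled are final, and the interval and attempt limit are arbitrary example values. `waitForTranscript` is a hypothetical helper that takes any function returning a Transcript.

```javascript
// Assumption: these statuses are final.
const FINISHED_STATUSES = new Set(["completed", "failed", "canceled"]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `fetchTranscript` repeatedly until the
// Transcript reaches a final status or the attempt limit is hit.
async function waitForTranscript(
  fetchTranscript,
  { intervalMs = 5000, maxAttempts = 60 } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const transcript = await fetchTranscript();
    if (FINISHED_STATUSES.has(transcript.status)) {
      return transcript;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Transcript not finished after ${maxAttempts} attempts`);
}
```

With the helper library, the fetch function could be `() => client.intelligence.v2.transcripts("GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa").fetch()`.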
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function fetchTranscript() {
  const transcript = await client.intelligence.v2
    .transcripts("GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
    .fetch();

  console.log(transcript.accountSid);
}

fetchTranscript();
{
  "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "date_created": "2010-08-31T20:36:28Z",
  "date_updated": "2010-08-31T20:36:28Z",
  "status": "queued",
  "channel": {},
  "data_logging": false,
  "language_code": "en-US",
  "media_start_time": null,
  "duration": 0,
  "customer_key": null,
  "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "redaction": true,
  "links": {
    "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
    "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
    "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
  }
}
GET https://intelligence.twilio.com/v2/Transcripts
The unique SID identifier of the Service.
^GA[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
How many resources to return in each list page. The default is 50, and the maximum is 1000.
Minimum: 1
Maximum: 1000
The page token. This is provided by the API.
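The paging limits above can be enforced when building a list request by hand. The sketch below assumes the query parameter names ServiceSid and PageSize, as they appear in the sample response URLs in this document, and a minimum page size of 1. `buildListUrl` is a hypothetical helper.

```javascript
const TRANSCRIPTS_URL = "https://intelligence.twilio.com/v2/Transcripts";

// Hypothetical helper: builds the list URL and enforces the documented
// page size range (assumed minimum 1, maximum 1000, default 50).
function buildListUrl({ serviceSid, pageSize = 50 } = {}) {
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > 1000) {
    throw new RangeError("pageSize must be an integer between 1 and 1000");
  }
  const url = new URL(TRANSCRIPTS_URL);
  if (serviceSid) {
    url.searchParams.set("ServiceSid", serviceSid);
  }
  url.searchParams.set("PageSize", String(pageSize));
  return url.toString();
}

console.log(buildListUrl({ serviceSid: "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", pageSize: 100 }));
```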
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function listTranscript() {
  const transcripts = await client.intelligence.v2.transcripts.list({
    limit: 20,
  });

  transcripts.forEach((t) => console.log(t.accountSid));
}

listTranscript();
{
  "transcripts": [
    {
      "account_sid": "ACaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "service_sid": "GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "sid": "GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "date_created": "2010-08-31T20:36:28Z",
      "date_updated": "2010-08-31T20:36:28Z",
      "status": "queued",
      "channel": {},
      "data_logging": false,
      "language_code": "en-US",
      "media_start_time": null,
      "duration": 0,
      "customer_key": null,
      "url": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
      "redaction": true,
      "links": {
        "sentences": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Sentences",
        "media": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/Media",
        "operator_results": "https://intelligence.twilio.com/v2/Transcripts/GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/OperatorResults"
      }
    }
  ],
  "meta": {
    "key": "transcripts",
    "page": 0,
    "page_size": 50,
    "first_page_url": "https://intelligence.twilio.com/v2/Transcripts?LanguageCode=en-US&SourceSid=REaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&ServiceSid=GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&AfterDateCreated=2019-11-22T23%3A46%3A00Z&PageSize=50&Page=0",
    "next_page_url": null,
    "previous_page_url": null,
    "url": "https://intelligence.twilio.com/v2/Transcripts?LanguageCode=en-US&SourceSid=REaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&ServiceSid=GAaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&AfterDateCreated=2019-11-22T23%3A46%3A00Z&PageSize=50&Page=0"
  }
}
DELETE https://intelligence.twilio.com/v2/Transcripts/{Sid}
A 34 character string that uniquely identifies this Transcript.
^GT[0-9a-fA-F]{32}$
Min length: 34
Max length: 34
// Download the helper library from https://www.twilio.com/docs/node/install
const twilio = require("twilio"); // Or, for ESM: import twilio from "twilio";

// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = twilio(accountSid, authToken);

async function deleteTranscript() {
  await client.intelligence.v2
    .transcripts("GTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
    .remove();
}

deleteTranscript();