Twilio Changelog | Jul. 01, 2025
Twilio Real-Time Transcriptions now generally available
TL;DR: Twilio Real-Time Transcriptions has reached General Availability! You can now capture what customers say when talking to human agents or IVRs, or automate customer self-service interactions by sending transcribed text to LLMs and AI agents for them to respond, using just a simple <Start><Transcription> TwiML instruction or API call.
We are excited to announce that Twilio Real-Time Transcriptions has reached General Availability.
What is Twilio Real-Time Transcriptions?
Twilio Real-Time Transcriptions lets you transcribe live calls in real time. When Twilio executes the <Start><Transcription> instruction during a call, the Twilio platform forks the raw audio stream to the developer’s choice of speech-to-text transcription engine, which streams back results for each phrase the caller utters. Developers can receive the stream of speech recognition results in their downstream app through Twilio Programmable Voice in one of two ways. The first is webhooks, where results are sent to a statusCallback URL configured by the developer (GA as of today). The second is a persisted transcript resource on the Twilio platform, which can also be used with Twilio’s Conversational Intelligence capabilities to analyze the transcript after the call (persisted transcripts and the Conversational Intelligence integration with Real-Time Transcriptions are in Public Beta).
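As a rough illustration of the webhook option described above, the following Python sketch builds a TwiML document containing <Start><Transcription> using only the standard library. The attribute names statusCallbackUrl and transcriptionEngine, the engine value, the example URL, and the follow-up <Say> verb are assumptions made for illustration; check the TwiML documentation linked under More Information for the supported attributes and values.

```python
import xml.etree.ElementTree as ET


def build_transcription_twiml(callback_url: str, engine: str) -> str:
    """Return a TwiML string that starts a real-time transcription."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    # Assumed attribute names; verify against the <Transcription> TwiML docs.
    ET.SubElement(
        start,
        "Transcription",
        {"statusCallbackUrl": callback_url, "transcriptionEngine": engine},
    )
    # Hypothetical follow-up verb so the call continues while audio is forked.
    say = ET.SubElement(response, "Say")
    say.text = "This call may be transcribed."
    return ET.tostring(response, encoding="unicode")


# Placeholder URL and engine value, for illustration only.
twiml = build_transcription_twiml("https://example.com/transcripts", "deepgram")
print(twiml)
```

An application would return a document like this from the webhook that handles the incoming call; the transcription then runs alongside whatever verbs follow <Start>.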
What’s New with GA?
With the GA of Real-Time Transcriptions using webhooks, we’ve added Deepgram as a second speech-to-text provider option, giving developers a wider and more up-to-date choice of modern speech models for transcribing customer speech more accurately.
Additionally, as part of GA, Real-Time Transcriptions using webhooks is now a HIPAA Eligible Service and is PCI-compliant, to safeguard customer interactions involving health information and credit card transactions in sessions that the Twilio platform transcribes.
Customer benefits
With the streaming speech recognition capabilities of <Start><Transcription>, businesses can capture the full text of what all their customers are saying – whether to a human agent or an automated self-service AI agent or LLM – for doing any of the following (and more):
Capturing crucial customer conversations, and adding that data to a caller’s customer record, be that in a CRM or another application/system built by the developer.
Analyzing caller-agent interactions for near-real-time escalation to supervisors, prompting for upsells, or taking other interventional or incremental steps with the customer while they are still on the phone.
Sending the caller’s transcribed speech to an AI agent or LLM, then prompting a human agent with recommended actions or requested product information based on what the caller has said.
Automating customer data collection via programmable outbound calling applications, for follow-up, post-service, or post-care surveys, etc.
Twilio Real-Time Transcriptions allows developers to programmatically capture customer speech data for each and every call (instead of just having the data for an ad hoc sampling of calls), create a repository of structured data for those voice conversations with customers, and easily and cost-effectively stream the speech results to downstream applications during calls with customers.
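To make the downstream side more concrete, here is a minimal Python sketch of how an application might parse one transcription webhook request and flag a call for supervisor escalation. The form field names (TranscriptionEvent, TranscriptionData, Track, Final), the JSON shape of TranscriptionData, and the keyword list are assumptions for illustration, not a statement of the actual payload; consult the documentation linked below for the real callback parameters.

```python
import json
from urllib.parse import parse_qs, urlencode

# Hypothetical keywords that would trigger a supervisor escalation.
ESCALATION_TERMS = ("cancel", "supervisor", "complaint")


def parse_transcription_callback(body: str) -> dict:
    """Parse a form-encoded webhook body into a small summary dict."""
    # parse_qs returns a list per key; keep only the first value.
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    # Assumed: TranscriptionData is a JSON string with a "transcript" key.
    data = json.loads(fields.get("TranscriptionData", "{}"))
    transcript = data.get("transcript", "")
    return {
        "event": fields.get("TranscriptionEvent", ""),
        "track": fields.get("Track", ""),
        "final": fields.get("Final", "").lower() == "true",
        "transcript": transcript,
        "escalate": any(t in transcript.lower() for t in ESCALATION_TERMS),
    }


# Simulated request body, built locally for illustration only.
sample_body = urlencode({
    "TranscriptionEvent": "transcription-content",
    "Track": "inbound_track",
    "Final": "true",
    "TranscriptionData": json.dumps({"transcript": "I want to cancel my plan"}),
})
result = parse_transcription_callback(sample_body)
print(result)
```

In a real deployment this function would sit behind the HTTP endpoint configured as the callback URL, and the resulting record could be written to a CRM or pushed to a supervisor dashboard.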
More Information:
https://www.twilio.com/en-us/speech-recognition
https://www.twilio.com/en-us/voice/pricing/us -- see “Conversational Intelligence - Transcription, Streaming (Real-Time) Transcription”
https://www.twilio.com/docs/voice/twiml/transcription
https://www.twilio.com/docs/voice/api/realtime-transcription-resource