To collect input during a call, use the <Gather> verb in the TwiML language. This input could include speech, digits pressed on the keypad, or both.
Between March and June 2025, Twilio is updating the speech models for <Gather> speech-to-text (STT) in waves. If you don't want to change your existing <Gather> speech models, contact support. This update offers you three choices for STT:
If you don't select a specific STT provider and speech model, Twilio takes the following actions:
If you need pricing information, consult Programmable Voice Pricing.
To implement <Gather>, you can try either of the following.
Send a plain TwiML document to Twilio using curl or an API application.
At a minimum, nest the <Gather> verb inside the <Response> in the TwiML document. The following example shows this TwiML document.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather/>
    </Response>

Add a Twilio helper library to your programming language of choice and generate TwiML from your web application.

    const VoiceResponse = require('twilio').twiml.VoiceResponse;

    const response = new VoiceResponse();
    response.gather();

    console.log(response.toString());

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- page located at http://example.com/simple_gather.xml -->
    <Response>
      <Gather/>
    </Response>

When Twilio executes the instructions in the preceding TwiML document, it performs the <Gather> using the default attribute values.
Twilio pauses and waits for the caller to enter digits on their keypad. At this point, the caller can make one of two choices:
Choice | Caller action | Twilio action |
---|---|---|
1 | Enter digits then the # symbol on the keypad. | Twilio sends these digits as a parameter of a POST request to the URL that hosts this <Gather> TwiML. |
2 | Do nothing and wait for 5 seconds to pass. | The request includes no more verbs, so Twilio ends the call. |
The generic <Gather> TwiML document lacks some capabilities. To expand on what <Gather> can do, you need to add attributes, nest other verbs, or both.
To learn more about the <Gather> verb, consult its list of attributes.
The <Gather> verb supports the following attributes. Though this verb requires none of these attributes, including the action attribute prevents undesired looping behavior. A <Gather> tag can include zero or more attributes.
Attribute name | Accepted values | Default value |
---|---|---|
action | URL (relative or absolute) | current document URL |
actionOnEmptyResult | true , false | false |
enhanced | true , false Limited to the phone_call model in Google STT V1. | false |
finishOnKey | 0 -9 , # , * , and '' (the empty string) | # |
hints | "words, phrases that have many words". Supported class tokens or keywords vary according to your provider and version. | none |
input | dtmf , speech , dtmf speech | dtmf |
language | Supported languages depend on your chosen speechModel: Google STT V1, Google STT V2, or Deepgram. | en-US |
method | GET , POST | POST |
numDigits | positive integer | unlimited |
partialResultCallback | URL (relative or absolute) | none |
partialResultCallbackMethod | GET , POST | POST |
profanityFilter | true , false | true |
speechModel | Generic: default , numbers_and_commands , phone_call , experimental_conversations , experimental_utterances Google STT V2: googlev2_long , googlev2_short , googlev2_telephony , googlev2_telephony_short Deepgram: Any | default |
speechTimeout | positive integer or auto | timeout value |
timeout | positive integer | 5 |
The action attribute specifies the URL to which Twilio sends its HTTP request once the <Gather> ends.
Necessity | Accepted values | Default value |
---|---|---|
Recommended | Relative or absolute URL | current document URL |
After the caller finishes entering digits or reaches the timeout, Twilio sends a POST HTTP request to the specified URL. If you omit this attribute, Twilio calls the TwiML document making the request. This might lead to unwanted looping behavior.
This request includes the caller's data and Twilio's standard request parameters.
Twilio might add some extra parameters to its request after the <Gather> ends:
- A Digits parameter containing the numbers your caller entered.
- SpeechResult and Confidence parameters:
  - SpeechResult contains the transcribed result of your caller's speech.
  - Confidence contains a confidence score between 0.0 and 1.0. A higher confidence score means the potential for greater accuracy of the transcription.

Note: Your code shouldn't expect Confidence as a required field. Twilio doesn't guarantee its accuracy or presence in any of the results.
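As a rough illustration of treating Confidence as optional, the following sketch normalizes the parameters that arrive at an action URL. The function name extractGatherResult and the shape of its return value are hypothetical, and the sketch assumes your web framework has already parsed the request into a plain object.

```javascript
// Hypothetical helper: normalizes the parameters Twilio sends to an action URL.
// `params` is assumed to be the already-parsed request body or query string.
function extractGatherResult(params) {
  const result = { kind: 'none', value: null, confidence: null };

  if (typeof params.Digits === 'string' && params.Digits !== '') {
    result.kind = 'dtmf';
    result.value = params.Digits;
  } else if (typeof params.SpeechResult === 'string' && params.SpeechResult !== '') {
    result.kind = 'speech';
    result.value = params.SpeechResult;
    // Confidence is optional: keep it only when it is present and numeric.
    const score = Number.parseFloat(params.Confidence);
    if (Number.isFinite(score)) {
      result.confidence = score;
    }
  }

  return result;
}

console.log(extractGatherResult({ Digits: '12345' }));
console.log(extractGatherResult({ SpeechResult: 'billing', Confidence: '0.92' }));
console.log(extractGatherResult({ SpeechResult: 'billing' }));
```

Because a missing Confidence value becomes null instead of raising an error, the rest of the application can decide how to handle results that carry no score.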
After <Gather> ends, Twilio sends its request to your action URL. The current call continues using the TwiML document returned from the action URL. Twilio won't use any TwiML verbs after your <Gather> in your original TwiML.
If the caller didn't enter any digits or speech, call flow within the original TwiML document continues.
If you started or updated a <Call> that included a twiml parameter, the action URLs for <Record>, <Gather>, and <Pay> must be absolute.
The Call Resource API Docs have language-specific examples of creating and updating Calls with TwiML:
- To create a call with the twiml parameter, consult Create a Call Resource.
- To update a call with the twiml parameter, consult Update a Call Resource.

You are hosting the following TwiML document at http://example.com/complex_gather.xml.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather>
        <Say>
          Please enter your account number,
          followed by the pound sign
        </Say>
      </Gather>
      <Say>We didn't receive any input. Goodbye!</Say>
    </Response>

This TwiML document can follow one of three scenarios:
Scenario | Caller actions | Twilio actions |
---|---|---|
1 | Doesn't press the keypad or say anything for five seconds, or enters '#' before entering any other digits | Says "We didn't receive any input. Goodbye!" |
2 | Enters a digit while the call says "Please enter your account number..." | <Gather> verb stops speaking. It waits for the caller's action. |
3 | Enters 12345 then presses #, or allows 5 seconds to pass | Submits the digits and request attribute values to the URL of this TwiML document (http://example.com/complex_gather.xml). Twilio fetches this TwiML document again and executes it. The caller gets stuck in a loop. |
To avoid scenario 3, point your action URL to a different URL that hosts a different TwiML document. This new TwiML document handles the remainder of the call.
The following code example adds the action and method attributes to the previous TwiML document.

    const VoiceResponse = require('twilio').twiml.VoiceResponse;

    const response = new VoiceResponse();
    const gather = response.gather({
        action: '/process_gather.php',
        method: 'GET'
    });
    gather.say('Please enter your account number,\nfollowed by the pound sign');
    response.say('We didn\'t receive any input. Goodbye!');

    console.log(response.toString());

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- page located at http://example.com/complex_gather.xml -->
    <Response>
      <Gather action="/process_gather.php" method="GET">
        <Say>
          Please enter your account number,
          followed by the pound sign
        </Say>
      </Gather>
      <Say>We didn't receive any input. Goodbye!</Say>
    </Response>

When the caller enters their input, Twilio sends the request parameters, including the digits, to the /process_gather.php URL.
You can have Twilio read back this input to the caller. To do so, your code in /process_gather.php should resemble the following.

    <?php
    // page located at http://yourserver/process_gather.php
    // Escape the caller's input before placing it in the XML response.
    $digits = htmlspecialchars($_REQUEST['Digits'] ?? '', ENT_XML1, 'UTF-8');
    echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
    echo "<Response><Say>You entered " . $digits . "</Say></Response>";
    ?>

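For applications written in Node.js instead of PHP, the same read-back step might look like the following sketch. The names escapeXml and buildReadbackTwiml are hypothetical, the fallback message is an assumption, and the sketch builds the TwiML as a plain string so that it runs without any dependencies. It assumes your framework has already parsed the request parameters into an object.

```javascript
// Escapes the five XML special characters so caller input can't break the TwiML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical handler body: builds the TwiML that reads the digits back.
function buildReadbackTwiml(params) {
  const digits = params.Digits || '';
  const message = digits === ''
    ? 'We did not receive any digits.' // assumed fallback wording
    : `You entered ${escapeXml(digits)}`;
  return '<?xml version="1.0" encoding="UTF-8"?>\n' +
    `<Response><Say>${message}</Say></Response>`;
}

console.log(buildReadbackTwiml({ Digits: '12345' }));
```

In a real application you would return this string from the route that serves your action URL, with a Content-Type of text/xml.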
The actionOnEmptyResult attribute specifies that <Gather> must send a webhook to the action URL with or without DTMF input. By default, if <Gather> times out while waiting for DTMF input, it continues to the next TwiML instruction.
Necessity | Accepted values | Default value |
---|---|---|
Optional | true , false | false |
In the following TwiML, when <Gather> times out, Twilio executes the second <Say> instruction.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather>
        <Say>
          Please enter your account number,
          followed by the pound sign
        </Say>
      </Gather>
      <Say>We didn't receive any input. Goodbye!</Say>
    </Response>

To force <Gather> to send a webhook to the action URL, write a TwiML document that resembles the following.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather actionOnEmptyResult="true" action="/gather-action">
        <Say>
          Please enter your account number,
          followed by the pound sign
        </Say>
      </Gather>
    </Response>

The finishOnKey attribute specifies the value that your caller presses to submit their digits.
Necessity | Accepted values | Default value |
---|---|---|
Optional | # , * , single digits 0 -9 , an empty string ('' ) | # , the hash or pound sign |
If you set this attribute to an empty string, <Gather> captures all caller input. After the call reaches its timeout, Twilio submits the caller's digits to the action URL.
Consider the following sequence:
1. You set the finishOnKey attribute to a value of #.
2. The caller enters 1234#.
3. When the caller presses #, Twilio stops waiting for more input.
4. Twilio submits Digits=1234 to your action URL. The request doesn't include the #.

Consider the following TwiML document. In this example, finishOnKey never works, because you never inform the caller to press it.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather input="speech dtmf" finishOnKey="#" timeout="5">
        <Say>
          Please say something or press * to access the main menu
        </Say>
      </Gather>
      <Say>We didn't receive any input. Goodbye!</Say>
    </Response>

The hints attribute specifies a list of words or phrases that Twilio should expect during recognition.
Adding hints to your <Gather> improves Twilio's recognition.
Necessity | Accepted values | Default value |
---|---|---|
Optional | comma-separated list of up to 500 entries | none |
Entries contain either single words or phrases. Each entry can be up to 100 characters in length. Separate each word in a phrase with a space.
The following example includes four entries as two single words and two phrases.
hints="this is a phrase I expect to hear, keyword, product name, name"
The class tokens and keywords that you can include in hints depend on your STT provider and version.
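If you assemble hints from a list in code, a small helper can enforce the limits stated above (up to 500 entries, each up to 100 characters) before the value reaches your TwiML. The following is a minimal sketch; the function name buildHints and its error messages are hypothetical, and it doesn't validate provider-specific class tokens.

```javascript
const MAX_ENTRIES = 500;      // entry limit stated in this section
const MAX_ENTRY_LENGTH = 100; // per-entry character limit stated in this section

// Hypothetical helper: joins words and phrases into a hints attribute value.
function buildHints(entries) {
  // Trim each entry and collapse runs of whitespace inside phrases.
  const cleaned = entries
    .map((entry) => entry.trim().replace(/\s+/g, ' '))
    .filter((entry) => entry !== '');

  if (cleaned.length > MAX_ENTRIES) {
    throw new RangeError(`hints accepts at most ${MAX_ENTRIES} entries`);
  }
  for (const entry of cleaned) {
    if (entry.length > MAX_ENTRY_LENGTH) {
      throw new RangeError(`hint entry exceeds ${MAX_ENTRY_LENGTH} characters: ${entry}`);
    }
    if (entry.includes(',')) {
      // A comma inside an entry would be read as an entry separator.
      throw new Error(`hint entry must not contain a comma: ${entry}`);
    }
  }
  return cleaned.join(', ');
}

console.log(buildHints(['this is a phrase I expect to hear', 'keyword', 'product name', 'name']));
```

You could pass the returned string as the hints option when you generate the <Gather> verb.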
The input attribute specifies which types of input Twilio accepts. The types include dual-tone multi-frequency (DTMF), speech, or both.
Necessity | Accepted values | Default value |
---|---|---|
Optional | dtmf , speech , dtmf speech | dtmf |
- If you set input to speech, Twilio gathers speech from the caller for a maximum duration of 60 seconds. <Gather> doesn't recognize individually spoken alphanumeric characters like "ABC123".
- If you set input to dtmf speech, Twilio gives precedence to the first input it detects. If Twilio detects speech first, it ignores the finishOnKey attribute.

The following code example shows a <Gather> that specifies speech input from the caller. When Twilio executes this TwiML, the caller hears the <Say> prompt, and Twilio sends the gathered speech to the action URL.

    const VoiceResponse = require('twilio').twiml.VoiceResponse;

    const response = new VoiceResponse();
    const gather = response.gather({
        input: 'speech',
        action: '/completed'
    });
    gather.say('Welcome to Twilio, please tell us why you\'re calling');

    console.log(response.toString());

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- page located at http://example.com/simple_gather.xml -->
    <Response>
      <Gather input="speech" action="/completed">
        <Say>Welcome to Twilio, please tell us why you're calling</Say>
      </Gather>
    </Response>

The language attribute specifies the language Twilio should recognize from your caller.
Necessity | Accepted values | Default value |
---|---|---|
Optional | any value the STT provider supports | en-US |
The method attribute specifies the HTTP verb Twilio should use to request your action URL.
Necessity | Accepted values | Default value |
---|---|---|
Optional | GET , POST | POST |
The numDigits attribute specifies how many DTMF digits you require from callers for this <Gather> instance.
Necessity | Accepted values | Default value |
---|---|---|
Optional | any positive integer | unlimited |
For example, suppose your application asks the caller for their US ZIP Code. You expect the value to contain five digits, so you set numDigits="5". Once the caller enters the final digit of 94117, Twilio submits the data to your action URL.
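To make the interaction between finishOnKey and numDigits easier to reason about, the following sketch models DTMF collection as described in this document. It is a simplified model for illustration only, not Twilio's implementation: the function name collectDigits and the reason labels are hypothetical, and the model treats running out of keypresses as reaching the timeout.

```javascript
// Simplified, hypothetical model of how <Gather> decides when DTMF input is complete.
// `keys` is the sequence of keys the caller presses, for example '1234#'.
function collectDigits(keys, { finishOnKey = '#', numDigits = Infinity } = {}) {
  let digits = '';
  for (const key of keys) {
    // The finish key ends collection and is not included in the result.
    if (finishOnKey !== '' && key === finishOnKey) {
      return { digits, reason: 'finishOnKey' };
    }
    digits += key;
    // Collection ends as soon as the required number of digits arrives.
    if (digits.length >= numDigits) {
      return { digits, reason: 'numDigits' };
    }
  }
  // No more keypresses: the model treats this as reaching the timeout.
  return { digits, reason: 'timeout' };
}

console.log(collectDigits('1234#'));
console.log(collectDigits('94117', { numDigits: 5 }));
console.log(collectDigits('12#4', { finishOnKey: '' }));
```

The three calls correspond to the default finish key, the ZIP Code example, and an empty finishOnKey that captures every key until the timeout.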
The partialResultCallback attribute specifies a URL to which Twilio sends requests as it recognizes speech in real time. These requests contain a parameter labeled UnstableSpeechResult which contains partial transcriptions. These transcriptions may change as the speech recognition progresses.
Necessity | Accepted values | Default value |
---|---|---|
Optional | URL (relative or absolute) | none |
Twilio makes asynchronous webhook requests to your partialResultCallback URL. These requests don't accept any TwiML documents in response. To take more actions based on this partial result, use the REST API to modify the call.
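A handler for this callback only needs to record the partial transcription and acknowledge the request. The following sketch assumes the request also carries Twilio's standard CallSid parameter and that an empty 2xx response is an acceptable acknowledgment. The function name handlePartialResult and the in-memory Map are hypothetical stand-ins for your own route and storage.

```javascript
// Hypothetical in-memory store of the latest partial transcription per call.
const latestPartial = new Map();

// `params` is assumed to be the already-parsed webhook request parameters.
function handlePartialResult(params) {
  if (params.CallSid && typeof params.UnstableSpeechResult === 'string') {
    // Later partial results replace earlier ones because transcriptions can change.
    latestPartial.set(params.CallSid, params.UnstableSpeechResult);
  }
  // The response carries no TwiML; an empty body acknowledges the webhook.
  return { status: 204, body: '' };
}

handlePartialResult({ CallSid: 'CA-example', UnstableSpeechResult: 'I would like' });
handlePartialResult({ CallSid: 'CA-example', UnstableSpeechResult: 'I would like to check my balance' });
console.log(latestPartial.get('CA-example'));
```

Keeping only the most recent value per call reflects that each partial result supersedes the previous one.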
The partialResultCallbackMethod attribute specifies the HTTP verb Twilio should use to request your partialResultCallback URL.
Necessity | Accepted values | Default value |
---|---|---|
Optional | GET , POST | POST |
The profanityFilter attribute specifies whether Twilio should filter profanities out of your speech transcription.
Necessity | Accepted values | Default value |
---|---|---|
Optional | true , false | true |
When set to true, Twilio replaces all but the initial character in each filtered profane word with asterisks like f***.
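If you need to reproduce this masking format in your own logs or tests, the rule is simple to express. The following sketch shows only the output shape described above; the function name maskWord is hypothetical, and the sketch makes no assumption about which words Twilio filters.

```javascript
// Hypothetical helper: keeps the first character and masks the rest with asterisks.
function maskWord(word) {
  if (word.length <= 1) {
    return word;
  }
  return word[0] + '*'.repeat(word.length - 1);
}

console.log(maskWord('darn'));
```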
The speechModel attribute specifies which language model to apply to your <Gather> request.
Necessity | Accepted values | Default value |
---|---|---|
Optional | default , numbers_and_commands , phone_call , experimental_conversations , experimental_utterances , googlev2_long , googlev2_short , googlev2_telephony , googlev2_telephony_short , deepgram_nova-2 | default |
... set speechTimeout to a positive integer value. Don't use auto.
Generic models include the following values:
Model | Supported languages |
---|---|
default | Any |
phone_call | en-US , en-GB , en-AU , fr-FR , fr-CA , ja-JP , ru-RU , es-US , es-ES , pt-BR |
numbers_and_commands | Any |
experimental_conversations | ar-* , da-DK , de-DE , en-AU , en-GB , en-IN , en-US , es-ES , es-US , fi-FI , fr-CA , fr-FR , hi-IN , ja-JP , ko-KR , mk-MK , nl-NL , no-NO , pl-PL , pt-BR , pt-PT , ro-RO , ru-RU , th-TH , tr-TR , uk-UA , vi-VN |
experimental_utterances | ar-* , da-DK , de-DE , en-AU , en-GB , en-IN , en-US , es-ES , es-US , fi-FI , fr-CA , fr-FR , hi-IN , ja-JP , ko-KR , mk-MK , nl-NL , no-NO , pl-PL , pt-BR , pt-PT , ro-RO , ru-RU , th-TH , tr-TR , uk-UA , vi-VN |
Experimental models give access to the latest speech technology and machine learning research, and can provide higher accuracy for speech recognition over other available models. Available models might support features that experimental models don't.
Model | Use Case | Example |
---|---|---|
experimental_utterances | short utterances of a few seconds in length like commands or other single word directed speech | "press 0 or say 'support' to speak with an agent." |
experimental_conversations | spontaneous speech and conversations | "tell us why you're calling today." |
Specific STT models include a variety from Google Speech-to-Text (STT) V2 (googlev2) or Deepgram (deepgram). If the provider has an outage, Twilio doesn't switch providers on your behalf.
The accepted model values depend on the STT provider.
For Google STT V2, this attribute expresses its value in the format of googlev2_{model}.
This attribute accepts the following values for Google STT V2 models:
- googlev2_long
- googlev2_short
- googlev2_telephony
- googlev2_telephony_short

    <Gather input="speech" speechModel="googlev2_telephony">
      <Say>Please tell us why you're calling.</Say>
    </Gather>

If Twilio doesn't support the combination of language and model you need, it falls back to a generic model.
To learn which languages and models Google STT V2 supports, consult Google's documentation on Speech-to-Text V2 supported languages.
To improve the accuracy of your speech-to-text recognition, set this attribute value to the specific language model best suited for your use case.
To find which model works best for your use case, consider exploring all options.
The speechTimeout attribute specifies how long Twilio should wait after a pause in speech before stopping recognition.
Necessity | Accepted values | Default value |
---|---|---|
Optional | positive integer or auto | timeout value |
- If you set speechTimeout to auto, Twilio stops recognizing speech at the first pause in speech.
- Twilio then sends the SpeechResult to your action URL.
- If your <Gather> request includes both timeout and speechTimeout, timeout takes precedence for DTMF input and speechTimeout takes precedence for speech.

The timeout attribute specifies how long Twilio should wait for the caller to provide input on the call. This includes either pressing another digit or saying another word.
Necessity | Accepted values | Default value |
---|---|---|
Optional | positive integer | 5 |
- Before Twilio starts the timeout period, it waits until all nested verbs have executed.
- When the timeout elapses, Twilio sends the SpeechResult to your action URL.

Consider the following TwiML document. Before submitting the caller's data, Twilio waits three seconds for the caller. This pause gives the caller time to either press another key or say another word.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather input="speech dtmf" timeout="3" numDigits="1">
        <Say>Please press 1 or say sales for sales.</Say>
      </Gather>
    </Response>

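The defaults and precedence rules for timeout and speechTimeout can be summarized in a few lines. The following sketch is an illustration of the rules stated in this document (timeout defaults to 5, speechTimeout defaults to the timeout value, and each attribute governs its own input type). The function name resolveTimeouts and its return shape are hypothetical.

```javascript
// Hypothetical helper: reports which value governs each input type.
function resolveTimeouts({ timeout = 5, speechTimeout } = {}) {
  return {
    // timeout takes precedence for DTMF input.
    dtmfSeconds: timeout,
    // speechTimeout takes precedence for speech and falls back to timeout when omitted.
    speechSeconds: speechTimeout === undefined ? timeout : speechTimeout,
  };
}

console.log(resolveTimeouts());
console.log(resolveTimeouts({ timeout: 3 }));
console.log(resolveTimeouts({ timeout: 3, speechTimeout: 'auto' }));
```

For the TwiML document above, which sets only timeout="3", both input types use the three-second value.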
The enhanced attribute specifies that <Gather> should use the premium Google STT V1 model. When transcribing phone conversations, the premium model produces 54% fewer errors compared to the base model.
Necessity | Accepted values | Default value |
---|---|---|
Deprecated | true , false | false |
This attribute has the following limitations:
- It works only with the phone_call model. Twilio ignores the enhanced attribute when set for any other model.
- It supports only the following languages: en-GB, en-US, es-ES, es-US, fr-CH, fr-FR, ja-JP, ru-RU.

The following TwiML document uses the premium phone_call model for <Gather>:

    <Gather input="speech" enhanced="true" speechModel="phone_call" language="en-GB">
      <Say>Please tell us why you're calling.</Say>
    </Gather>

You can nest the following verbs within <Gather>:
When a <Gather> contains nested <Say> or <Play> verbs, the timeout begins either after the audio completes or when the caller presses their first key. If <Gather> contains multiple <Play> verbs, Twilio retrieves the contents of all files before the <Play> begins.
This example shows a <Gather> with a nested <Say>. This TwiML document reads some text to the caller and can accept input from the caller at any time.

    const VoiceResponse = require('twilio').twiml.VoiceResponse;

    const response = new VoiceResponse();
    const gather = response.gather({
        input: 'speech dtmf',
        timeout: 3,
        numDigits: 1
    });
    gather.say('Please press 1 or say sales for sales.');

    console.log(response.toString());

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Gather input="speech dtmf" timeout="3" numDigits="1">
        <Say>Please press 1 or say sales for sales.</Say>
      </Gather>
    </Response>

If you use <Play> verbs, consider hosting your media in AWS S3 in the us-east-1, eu-west-1, or ap-southeast-2 regions depending on the Twilio Region you use. No matter where you host your media files, verify your Cache-Control headers. Twilio uses a caching proxy in its webhook pipeline and caches media files that have cache headers. Serving media out of Twilio's cache can take 10 ms or less. Because Twilio runs a fleet of caching proxies, it may take multiple requests before all of the proxies have a copy of your file in cache.
When a <Gather> reaches its timeout without any caller input, call control falls to the next verb in your original TwiML document.
To send a request to your action URL even if <Gather> times out, include a <Redirect> after the <Gather>.

    const VoiceResponse = require('twilio').twiml.VoiceResponse;

    const response = new VoiceResponse();
    const gather = response.gather({
        action: '/process_gather.php',
        method: 'GET'
    });
    gather.say('Enter something, or not');
    response.redirect({
        method: 'GET'
    }, '/process_gather.php?Digits=TIMEOUT');

    console.log(response.toString());

Output:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- page located at http://example.com/gather_hints.xml -->
    <Response>
      <Gather action="/process_gather.php" method="GET">
        <Say>Enter something, or not</Say>
      </Gather>
      <Redirect method="GET">
        /process_gather.php?Digits=TIMEOUT
      </Redirect>
    </Response>

With this code, Twilio moves to the next verb in the TwiML document (<Redirect>) when <Gather> times out. In this example, Twilio makes a new GET request to /process_gather.php?Digits=TIMEOUT.
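On the receiving side, your handler then needs to tell the TIMEOUT marker apart from real digits. The following sketch shows one way to do that with Node's built-in URL class. The function name readGatherOutcome and its return shape are hypothetical, and the base URL exists only so that relative paths parse.

```javascript
// Hypothetical helper: inspects the query string of a GET request to the action URL.
function readGatherOutcome(requestUrl) {
  // The base URL is a placeholder that lets relative request paths parse.
  const { searchParams } = new URL(requestUrl, 'http://example.com');
  const digits = searchParams.get('Digits');

  // The <Redirect> in the example above uses TIMEOUT as a marker value.
  if (digits === 'TIMEOUT') {
    return { timedOut: true, digits: null };
  }
  return { timedOut: false, digits };
}

console.log(readGatherOutcome('/process_gather.php?Digits=TIMEOUT'));
console.log(readGatherOutcome('/process_gather.php?Digits=12345'));
```

Because a caller can only enter keypad characters, a marker such as TIMEOUT can't collide with genuine DTMF input.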
You might face a few common issues when working with <Gather>:
Problem | Solution |
---|---|
<Gather> doesn't receive caller input from callers who use a VoIP phone. | Some VoIP phones have trouble sending DTMF digits. These phones may use compressed bandwidth-conserving audio protocols. These interfere with the transmission of the digit's signal. Consult your phone's documentation on DTMF problems. |
Twilio doesn't send the Digits parameter to your <Gather> URL. | Check that your application isn't responding to the action URL with an HTTP 3xx redirect. Twilio follows this redirect, but won't resend the Digits parameter. |
If you encounter other issues with <Gather>, reach out to our support team for assistance.