Alexa.RTCSessionController Interface
The Alexa.RTCSessionController interface describes the messages used by Alexa to interact with endpoints capable of real-time communication (RTC). The RTCSessionController interface supports 1-way (half duplex) or 2-way (full duplex) communication over audio and video. When your skill implements the RTCSessionController interface, Alexa customers can communicate with a visitor at their front door through their camera and intercom. For more information, see Announcing 2-Way Communication APIs.
The RTCSessionController interface is supported only on Echo Show and Echo Spot. To communicate with a front door camera on an Alexa-enabled tablet or Fire TV, use the Alexa.CameraStreamController interface.
- Overview
- Discovery
- Directives
- Properties and Events
- Session Description Protocol Offer/Answer Format
- Error Handling
- Related Topics
- Related Interfaces
Overview
Supported Communication Types
- 1-way (half duplex) communication allows customers to communicate in two directions, but not simultaneously. For example:
- A walkie-talkie
- A push-to-talk door intercom
- 2-way (full duplex) communication allows customers to communicate in two directions simultaneously. For example:
- A telephone
- A telephone door intercom
Utterances
Customers can start communicating with a person next to a real-time communication device by talking to their Alexa-enabled device (for example, an Echo Show or Echo Spot) or by using the microphone icon when they are in live streaming mode.
Customers can start conversations by saying one of the following:
User: Alexa, answer the front door
User: Alexa, get the call going with the front door
User: Alexa, please call front door
User: Alexa, respond to the front door
User: Alexa, speak to the front door
User: Alexa, talk to my front door camera
User: Alexa, talk to the front door
User: Alexa, talk to the person at the main door
Customers can end conversations by saying one of the following:
User: Alexa, go home
User: Alexa, stop
Prerequisites and SLA Requirements
To use the RTCSessionController API, you need the following:
- A minimum timeout of one minute is required.
- For any offer sent to your skill, you must generate an answer within six seconds.
- Your device or platform must be WebRTC compliant, or support the suite of protocols used by WebRTC and all supported resiliency mechanisms used in WebRTC. Specifically:
- For resource considerations, you must support bundling and rtcp-mux. You use a bundle to send audio and video over the same connection to reduce the number of open sockets.
- To support full-duplex communication, your device must employ effective algorithms for acoustic echo cancellation (AEC) and noise suppression.
- To support half-duplex communication, you can use the Push to Talk feature through the typical live view scenario. Declare isFullDuplexAudioSupported as false in the discovery response.
- To support video, you must use one of the following video codecs:
- H264 (up to profile high, level 4.1)
- To support audio, you must use one of the following audio codecs:
- Opus (preferred codec)
- PCMU/G.711
- AAC-LC, HE-AAC
- For Interactive Connectivity Establishment (ICE) candidates, you can use either UDP or TCP, but you must use IPv4.
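Some of the requirements above can be checked mechanically against an SDP string before you send it. The following is a minimal sketch in Python, not an official validator: the function name `check_sdp_prerequisites` is hypothetical, and the rtcp-mux check is a rough count that assumes one `a=rtcp-mux` line per `m=` section, as in the offer/answer example in this document.

```python
import ipaddress


def check_sdp_prerequisites(sdp: str) -> list[str]:
    """Return a list of problems found; an empty list means these checks passed."""
    lines = [line.strip() for line in sdp.splitlines() if line.strip()]
    problems = []

    # Bundling: look for an a=group:BUNDLE line.
    if not any(line.startswith("a=group:BUNDLE") for line in lines):
        problems.append("missing a=group:BUNDLE")

    # rtcp-mux: rough check that there is one a=rtcp-mux per media section.
    media_count = sum(1 for line in lines if line.startswith("m="))
    if lines.count("a=rtcp-mux") < media_count:
        problems.append("a=rtcp-mux missing from at least one media section")

    # ICE candidates: the connection address is the fifth field.
    for line in lines:
        if not line.startswith("a=candidate:"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue
        try:
            if ipaddress.ip_address(fields[4]).version != 4:
                problems.append(f"non-IPv4 candidate: {fields[4]}")
        except ValueError:
            # Not a literal IP address (for example a masked placeholder); skipped.
            pass
    return problems
```

A check like this only catches structural omissions. It says nothing about codec quality, echo cancellation, or whether the answer is produced within the six-second limit.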
Signaling Diagram
The RTCSessionController communication is shown in the following signaling diagram.

Discovery
When you respond to a discovery request for a skill that controls a real-time communications device, you describe endpoints that support the Alexa.RTCSessionController interface. Use the standard discovery mechanism described in Alexa.Discovery, as shown in the following example:
Discover.Response example containing RTCSessionController
{
"event": {
"header": {
"namespace":"Alexa.Discovery",
"name":"Discover.Response",
"payloadVersion":"3",
"messageId":"ff746d98-ab02-4c9e-9d0d-b44711658414"
},
"payload":{
"endpoints":[
{
"manufacturerName": "Sample Manufacturer",
"modelName": "Sample Model",
"friendlyName": "My front door camera",
"description": "A smart front door camera",
"displayCategories": [ "CAMERA" ],
"cookie": {
"key1": "Arbitrary key/value pairs for skill to reference this endpoint",
"key2": "There can be multiple entries",
"key3": "Use only for reference",
"key4": "Do not use to maintain endpoint state"
},
"capabilities":
[
{
"type": "AlexaInterface",
"interface": "Alexa.RTCSessionController",
"version": "3",
"configuration": {
"isFullDuplexAudioSupported": true
}
}
]
}
]
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| manufacturerName | The name of the manufacturer of the device. | string | Yes |
| modelName | The model name of the device. | string | No, but strongly recommended |
| friendlyName | A friendly name for the device. | string | Yes |
| description | A description of the device. | string | Yes |
| displayCategories | The categories for the skill. Use CAMERA or DOORBELL. | array of strings | Yes |
| isFullDuplexAudioSupported | True if the device supports 2-way (full duplex) communication. False if the device supports 1-way (half duplex) communication. The default is false. | boolean | No |
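As an illustration of the fields above, the following Python sketch assembles an endpoint entry with the RTCSessionController capability. The helper names `rtc_capability` and `build_endpoint` are hypothetical, and the sketch covers only the fields listed in the table; a real discovery response needs whatever else Alexa.Discovery requires.

```python
def rtc_capability(full_duplex: bool = False) -> dict:
    """Capability entry for Alexa.RTCSessionController.

    full_duplex defaults to False, matching the documented default
    for isFullDuplexAudioSupported.
    """
    return {
        "type": "AlexaInterface",
        "interface": "Alexa.RTCSessionController",
        "version": "3",
        "configuration": {"isFullDuplexAudioSupported": full_duplex},
    }


def build_endpoint(manufacturer: str, friendly_name: str, description: str,
                   full_duplex: bool = False, category: str = "CAMERA") -> dict:
    """Endpoint entry using only the fields from the payload table."""
    if category not in ("CAMERA", "DOORBELL"):
        raise ValueError("displayCategories should be CAMERA or DOORBELL")
    return {
        "manufacturerName": manufacturer,
        "friendlyName": friendly_name,
        "description": description,
        "displayCategories": [category],
        "capabilities": [rtc_capability(full_duplex)],
    }
```

A push-to-talk intercom would call `build_endpoint(...)` with the default `full_duplex=False`, while a device with echo cancellation and noise suppression would pass `full_duplex=True`.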
Directives
InitiateSessionWithOffer Directive
Initiate a real-time communication session with a front door device.
User: Alexa, talk to my front door camera
InitiateSessionWithOffer directive example
{
"directive": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "InitiateSessionWithOffer",
"messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "access-token-from-skill"
},
"endpointId": "appliance-001",
"cookie": {
"keys": "key/value pairs received during discovery"
}
},
"payload": {
"sessionId" : "the session identifier",
"offer": {
"format" : "SDP",
"value" : "<SDP offer value>"
}
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to connect. | A Version 4 UUID | Yes |
| offer | An SDP offer. | string | Yes |
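A skill handler typically starts by pulling the session identifier and the SDP offer out of the directive. The following Python sketch shows one way to do that, assuming the request has the shape of the example above; the function name `parse_initiate_session` is hypothetical.

```python
import uuid


def parse_initiate_session(request: dict) -> tuple[str, str]:
    """Return (session_id, sdp_offer) from an InitiateSessionWithOffer directive."""
    directive = request["directive"]
    header = directive["header"]
    if (header["namespace"] != "Alexa.RTCSessionController"
            or header["name"] != "InitiateSessionWithOffer"):
        raise ValueError(f"unexpected directive: {header['namespace']}.{header['name']}")

    payload = directive["payload"]
    session_id = payload["sessionId"]
    # Raises ValueError if the identifier is not a well-formed UUID.
    uuid.UUID(session_id)

    offer = payload["offer"]
    if offer.get("format") != "SDP":
        raise ValueError(f"unsupported offer format: {offer.get('format')}")
    return session_id, offer["value"]
```

The returned offer string is what you hand to your device or WebRTC stack to produce the answer.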
SessionConnected Directive
The directive to connect an RTC session. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.
SessionConnected directive example
{
"directive": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "SessionConnected",
"messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "access-token-from-skill"
},
"endpointId": "appliance-001",
"cookie": {
"keys": "key/value pairs received during discovery"
}
},
"payload": {
"sessionId" : "session identifier"
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to connect. | A Version 4 UUID | Yes |
SessionDisconnected Directive
The directive to disconnect an RTC session. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.
SessionDisconnected directive example
{
"directive": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "SessionDisconnected",
"messageId": "d1ba3aa7-bff7-4406-9425-f25f04ec8d68",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"scope": {
"type": "BearerToken",
"token": "access-token-from-skill"
},
"endpointId": "appliance-001",
"cookie": {
"keys": "key/value pairs received during discovery"
}
},
"payload": {
"sessionId" : "session identifier"
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that wants to disconnect. | A Version 4 UUID | Yes |
Properties and Events
Properties
There are no reportable properties currently defined for this interface.
AnswerGeneratedForSession Event
If the InitiateSessionWithOffer directive was successfully handled, you should respond with an AnswerGeneratedForSession event. The payload for this message contains an SDP answer.
AnswerGeneratedForSession event example
{
"event": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "AnswerGeneratedForSession",
"messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"endpointId" : "appliance-001"
},
"payload": {
"answer": {
"format" : "SDP",
"value" : "<SDP answer value>"
}
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| answer | An SDP answer. | string | Yes |
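The response can be assembled from the incoming directive plus the SDP answer produced by your device. The following Python sketch mirrors the example event above; the function name `build_answer_event` is hypothetical, and generating the SDP answer itself is assumed to happen elsewhere (for example, in your WebRTC stack).

```python
import uuid


def build_answer_event(request: dict, sdp_answer: str) -> dict:
    """Build an AnswerGeneratedForSession event for an InitiateSessionWithOffer directive."""
    directive = request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                "name": "AnswerGeneratedForSession",
                # Each response gets its own message identifier.
                "messageId": str(uuid.uuid4()),
                # Echo the token so Alexa can match the event to the directive.
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            "payload": {"answer": {"format": "SDP", "value": sdp_answer}},
        }
    }
```

Because the answer must be generated within six seconds of the offer, it helps to keep this step free of slow work and do any heavy setup before the directive arrives.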
SessionConnected Event
If the SessionConnected directive was successfully handled, you should respond with a SessionConnected event. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.
SessionConnected event example
{
"event": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "SessionConnected",
"messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"endpointId" : "appliance-001"
},
"payload": {
"sessionId" : "session identifier"
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that was connected. | A Version 4 UUID | Yes |
SessionDisconnected Event
If the SessionDisconnected directive was successfully handled, you should respond with a SessionDisconnected event. The payload for this message contains the identifier for the RTC session, received from the original InitiateSessionWithOffer directive.
SessionDisconnected event example
{
"event": {
"header": {
"namespace": "Alexa.RTCSessionController",
"name": "SessionDisconnected",
"messageId": "30d2cd1a-ce4f-4542-aa5e-04bd0a6492d5",
"correlationToken": "dFMb0z+PgpgdDmluhJ1LddFvSqZ/jCc8ptlAKulUj90jSqg==",
"payloadVersion": "3"
},
"endpoint": {
"endpointId" : "appliance-001"
},
"payload": {
"sessionId" : "session identifier"
}
}
}
Payload details
| Field | Description | Type | Required |
|---|---|---|---|
| sessionId | The identifier of the session that was disconnected. | A Version 4 UUID | Yes |
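The SessionConnected and SessionDisconnected events share a name and a sessionId with the directives they answer, so one helper can cover both. The following Python sketch assumes the message shapes from the examples in this document; the function name `build_session_event` is hypothetical.

```python
import uuid

SESSION_DIRECTIVES = ("SessionConnected", "SessionDisconnected")


def build_session_event(request: dict) -> dict:
    """Build the event that acknowledges a SessionConnected or SessionDisconnected directive."""
    directive = request["directive"]
    name = directive["header"]["name"]
    if name not in SESSION_DIRECTIVES:
        raise ValueError(f"not a session directive: {name}")
    return {
        "event": {
            "header": {
                "namespace": "Alexa.RTCSessionController",
                # The event carries the same name as the directive it answers.
                "name": name,
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            # Echo the session identifier from the directive payload.
            "payload": {"sessionId": directive["payload"]["sessionId"]},
        }
    }
```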
Session Description Protocol Offer/Answer Format
The RTCSessionController interface uses the Session Description Protocol (SDP). For more information, see Session Description Protocol (SDP).
Offer/answer exchange example
v=0
o=- 3747690900 3747690900 IN IP4 0.0.0.0
s=a 2 z
c=IN IP4 0.0.0.0
t=0 0
a=group:BUNDLE audio0 video0
m=audio 1 RTP/SAVPF 96 0
a=candidate:1 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:2 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:3 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:4 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:96 opus/48000/2
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:audio0
a=ssrc:118039096 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
m=video 1 RTP/SAVPF 99
a=candidate:4 1 UDP 2013266430 xxx.xxx.xxx.xxx 8620 typ host
a=candidate:5 1 TCP 1015022079 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:4 2 UDP 2013266429 xxx.xxx.xxx.xxx 50066 typ host
a=candidate:6 1 TCP 1010827775 xxx.xxx.xxx.xxx 45351 typ host tcptype passive
a=candidate:5 2 TCP 1015022078 xxx.xxx.xxx.xxx 9 typ host tcptype active
a=candidate:6 2 TCP 1010827774 xxx.xxx.xxx.xxx 65157 typ host tcptype passive
b=AS:500
a=setup:actpass
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=rtpmap:99 H264/90000
a=rtcp:9 IN IP4 0.0.0.0
a=rtcp-mux
a=sendrecv
a=mid:video0
a=rtcp-fb:99 nack
a=rtcp-fb:99 nack pli
a=rtcp-fb:99 ccm fir
a=ssrc:3643559644 cname:user2571875795@host-433aaf59
a=ice-ufrag:AGVf
a=ice-pwd:h3JAYGhIaQ/Nvyaz9dLoz9
a=fingerprint:sha-256 34:D4:54:17:0C:95:2A:79:FF:72:10:21:E9:6E:F3:77:86:2F:8D:6C:33:45:BA:14:1D:43:01:D7:CD:0A:1A:84
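To see how the example above is organized, it can help to summarize each media section. The following Python sketch walks an SDP string line by line and records the media kind, the `a=mid` value, the codec names from `a=rtpmap`, and whether `a=rtcp-mux` is present. It is a simplified reader for illustration, not a complete SDP parser, and the function name `summarize_media` is hypothetical.

```python
def summarize_media(sdp: str) -> list[dict]:
    """Return one summary dict per m= section, in order of appearance."""
    sections = []
    current = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            current = {"kind": line[2:].split()[0], "mid": None,
                       "codecs": [], "rtcp_mux": False}
            sections.append(current)
        elif current is None:
            # Session-level lines before the first m= section are ignored here.
            continue
        elif line.startswith("a=mid:"):
            current["mid"] = line[len("a=mid:"):]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
            parts = line.split(" ", 1)
            if len(parts) == 2:
                current["codecs"].append(parts[1].split("/")[0])
        elif line == "a=rtcp-mux":
            current["rtcp_mux"] = True
    return sections
```

Applied to the example, this yields an audio section with mid `audio0` and codec `opus`, and a video section with mid `video0` and codec `H264`, both with rtcp-mux enabled. Those are the same two mids named in the `a=group:BUNDLE` line.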
Error Handling
If you cannot complete the customer request, reply with an error. For more details, see Alexa.ErrorResponse.
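As a rough illustration, an error reply can be built from the failed directive in the same way as the other events. The message shape and the error type used below (`ENDPOINT_UNREACHABLE`) are assumptions taken from the general Alexa.ErrorResponse interface rather than from this page, so check that reference for the authoritative fields and type values. The function name `build_error_response` is hypothetical.

```python
import uuid


def build_error_response(request: dict, error_type: str, message: str) -> dict:
    """Build an error event for a directive that could not be completed.

    Assumes the Alexa.ErrorResponse shape: namespace "Alexa", name
    "ErrorResponse", and a payload with "type" and "message".
    """
    directive = request["directive"]
    return {
        "event": {
            "header": {
                "namespace": "Alexa",
                "name": "ErrorResponse",
                "messageId": str(uuid.uuid4()),
                "correlationToken": directive["header"]["correlationToken"],
                "payloadVersion": "3",
            },
            "endpoint": {"endpointId": directive["endpoint"]["endpointId"]},
            "payload": {"type": error_type, "message": message},
        }
    }
```

For example, if the camera is offline when an InitiateSessionWithOffer directive arrives, the skill might return `build_error_response(request, "ENDPOINT_UNREACHABLE", "The camera did not respond.")` instead of an AnswerGeneratedForSession event.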
Related Topics
Related Interfaces
| Interface | Description |
|---|---|
| Alexa.CameraStreamController | Describes the messages used to retrieve camera streams from camera endpoints. |
| Alexa.DoorbellEventSource | An endpoint that is capable of raising doorbell events. |
| Alexa.MotionSensor | Describes an endpoint that senses physical movement in an area. |