import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

The `<Stream>` instruction makes it possible to send raw audio streams from a running phone call over WebSockets, in near real time, to a specified URL. The audio frames themselves are base64-encoded and embedded in a JSON string, together with other information such as the sequence number and timestamp. The feature can be used with Speech-to-Text systems and similar services.

## Attributes

An example of how to use `<Stream>`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream name="Example Audio Stream" url="wss://your-application.com/audiostream" />
  </Start>
</Response>
```

This cXML will instruct SignalWire to make a copy of the audio frames of the current call and send them in near real time over WebSocket to `wss://your-application.com/audiostream`. `<Start>` will start the audio stream in an asynchronous manner and continue with the next cXML instruction at once. If there is no further instruction, SignalWire will disconnect the call.

```javascript
const { RestClient } = require("@signalwire/compatibility-api");

const response = new RestClient.LaML.VoiceResponse();
const start = response.start();
start.stream({
  name: "Example Audio Stream",
  url: "wss://your-application.com/audiostream",
});

console.log(response.toString());
```

```csharp
using System;
using Twilio.TwiML;
using Twilio.TwiML.Voice;

class Example
{
    static void Main()
    {
        var response = new VoiceResponse();
        var start = new Start();
        start.Stream(name: "Example Audio Stream",
            url: "wss://your-application.com/audiostream");
        response.Append(start);

        Console.WriteLine(response.ToString());
    }
}
```

```python
from twilio.twiml.voice_response import Parameter, VoiceResponse, Start, Stream

response = VoiceResponse()
start = Start()
stream = Stream(url="wss://your-application.com/audiostream")
stream.parameter(name="FirstName", value="Jane")
stream.parameter(name="LastName", value="Doe")
start.append(stream)
response.append(start)

print(response)
```

```ruby
require "signalwire/sdk"

response = Signalwire::Sdk::VoiceResponse.new
response.start do |start|
  start.stream(url: "wss://your-application.com/audiostream") do |stream|
    stream.parameter(name: "FirstName", value: "Jane")
    stream.parameter(name: "LastName", value: "Doe")
  end
end

puts response
```

| Attribute | Description |
| --: | --- |
| `url` | Absolute or relative URL. A WebSocket connection to the URL will be established and audio will start flowing towards the WebSocket server. The only supported protocol is `wss`; for security reasons, `ws` is NOT supported. |
| `name` (optional) | Unique name for the Stream, per Call. It is used to stop a Stream by name. |
| `track` (optional) | One of `inbound_track`, `outbound_track`, or `both_tracks`. Defaults to `inbound_track`. For `both_tracks` there will be both `inbound_track` and `outbound_track` events. |
| `statusCallback` (optional) | Absolute or relative URL. SignalWire will make an HTTP GET or POST request to this URL when a Stream is started, stopped, or encounters an error. |
| `statusCallbackMethod` (optional) | `GET` or `POST`. The type of HTTP request to use when requesting the `statusCallback`. Default is `POST`. |

## StatusCallback Parameters

For a `statusCallback`, SignalWire will send a request with the following parameters:

| Parameter | Description |
| --: | --- |
| `AccountSid` string | The unique ID of the Account this call is associated with. |
| `CallSid` string | A unique identifier for the call. May be used to later retrieve this message from the REST API. |
| `StreamSid` string | The unique identifier for this Stream. |
| `StreamName` string | If defined, the unique name of the Stream. Defaults to the `StreamSid`. |
| `StreamEvent` string | One of `stream-started`, `stream-stopped`, or `stream-error`. |
| `StreamError` string | If an error has occurred, this will contain a detailed error message. |
| `Timestamp` string | The time of the event in ISO 8601 format. |
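As a rough sketch of how these callbacks might be received, the handler below assumes a Node.js Express app and that the parameters arrive as a form-encoded POST body; the route path and port are placeholders, not part of the SignalWire API.

```javascript
const express = require("express");

const app = express();
app.use(express.urlencoded({ extended: false })); // parse form-encoded callback bodies

// Hypothetical route: set statusCallback="https://your-application.com/stream-status"
app.post("/stream-status", (req, res) => {
  const { StreamSid, StreamName, StreamEvent, StreamError } = req.body;

  if (StreamEvent === "stream-error") {
    console.error(`Stream ${StreamName || StreamSid} error: ${StreamError}`);
  } else {
    console.log(`Stream ${StreamSid}: ${StreamEvent}`);
  }

  res.sendStatus(200);
});

app.listen(3000);
```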
## Custom Parameters

To pass parameters towards the `wss` server, it is possible to include additional key-value pairs. This can be done by using the nested `<Parameter>` cXML noun. These parameters will be added to the Start message as JSON.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream url="wss://your-application.com/audiostream">
      <Parameter name="FirstName" value="Jane" />
      <Parameter name="LastName" value="Doe" />
    </Stream>
  </Start>
</Response>
```

## Stopping a Stream

It is possible to stop a Stream at any time by name. For instance, by naming the Stream `mystream` when starting it, you can later use the unique name `mystream` to stop the stream.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream name="mystream" url="wss://your-application.com/audiostream" />
  </Start>
</Response>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Stop>
    <Stream name="mystream" />
  </Stop>
</Response>
```

## Bidirectional Stream

The `<Stream>` instruction also allows you to send audio back into the call. In this case, the stream must be bidirectional. The external service (e.g., an AI agent) will then be able to both hear the call _and_ play audio into it.

To initialize a bidirectional stream, wrap the `<Stream>` instruction in [`<Connect>`](./connect.mdx) instead of `<Start>`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <Stream url="wss://your-application.com/audiostream" />
  </Connect>
</Response>
```

## WebSocket Messages

There are five separate types of events that occur during a Stream's life cycle. These events are represented via WebSocket messages: Connected, Start, Media, DTMF, and Stop. Each message sent is a JSON string. The type of event can be identified by the `event` property of every JSON object.

### Connected Message

The first message sent once a WebSocket connection is established is the Connected event. This message describes the protocol to expect in the following messages.

| Event | Description |
| --: | --- |
| `event` | The string value of `connected`. |
| `protocol` | Defines the protocol for the WebSocket connection's lifetime, e.g. `Call`. |
| `version` | Semantic version of the protocol. |

**Example Connected Message**

```json
{
  "event": "connected",
  "protocol": "Call",
  "version": "0.2.0"
}
```

### Start Message

This message contains important information about the Stream and is sent immediately after the Connected message. It is only sent once, at the start of the Stream.

| Event | Description |
| --: | --- |
| `event` | The string value of `start`. |
| `sequenceNumber` | Number used to keep track of message sending order. The first message starts with `1` and is then incremented. |
| `start` | An object containing Stream metadata. |
| `start.streamSid` | The unique identifier of the Stream. |
| `start.accountSid` | The Account identifier that created the Stream. |
| `start.callSid` | The Call identifier from where the Stream was started. |
| `start.tracks` | An array of values that indicates what media flows to expect in subsequent messages. Values are one of `inbound` or `outbound`, or both. |
| `start.customParameters` | An object that represents the Custom Parameters that were set when defining the Stream. |
| `start.mediaFormat` | An object containing the format of the payload in the Media Messages. |
| `start.mediaFormat.encoding` | The encoding of the data in the upcoming payload. Default is `audio/x-mulaw`. |
| `start.mediaFormat.sampleRate` | The sample rate in Hertz of the upcoming audio data. Default value is `8000`, which is the rate of PCMU. |
| `start.mediaFormat.channels` | The number of channels in the input audio data. Default value is `1`. For `both_tracks` it will be `2`. |

**Example Start Message**

```json
{
  "event": "start",
  "sequenceNumber": 2,
  "start": {
    "tracks": ["inbound", "outbound"],
    "customParameters": {
      "FirstName": "Jane",
      "LastName": "Doe",
      "RemoteParty": "Bob"
    },
    "mediaFormat": {
      "encoding": "audio/x-mulaw",
      "sampleRate": 8000,
      "channels": 1
    }
  }
}
```
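As a minimal sketch of the receiving side, the WebSocket server below parses each message and dispatches on the `event` property. It assumes Node.js with the third-party `ws` package (not part of any SignalWire SDK); the port and logging are placeholders.

```javascript
const { WebSocketServer } = require("ws");

// Hypothetical endpoint; point the Stream url at wss://your-host/audiostream
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (ws) => {
  ws.on("message", (data) => {
    const msg = JSON.parse(data); // every message is a JSON string

    switch (msg.event) {
      case "connected":
        console.log(`Protocol ${msg.protocol}, version ${msg.version}`);
        break;
      case "start":
        console.log(`Stream ${msg.start.streamSid} started`,
          msg.start.customParameters, msg.start.mediaFormat);
        break;
      case "media":
        // base64-encoded audio; see the Media Message section below
        break;
      case "dtmf":
        console.log(`DTMF digit ${msg.dtmf.digit}`);
        break;
      case "stop":
        console.log("Stream stopped");
        break;
    }
  });
});
```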
### Media Message

This message type encapsulates the raw audio data.

| Event | Description |
| --: | --- |
| `event` | The string value of `media`. |
| `sequenceNumber` | Number used to keep track of message sending order. The first message starts with `1` and is then incremented for each message. |
| `media` | An object containing media metadata and payload. |
| `media.track` | One of the strings `inbound` or `outbound`. |
| `media.chunk` | The chunk number for the message. The first message will begin with `1` and increment with each subsequent message. |
| `media.timestamp` | Presentation timestamp in milliseconds from the start of the stream. |
| `media.payload` | Raw audio, encoded in base64. |

**Example Media Messages**

Outbound:

```json
{
  "event": "media",
  "sequenceNumber": 3,
  "media": {
    "track": "outbound",
    "chunk": 1,
    "payload": "iY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//jw=="
  }
}
```

Inbound:

```json
{
  "event": "media",
  "sequenceNumber": 4,
  "media": {
    "track": "inbound",
    "chunk": 1,
    "timestamp": 5,
    "payload": "/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JD/+PiY//DwkP/4+Jj/8PCQ//j4mP/w8JDw=="
  }
}
```

### Stop Message

A Stop message will be sent when the Stream is either stopped or the Call has ended.

**Example Stop Message**

```json
{
  "event": "stop",
  "sequenceNumber": 5
}
```

| Event | Description |
| --: | --- |
| `event` | The string value of `stop`. |
| `sequenceNumber` | Number used to keep track of message sending order. The first message starts with `1` and is then incremented for each message. |

### DTMF Message

A DTMF message will be sent when the Stream receives a DTMF tone.

| Event | Description |
| --: | --- |
| `event` | The string value of `dtmf`. |
| `sequence_number` | Number, as a string, used to keep track of message sending order. The first message starts with `1` and is then incremented for each message. |
| `streamSid` | The unique identifier of the Stream, as a string. |
| `dtmf` | An object containing the details of the detected DTMF. |
| `dtmf.duration` | The duration of the DTMF in milliseconds. |
| `dtmf.digit` | The digit, as a string, that corresponds to the DTMF. |

**Example DTMF Message**

```json
{
  "event": "dtmf",
  "sequence_number": "1",
  "dtmf": {
    "duration": 2700,
    "digit": "8"
  }
}
```

### Clear Message

Send the Clear event message if you would like to interrupt audio that has been sent in previous Media event messages. This will empty all buffered audio.

| Event | Description |
| --: | --- |
| `event` | The string value of `clear`. |
| `streamSid` | The unique identifier of the Stream, as a string. |

**Example Clear Message**

```json
{
  "event": "clear"
}
```

## Notes on Usage

- The `url` does not support query string parameters. To pass custom key-value pairs to the WebSocket, make use of Custom Parameters instead.
- There is a one-to-one mapping of a Stream to a WebSocket connection, so there will be at most one call being streamed over a single WebSocket connection. Information will be provided so that you can handle multiple inbound connections and manage the association between the unique Stream identifier (`StreamSid`) and the connection.
- On any given call there are `inbound` and `outbound` tracks: `inbound` represents the audio SignalWire receives from the call, `outbound` represents the audio generated by SignalWire for the call.
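For reference, here is one possible way to turn a Media payload into linear PCM for a Speech-to-Text engine. This is only a sketch under the default `mediaFormat` (`audio/x-mulaw` at 8000 Hz, mono); the helper names are ours, and the expansion shown is the standard G.711 μ-law algorithm, not something specific to SignalWire.

```javascript
// Sketch: decode one Media message payload (base64 mu-law) into 16-bit PCM.
// Assumes the default mediaFormat: audio/x-mulaw, 8000 Hz, 1 channel.
function mulawByteToPcm16(mu) {
  mu = ~mu & 0xff;                       // G.711 stores bytes inverted
  const sign = mu & 0x80;
  const exponent = (mu >> 4) & 0x07;
  const mantissa = mu & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

function decodeMediaPayload(base64Payload) {
  const mulaw = Buffer.from(base64Payload, "base64"); // raw mu-law bytes
  const pcm = new Int16Array(mulaw.length);           // one sample per byte
  for (let i = 0; i < mulaw.length; i++) {
    pcm[i] = mulawByteToPcm16(mulaw[i]);
  }
  return pcm; // 8000 samples per second of mono audio
}

// Usage inside a "media" event handler:
// const samples = decodeMediaPayload(msg.media.payload);
```

Many Speech-to-Text services accept μ-law input directly, in which case the base64 decode alone is enough and the expansion step can be skipped.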