SDP: Session Description Protocol in WebRTC
The first time I encountered a raw SDP message, I felt like I was looking at an alien language. It was a wall of cryptic text filled with seemingly random letters, numbers, and abbreviations:
v=0
o=- 7614219274584779017 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video
m=audio 49170 UDP/TLS/RTP/SAVPF 111
c=IN IP4 192.0.2.1
a=rtpmap:111 opus/48000/2
m=video 56607 UDP/TLS/RTP/SAVPF 96 97
c=IN IP4 192.0.2.1
a=rtpmap:96 VP8/90000
a=rtpmap:97 H264/90000
Despite its intimidating appearance, this cryptic text is the Session Description Protocol (SDP)—one of the most crucial elements in establishing WebRTC connections. It's the language that allows two peers to negotiate their capabilities and agree on how they'll communicate.
In my years implementing WebRTC solutions, I've found that understanding SDP is like having a secret decoder ring. It allows you to diagnose connection issues, optimize media quality, and implement advanced features that would otherwise be impossible. Yet many developers treat SDP as a black box, copying and pasting code without truly understanding what's happening under the hood.
In this article, we'll demystify SDP in WebRTC. We'll explore what it is, how it works, and how you can leverage it to build more robust real-time applications.
What is SDP and Why Does WebRTC Use It?
The Session Description Protocol (SDP) is not actually a protocol in the traditional sense—it's a format for describing multimedia communication sessions. Think of it as a detailed menu of what a peer can offer and what it's willing to accept in a communication session.
SDP predates WebRTC by many years. It was standardized by the IETF in the late 1990s for describing and announcing multimedia sessions, and it became the standard session format used with the Session Initiation Protocol (SIP) in the telecommunications world. When the WebRTC standard was being developed, rather than creating an entirely new format, the designers chose to adopt SDP, a decision that brought both benefits and challenges.
The Challenge of Media Negotiation
To understand why SDP is necessary, consider the vast diversity of devices and networks in the WebRTC ecosystem:
- Devices with different camera resolutions and capabilities
- Various supported audio and video codecs
- Different network bandwidth constraints
- Diverse security requirements
- Optional features like data channels
For two peers to communicate effectively, they need to find common ground across all these variables. This negotiation process is what SDP facilitates.
I once worked on a WebRTC application connecting desktop and mobile devices. The desktop clients supported high-resolution H.264 video, while some older mobile devices only supported VP8 at lower resolutions. Without a negotiation mechanism like SDP, these devices would have no way to establish a compatible connection.
The Offer/Answer Model: How SDP Negotiation Works
WebRTC uses an offer/answer model for SDP negotiation:
- The initiating peer creates an "offer" SDP describing its capabilities and preferences
- This offer is sent to the remote peer through the signaling channel
- The remote peer creates an "answer" SDP describing its capabilities and selecting compatible options
- The answer is sent back to the initiator
- Both peers now have a mutual understanding of how communication will proceed
This process happens as part of the broader connection establishment:
// Initiator side
async function startCall() {
// Create an offer
const offer = await peerConnection.createOffer();
// Set it as the local description
await peerConnection.setLocalDescription(offer);
// Send the offer to the remote peer via signaling
signalingChannel.send({
type: 'offer',
sdp: peerConnection.localDescription.sdp // send the SDP string, not the whole description object
});
}
// Receiver side
async function handleOffer(offer) {
// Set the received offer as the remote description
await peerConnection.setRemoteDescription(offer);
// Create an answer
const answer = await peerConnection.createAnswer();
// Set it as the local description
await peerConnection.setLocalDescription(answer);
// Send the answer back via signaling
signalingChannel.send({
type: 'answer',
sdp: peerConnection.localDescription.sdp // send the SDP string, not the whole description object
});
}
While this code looks straightforward, the magic happens inside the SDP messages being exchanged.
Anatomy of an SDP Message
Let's dissect an SDP message to understand its structure. SDP uses a text-based format with a series of lines, each beginning with a single character (the type) followed by an equals sign and a value.
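Because every line follows the same type=value shape, a first-pass parser takes only a few lines. The following is a minimal sketch; parseSdpLines is a hypothetical helper name, not part of the WebRTC API, and it does not validate line order or content:

```javascript
// Hypothetical helper: split an SDP string into { type, value } records.
function parseSdpLines(sdp) {
  return sdp
    .split(/\r?\n/)                         // SDP lines end with CRLF; tolerate bare LF
    .filter(line => /^[a-z]=/.test(line))   // keep only "<type>=<value>" lines
    .map(line => ({ type: line[0], value: line.slice(2) }));
}

// Example: count the attribute ("a=") lines in a description
const exampleLines = parseSdpLines('v=0\r\ns=-\r\nt=0 0\r\na=group:BUNDLE audio video\r\n');
console.log(exampleLines.filter(l => l.type === 'a').length);
```

A flat list like this is enough for quick inspection; anything that depends on which media section a line belongs to needs the section-aware handling discussed later.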
Session-Level Information
The first part of an SDP contains session-level information that applies to all media streams:
v=0 // Version (always 0)
o=- 7614219274584779017 2 IN IP4 127.0.0.1 // Origin (username, session ID, version, network type, address type, address)
s=- // Session name (usually a hyphen in WebRTC)
t=0 0 // Timing (start and stop times, 0 0 means unbounded)
a=group:BUNDLE audio video // Group media streams (BUNDLE optimization)
Media-Level Information
Following the session information are one or more media sections, each starting with an "m=" line:
m=audio 49170 UDP/TLS/RTP/SAVPF 111 // Media type, port, protocol, format (codec IDs)
c=IN IP4 192.0.2.1 // Connection data (network type, address type, connection address)
a=rtpmap:111 opus/48000/2 // RTP mapping (codec ID, codec name, clock rate, channels)
a=fmtp:111 minptime=10;useinbandfec=1 // Format parameters for the codec
a=sendrecv // Media direction (send and receive)
m=video 56607 UDP/TLS/RTP/SAVPF 96 97 // Another media section, this time for video
c=IN IP4 192.0.2.1
a=rtpmap:96 VP8/90000 // VP8 video codec
a=rtpmap:97 H264/90000 // H.264 video codec
a=fmtp:97 profile-level-id=42e01f // H.264 specific parameters
a=sendrecv
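To make the structure concrete, the media sections above can be pulled apart programmatically. This sketch assumes a hypothetical helper, parseMediaSections, and handles only the m= and a=rtpmap lines shown in this article. It expects clean SDP without the explanatory // annotations used in the listings:

```javascript
// Hypothetical helper: summarize each media section of an SDP string.
function parseMediaSections(sdp) {
  // Split before every line that starts with "m="; drop the session-level part.
  const sections = sdp.split(/^(?=m=)/m).filter(s => s.startsWith('m='));
  return sections.map(section => {
    const lines = section.split(/\r?\n/).filter(Boolean);
    // m=<kind> <port> <protocol> <format> [<format> ...]
    const [kind, port, protocol, ...formats] = lines[0].slice(2).split(' ');
    const codecs = {};
    for (const line of lines) {
      // a=rtpmap:<payload type> <name>/<clock rate>[/<channels>]
      const m = line.match(/^a=rtpmap:(\d+) ([^/]+)\/(\d+)(?:\/(\d+))?/);
      if (m) {
        codecs[m[1]] = {
          name: m[2],
          clockRate: Number(m[3]),
          channels: m[4] ? Number(m[4]) : undefined
        };
      }
    }
    return { kind, port: Number(port), protocol, formats, codecs };
  });
}
```

Keying the codecs by payload type mirrors how SDP itself works: the m= line lists numbers, and the a=rtpmap lines give those numbers meaning.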
ICE Candidates
SDP also includes ICE candidates, though with trickle ICE these are often sent separately:
a=candidate:1 1 UDP 2113937151 192.168.1.2 49170 typ host // ICE candidate (foundation, component, transport, priority, IP, port, type)
a=candidate:2 1 UDP 1845501695 203.0.113.5 56607 typ srflx raddr 192.168.1.2 rport 49170 // Server reflexive candidate
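The candidate fields listed above map directly onto a small parser. This is a sketch only: parseCandidate is a hypothetical helper, it expects a clean line without the trailing // annotation, and in a browser the RTCIceCandidate object already exposes most of these fields:

```javascript
// Hypothetical helper: parse one "a=candidate:" line into its fields.
function parseCandidate(line) {
  const parts = line.replace(/^a=/, '').replace(/^candidate:/, '').trim().split(/\s+/);
  const [foundation, component, transport, priority, address, port, typKeyword, type, ...rest] = parts;
  if (typKeyword !== 'typ') return null; // not a candidate line this sketch understands
  const candidate = {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type
  };
  // Remaining tokens come in key/value pairs (raddr, rport, extension attributes).
  for (let i = 0; i + 1 < rest.length; i += 2) {
    candidate[rest[i]] = rest[i + 1];
  }
  return candidate;
}
```

For the server reflexive candidate above, the raddr and rport pairs end up as extra properties on the returned object, which makes it easy to see which host candidate it was derived from.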
Security Parameters
WebRTC mandates encryption, so SDP includes security parameters:
a=fingerprint:sha-256 19:E2:1C:3B:4B:9F:81:E6:B8:5C:F4:A5:A8:D8:73:04:BB:05:2F:70:9F:04:A9:0E:05:E9:26:33:E8:70:88:A2 // DTLS fingerprint
a=setup:actpass // DTLS setup role (active, passive, or actpass)
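When debugging DTLS failures it helps to pull these two attributes out of both descriptions and compare them. A minimal sketch, assuming a hypothetical helper named getDtlsParameters that reads only the first fingerprint and setup line it finds:

```javascript
// Hypothetical helper: extract the first DTLS fingerprint and setup role from an SDP string.
function getDtlsParameters(sdp) {
  const fingerprint = sdp.match(/^a=fingerprint:(\S+) ([0-9A-Fa-f:]+)/m);
  const setup = sdp.match(/^a=setup:(active|passive|actpass|holdconn)/m);
  return {
    hashAlgorithm: fingerprint ? fingerprint[1] : null,
    fingerprint: fingerprint ? fingerprint[2].toUpperCase() : null,
    setup: setup ? setup[1] : null
  };
}
```

Real descriptions can carry these attributes at session level or per media section, so a fuller tool would report them per section rather than stopping at the first match.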
This is just a simplified overview—real SDP messages contain many more attributes and details.
SDP in Action: The Negotiation Dance
To truly understand SDP, let's follow the negotiation process in a typical WebRTC scenario.
Step 1: Creating the Offer
When peerConnection.createOffer() is called, the WebRTC implementation examines the local device's capabilities and generates an SDP offer. This includes:
- Available audio and video codecs
- Supported media formats and their parameters
- ICE candidates (unless using trickle ICE)
- Security parameters
- Data channel information (if used)
The resulting offer might look something like this (simplified):
v=0
o=- 7614219274584779017 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video data
m=audio 49170 UDP/TLS/RTP/SAVPF 111 103 104
c=IN IP4 0.0.0.0
a=rtpmap:111 opus/48000/2
a=rtpmap:103 ISAC/16000
a=rtpmap:104 ISAC/32000
a=sendrecv
m=video 56607 UDP/TLS/RTP/SAVPF 96 97 98
c=IN IP4 0.0.0.0
a=rtpmap:96 VP8/90000
a=rtpmap:97 H264/90000
a=rtpmap:98 VP9/90000
a=sendrecv
m=application 56608 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=sctp-port:5000
This offer says: "I can send and receive audio using Opus, ISAC/16kHz, or ISAC/32kHz codecs. I can send and receive video using VP8, H.264, or VP9. I also support data channels."
Step 2: Setting the Local Description
When peerConnection.setLocalDescription(offer) is called, the WebRTC implementation commits to the capabilities described in the offer. This is an important step—it's like saying, "I promise I can support these options."
Step 3: Transmitting the Offer
The offer is then sent to the remote peer through the signaling channel. This is outside the WebRTC standard, as we discussed in our article on signaling.
Step 4: Processing the Offer
When the remote peer receives the offer, it calls peerConnection.setRemoteDescription(offer). This tells the WebRTC implementation what the other side is capable of.
Step 5: Creating the Answer
The remote peer then calls peerConnection.createAnswer(), which generates an SDP answer based on:
- The capabilities in the received offer
- The local device's capabilities
- The intersection of what both can support
The answer might look like this:
v=0
o=- 3782890523576841725 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE audio video data
m=audio 49170 UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=rtpmap:111 opus/48000/2
a=sendrecv
m=video 56607 UDP/TLS/RTP/SAVPF 96
c=IN IP4 0.0.0.0
a=rtpmap:96 VP8/90000
a=sendrecv
m=application 56608 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=sctp-port:5000
This answer says: "I've chosen to use Opus for audio and VP8 for video. I also accept the data channel."
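Notice that although the offer included multiple audio and video codecs, the answer in this example selected just one of each. In practice an answer may keep several codecs, listed in order of preference, but it can only contain codecs that appeared in the offer. This narrowing is a key part of the negotiation: it reduces the options to what will actually be used.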
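The selection step can be pictured as an intersection of two codec lists. The sketch below is a deliberately simplified model, not what a browser actually runs: intersectCodecs is a hypothetical helper, and real implementations also compare format parameters such as the H.264 profile-level-id:

```javascript
// Hypothetical helper: keep the offered codecs that the local side also supports,
// preserving the offerer's order of preference.
function intersectCodecs(offered, local) {
  const key = c => `${c.name.toLowerCase()}/${c.clockRate}`;
  const supported = new Set(local.map(key));
  return offered.filter(c => supported.has(key(c)));
}

const offeredVideo = [
  { name: 'VP8', clockRate: 90000 },
  { name: 'H264', clockRate: 90000 },
  { name: 'VP9', clockRate: 90000 }
];
const localVideo = [{ name: 'VP8', clockRate: 90000 }];
console.log(intersectCodecs(offeredVideo, localVideo).map(c => c.name));
```

If the result is empty for a media type, there is no common codec and that media section cannot be established.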
Step 6: Completing the Negotiation
The answer is sent back to the initiator, which calls peerConnection.setRemoteDescription(answer). Now both peers have agreed on the communication parameters, and media can begin flowing.
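The six steps above move each peer connection through a small set of signaling states, which you can observe through peerConnection.signalingState. The following sketch models the main transitions; nextSignalingState and the TRANSITIONS table are hypothetical names for illustration, and provisional answers, rollback, and the closed state are left out:

```javascript
// Simplified signaling-state transitions (pranswer, rollback, and "closed" omitted).
const TRANSITIONS = {
  'stable': {
    'setLocal:offer': 'have-local-offer',    // initiator, step 2
    'setRemote:offer': 'have-remote-offer'   // receiver, step 4
  },
  'have-local-offer': {
    'setRemote:answer': 'stable'             // initiator, step 6
  },
  'have-remote-offer': {
    'setLocal:answer': 'stable'              // receiver, step 5
  }
};

function nextSignalingState(state, operation, type) {
  const next = (TRANSITIONS[state] || {})[`${operation}:${type}`];
  if (!next) {
    throw new Error(`Invalid transition: ${operation}(${type}) in state "${state}"`);
  }
  return next;
}
```

Many "failed to set remote description" errors come down to calling one of these methods in a state where the table has no entry, for example applying an answer while the connection is still stable.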
Common SDP Modifications
While the WebRTC API handles SDP generation and parsing automatically, there are cases where you might need to modify SDP messages manually. Here are some common scenarios:
Codec Preferences
You might want to prioritize specific codecs for quality or compatibility reasons:
// Modify SDP to prioritize H.264 over VP8
function preferH264(sdp) {
return sdp.replace(
/m=video (\d+) ([A-Z/]+) 96 97/g, // capture the port and protocol so $1 and $2 resolve
'm=video $1 $2 97 96'
);
}
// Usage
peerConnection.createOffer()
.then(offer => {
const modifiedOffer = new RTCSessionDescription({
type: 'offer',
sdp: preferH264(offer.sdp)
});
return peerConnection.setLocalDescription(modifiedOffer);
})
.then(() => {
// Send the modified offer via signaling
signalingChannel.send({
type: 'offer',
sdp: peerConnection.localDescription.sdp // send the SDP string, not the whole description object
});
});
I once worked on a project where we needed to ensure H.264 was used for compatibility with hardware decoders. By modifying the SDP to prioritize H.264, we could influence the negotiation without forcing it (the remote peer could still reject H.264 if unsupported).
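Hard-coding payload types 96 and 97 only works when the browser happens to assign those numbers. A more general sketch looks the payload types up from the a=rtpmap lines first. The helper name preferVideoCodec is hypothetical, and the sketch does not move associated retransmission (RTX) payload types:

```javascript
// Hypothetical helper: move every payload type of the named codec to the front
// of the m=video line, leaving the rest of the SDP untouched.
function preferVideoCodec(sdp, codecName) {
  const lines = sdp.split('\r\n');
  const mIndex = lines.findIndex(line => line.startsWith('m=video'));
  if (mIndex === -1) return sdp;
  // The video section ends at the next m= line, or at the end of the SDP.
  let end = lines.findIndex((line, i) => i > mIndex && line.startsWith('m='));
  if (end === -1) end = lines.length;
  // Collect payload types whose rtpmap names the codec (case-insensitive).
  const preferred = [];
  for (const line of lines.slice(mIndex + 1, end)) {
    const m = line.match(/^a=rtpmap:(\d+) ([^/]+)\//);
    if (m && m[2].toLowerCase() === codecName.toLowerCase()) preferred.push(m[1]);
  }
  if (preferred.length === 0) return sdp; // codec not offered; nothing to reorder
  const [media, port, proto, ...formats] = lines[mIndex].split(' ');
  const others = formats.filter(pt => !preferred.includes(pt));
  lines[mIndex] = [media, port, proto, ...preferred, ...others].join(' ');
  return lines.join('\r\n');
}
```

Where it is available, RTCRtpTransceiver.setCodecPreferences() achieves the same goal without editing strings, which tends to be less fragile than rewriting the SDP by hand.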
Bandwidth Limitations
You can set bandwidth limits for different media types:
function limitBandwidth(sdp) {
// Add a bandwidth limit for video (2000 kbps); b= must follow the c= line when one is present
return sdp.replace(
/(m=video.*\r\n(?:c=.*\r\n)?)/g,
'$1b=AS:2000\r\n'
);
}
This can be crucial for applications that need to work in bandwidth-constrained environments. During a project for a rural telemedicine application, we used SDP bandwidth limitations to ensure video calls could function even on poor connections.
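A reusable version of this idea takes the media kind and the limit as parameters and replaces any limit that is already there. This is a sketch under stated assumptions: setMediaBitrate is a hypothetical helper, the SDP uses CRLF line endings, and browsers have not always interpreted b=AS identically:

```javascript
// Hypothetical helper: set (or replace) the b=AS limit, in kbps, for one media kind.
function setMediaBitrate(sdp, kind, kbps) {
  // Split before every line that starts with "m=" so each section is handled alone.
  const sections = sdp.split(/^(?=m=)/m);
  return sections.map(section => {
    if (!section.startsWith(`m=${kind} `)) return section;
    // Drop any existing b=AS line so repeated calls do not stack limits.
    const lines = section.split('\r\n').filter(line => !line.startsWith('b=AS:'));
    // b= belongs after the c= line when present, otherwise right after m=.
    const cIndex = lines.findIndex(line => line.startsWith('c='));
    lines.splice(cIndex === -1 ? 1 : cIndex + 1, 0, `b=AS:${kbps}`);
    return lines.join('\r\n');
  }).join('');
}
```

For limits that need to change during a call, RTCRtpSender.setParameters() with a maxBitrate on the encoding avoids renegotiation altogether.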
Disabling Features
Sometimes you need to disable certain features:
function disableVideo(sdp) {
// Find the video m-line
const videoMLineIndex = sdp.indexOf('m=video');
if (videoMLineIndex === -1) {
return sdp; // No video m-line found
}
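// Find the next m-line after video; anchoring on the newline avoids matching "m=" inside an attribute value
const nextLineBreak = sdp.indexOf('\nm=', videoMLineIndex);
const nextMLineIndex = nextLineBreak === -1 ? -1 : nextLineBreak + 1;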
// Extract the sections before, during, and after the video m-line
const sdpBefore = sdp.substring(0, videoMLineIndex);
const videoSection = nextMLineIndex === -1
? sdp.substring(videoMLineIndex)
: sdp.substring(videoMLineIndex, nextMLineIndex);
const sdpAfter = nextMLineIndex === -1
? ''
: sdp.substring(nextMLineIndex);
// Modify the video section to reject it
const modifiedVideoSection = videoSection.replace(
/m=video\s+\d+\s+[A-Z/]+\s+/,
'm=video 0 UDP/TLS/RTP/SAVPF '
);
return sdpBefore + modifiedVideoSection + sdpAfter;
}
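This technique can be used for audio-only calls. Because a port of 0 rejects the whole media section, it is a blunt tool for temporary muting; for that, setting track.enabled = false or changing the transceiver's direction is usually simpler.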
Advanced SDP Concepts
As you become more comfortable with SDP, you'll encounter several advanced concepts that are important for robust WebRTC applications.
Simulcast
Simulcast involves sending multiple versions of the same video stream at different qualities, allowing receivers to select the appropriate version based on their bandwidth:
m=video 56607 UDP/TLS/RTP/SAVPF 96
a=rtpmap:96 VP8/90000
a=ssrc:1234567890 cname:user@example.com
a=ssrc:1234567890 msid:stream track
a=ssrc:987654321 cname:user@example.com
a=ssrc:987654321 msid:stream track
a=ssrc-group:SIM 1234567890 987654321
This is particularly useful for multi-party video conferencing, where different participants may have different bandwidth capabilities.
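The ssrc-group:SIM form shown above is, to my understanding, the older convention. With Unified Plan, simulcast layers are normally requested through the sendEncodings option of addTransceiver() and then appear in the SDP as a=rid and a=simulcast lines. The sketch below builds such a list; buildSimulcastEncodings and the bitrate figures are illustrative assumptions, while rid, scaleResolutionDownBy, and maxBitrate are standard encoding parameters:

```javascript
// Hypothetical helper: build one encoding entry per simulcast layer.
function buildSimulcastEncodings(layers) {
  return layers.map(({ rid, scale, maxBitrate }) => ({
    rid,                           // layer identifier, signalled as a=rid in the SDP
    scaleResolutionDownBy: scale,  // 1 = full resolution, 2 = half width and height
    maxBitrate                     // bits per second
  }));
}

// Illustrative values only; tune them for your own application.
const simulcastEncodings = buildSimulcastEncodings([
  { rid: 'low', scale: 4, maxBitrate: 150000 },
  { rid: 'medium', scale: 2, maxBitrate: 500000 },
  { rid: 'high', scale: 1, maxBitrate: 1500000 }
]);

// In a browser, with an existing peerConnection and videoTrack:
// peerConnection.addTransceiver(videoTrack, {
//   direction: 'sendonly',
//   sendEncodings: simulcastEncodings
// });
```

Simulcast like this is mainly useful when a media server sits between the participants and forwards a suitable layer to each receiver.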
Plan B vs. Unified Plan
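WebRTC has gone through significant changes in how it handles multiple media streams. The older "Plan B" approach has been replaced by the standardized "Unified Plan". The sdpSemantics option shown here is a non-standard, Chrome-specific setting, and current Chrome releases no longer support Plan B: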
// Modern approach using Unified Plan (default in recent browsers)
const peerConnection = new RTCPeerConnection({
sdpSemantics: 'unified-plan'
});
// Legacy approach using Plan B (deprecated)
const legacyConnection = new RTCPeerConnection({
sdpSemantics: 'plan-b'
});
The transition between these approaches has been a source of compatibility challenges. I've spent countless hours debugging issues caused by mixing Plan B and Unified Plan implementations.
Renegotiation
When conditions change (adding/removing tracks, changing codecs, etc.), you need to renegotiate the connection:
// When adding a new track
peerConnection.addTrack(newVideoTrack, stream);
// Renegotiate
peerConnection.createOffer()
.then(offer => peerConnection.setLocalDescription(offer))
.then(() => {
// Send the new offer via signaling
signalingChannel.send({
type: 'offer',
sdp: peerConnection.localDescription.sdp // send the SDP string, not the whole description object
});
});
Renegotiation follows the same offer/answer pattern as the initial negotiation but can be trickier to handle correctly, especially with concurrent renegotiations.
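Concurrent renegotiation, where both peers send an offer at about the same time, is commonly handled with the "perfect negotiation" pattern: one peer is designated polite and backs off, the other ignores the colliding offer. The core decision can be sketched as a pure function. The function name and the shape of its argument are assumptions made for illustration; makingOffer is a flag your own code would set while it is creating and sending an offer:

```javascript
// Decide whether an incoming description should be ignored during an offer collision.
// "polite" is a role agreed on out of band: the polite peer yields, the impolite peer wins.
function shouldIgnoreOffer({ polite, makingOffer, signalingState, descriptionType }) {
  const offerCollision =
    descriptionType === 'offer' && (makingOffer || signalingState !== 'stable');
  return !polite && offerCollision;
}

// The impolite peer ignores an offer that arrives while it is sending its own.
console.log(shouldIgnoreOffer({
  polite: false,
  makingOffer: true,
  signalingState: 'have-local-offer',
  descriptionType: 'offer'
}));
```

When the function returns false on the polite side during a collision, that peer applies the remote offer (which implicitly rolls back its own in current browsers) and answers as usual.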
Debugging SDP Issues
SDP-related problems are among the most common and challenging issues in WebRTC development. Here are some approaches I've found effective for debugging:
SDP Visualization Tools
Raw SDP is hard to read, but tools like the WebRTC Internals page in Chrome (chrome://webrtc-internals/) provide visualizations that make it easier to understand what's happening.
Common SDP Problems and Solutions
Here are some issues I've frequently encountered:
- Codec Mismatch: When peers don't share any common codecs for a media type
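Solution: Ensure both peers support at least one common codec. Editing the SDP cannot add a codec that an implementation does not support, so compare what each side reports through RTCRtpSender.getCapabilities() and RTCRtpReceiver.getCapabilities().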
- ICE Candidate Issues: When SDP doesn't contain the right candidates or they're in the wrong format
Solution: Use trickle ICE to separate candidate gathering from SDP exchange, and ensure candidates are properly formatted.
- Security Parameter Mismatch: When DTLS parameters don't align
Solution: Check fingerprint and setup attributes in the SDP.
- Direction Mismatch: When one peer expects to send media but the other isn't expecting to receive it
Solution: Ensure media directions (sendrecv, sendonly, recvonly, inactive) are compatible.
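The direction rule in the last item can be checked mechanically. Under the offer/answer rules, each offered direction permits only certain answer directions. The table and helper below are a sketch with hypothetical names:

```javascript
// Answer directions that are valid for each offered direction.
const ALLOWED_ANSWER_DIRECTIONS = {
  sendrecv: ['sendrecv', 'sendonly', 'recvonly', 'inactive'],
  sendonly: ['recvonly', 'inactive'],
  recvonly: ['sendonly', 'inactive'],
  inactive: ['inactive']
};

// Hypothetical helper: true when the answer's direction is valid for the offer's.
function isDirectionCompatible(offerDirection, answerDirection) {
  const allowed = ALLOWED_ANSWER_DIRECTIONS[offerDirection] || [];
  return allowed.includes(answerDirection);
}

console.log(isDirectionCompatible('sendonly', 'recvonly')); // true
console.log(isDirectionCompatible('sendonly', 'sendonly')); // false
```

A valid pairing can still be the wrong one for your application: sendonly answered with inactive is legal, but no media will flow.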
Logging and Analysis
Implementing detailed logging of SDP messages can be invaluable:
function logSdp(description, sdp) {
console.log(`${description} SDP:`);
// Split before each line that starts with "m=": the first part is session-level info, the rest are media sections
const [sessionPart, ...mediaSections] = sdp.split(/^(?=m=)/m);
console.log('Session-level info:', sessionPart);
// Log each media section
if (mediaSections) {
mediaSections.forEach((section, index) => {
console.log(`Media section ${index} (${section.startsWith('m=audio') ? 'audio' : section.startsWith('m=video') ? 'video' : 'data'}):`);
console.log(section);
// Extract key information
const codecs = section.match(/a=rtpmap:(\d+) ([a-zA-Z0-9-]+)\/(\d+)/g);
if (codecs) {
console.log('Codecs:', codecs.map(c => c.match(/a=rtpmap:(\d+) ([a-zA-Z0-9-]+)\/(\d+)/)).map(m => `${m[2]} (ID: ${m[1]}, Clock: ${m[3]})`));
}
const direction = section.match(/a=(sendrecv|sendonly|recvonly|inactive)/);
if (direction) {
console.log('Direction:', direction[1]);
}
});
}
}
// Usage
peerConnection.createOffer()
.then(offer => {
logSdp('Offer', offer.sdp);
return peerConnection.setLocalDescription(offer);
});
This kind of structured logging has helped me identify subtle issues that would be nearly impossible to spot in raw SDP.
The Future of SDP in WebRTC
The WebRTC standard continues to evolve, and SDP's role is changing:
SDP Object Model
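The WebRTC community has recognized that manipulating raw SDP strings is error-prone, and APIs such as RTCRtpTransceiver.setCodecPreferences() and RTCRtpSender.setParameters() already cover some cases that once required string edits. A fuller object model for session descriptions has been discussed as a more structured way to work with them; the sketch below is purely conceptual and does not correspond to any shipped API: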
// Future API (conceptual example)
const capabilities = {
audio: {
codecs: [
{ name: 'opus', clockRate: 48000, channels: 2, priority: 1 }
],
direction: 'sendrecv'
},
video: {
codecs: [
{ name: 'VP8', clockRate: 90000, priority: 1 },
{ name: 'H264', clockRate: 90000, priority: 2 }
],
direction: 'sendrecv'
}
};
peerConnection.createOffer({ capabilities })
.then(offer => peerConnection.setLocalDescription(offer));
This would make SDP manipulation safer and more intuitive.
WebRTC-SVC
Scalable Video Coding (SVC) is gaining support in WebRTC, allowing for more efficient adaptation to changing network conditions:
m=video 56607 UDP/TLS/RTP/SAVPF 96
a=rtpmap:96 VP9/90000
a=fmtp:96 profile-id=2;max-fs=3600;max-fr=30
a=ssrc:1234567890 cname:user@example.com
a=ssrc:1234567890 msid:stream track
a=rid:high send
a=rid:medium send
a=rid:low send
a=simulcast:send high;medium;low
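SVC embeds multiple quality layers within a single encoded stream, offering more flexibility than traditional simulcast. Note that the a=rid and a=simulcast lines in the example above are the attributes used to signal simulcast layers; SVC layering itself is requested through the scalabilityMode encoding parameter defined by the WebRTC-SVC extension rather than through dedicated SDP attributes.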
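In the WebRTC-SVC extension, the layer structure is named by a scalabilityMode string such as 'L3T3' (three spatial layers, three temporal layers) set on a sender's encoding. The following sketch decodes the common forms of that string. parseScalabilityMode is a hypothetical helper, and the pattern it accepts is my reading of the mode names, not an authoritative grammar:

```javascript
// Hypothetical helper: decode a scalabilityMode string such as "L3T3" or "L2T1_KEY".
function parseScalabilityMode(mode) {
  // L or S prefix, spatial layer count, T, temporal layer count, optional suffixes.
  const m = /^([LS])(\d)T(\d)(h)?(_KEY(_SHIFT)?)?$/.exec(mode);
  if (!m) return null;
  return { spatialLayers: Number(m[2]), temporalLayers: Number(m[3]) };
}

console.log(parseScalabilityMode('L3T3')); // { spatialLayers: 3, temporalLayers: 3 }

// In a browser that implements the extension, with an existing peerConnection and videoTrack:
// peerConnection.addTransceiver(videoTrack, {
//   direction: 'sendonly',
//   sendEncodings: [{ scalabilityMode: 'L3T3' }]
// });
```

Support varies by browser and codec, so treat an unsupported mode as a normal condition and fall back to a single-layer encoding.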
WebTransport Integration
As WebTransport emerges as a complementary technology to WebRTC, we may see changes in how session descriptions are handled for these new transport mechanisms.
Beyond the Technical: The Human Impact of SDP
While SDP is highly technical, its impact extends beyond the code. The negotiation it enables has real implications for user experience:
Accessibility Considerations
The codec and quality negotiations facilitated by SDP directly affect accessibility. For users with hearing impairments, high-quality audio codecs with good noise cancellation can make the difference between understanding a conversation and missing critical information. For users with visual impairments who rely heavily on audio cues, maintaining audio connections even when video fails is essential.
Global Connectivity
SDP's flexibility allows WebRTC to adapt to widely varying network conditions around the world. During a project connecting rural healthcare workers in Southeast Asia, I saw firsthand how the negotiation process would automatically select lower-bandwidth codecs in areas with poor connectivity, ensuring that communication remained possible even in challenging conditions.
Energy Efficiency
The negotiation process also has implications for device battery life. By selecting appropriate codecs and resolutions based on device capabilities, SDP helps optimize power consumption—a critical consideration for mobile WebRTC applications.
The Art of SDP Mastery
After years of working with WebRTC, I've come to see SDP not just as a technical necessity but as an art form. The ability to craft and manipulate session descriptions effectively is what separates basic WebRTC implementations from truly robust, adaptable applications.
When I mentor new WebRTC developers, I always emphasize the importance of understanding SDP. While it's tempting to treat it as a black box, the developers who take the time to master SDP gain a superpower—the ability to diagnose and solve problems that leave others mystified.
As you continue your WebRTC journey, I encourage you to embrace SDP rather than avoid it. Look at the SDP messages your application generates. Experiment with modifications. Use the debugging tools available. The investment will pay dividends in the form of more reliable, flexible, and powerful real-time applications.
In our next article, we'll explore a crucial aspect of WebRTC security: DTLS and SRTP. We'll see how these protocols ensure that your WebRTC communications remain private and protected, even when traversing the public internet.
---
This article is part of our WebRTC Essentials series, where we explore the technologies that power modern real-time communication. Join us in the next installment as we dive into DTLS and SRTP in WebRTC.
