VoIP protocols, part 1
When we talk about VoIP, the most prominent and widely used protocol is Session Initiation Protocol (SIP). But SIP doesn’t operate alone. There are many other protocols involved in IP telephony that function alongside SIP or even in place of it. It is important for a telecom engineer to be familiar with these protocols, to know what they do and how they can be leveraged on the telecommunications network.
In this article, we’ll look at the companion protocols that work in conjunction with SIP. In a future article, we will focus on protocols that can be used as alternatives to SIP.
Quick review of SIP
To see where these protocols fit in, let’s quickly review SIP and how it works.
SIP is a protocol used by both voice and video endpoints to provide call setup, signaling, and call teardown for communication sessions. A VoIP endpoint can be a physical IP phone, a softphone, a videoconferencing terminal, a voice gateway, or even a gaming console – essentially any device capable of enabling voice and/or video communication. These endpoints register with a SIP server, often an IP PBX, which coordinates features such as call transfer, call hold, music on hold, and other traditional and enhanced telephony features.
To enable even more specialized services, such as call routing, call queuing, Interactive Voice Response, multiparty voice and video conferencing, integration with web services, and interconnectivity with the traditional PSTN, additional servers and devices can be added to a VoIP network.
Keep in mind that SIP does not carry the voice or video itself. Instead, it operates in conjunction with several other protocols that carry the session media. Let’s look at these in more detail.
Companion protocols that work in conjunction with SIP
The following diagram shows the various protocols that are used in conjunction with SIP in a typical VoIP telephony conversation. These protocols are mapped to their respective layers within the OSI model framework.
Strictly speaking, the OSI model has no official place for SIP or the companion protocols we’re reviewing here (SDP, RTP, and RTCP), so they don’t map cleanly onto it. But because the OSI model is so widely used, we can say they sit on top of the Transport Layer. This means that SIP is independent of the protocols used at lower layers, so it can work with different Transport Layer protocols. UDP is usually employed, although if SIP messaging is secured using Transport Layer Security (TLS), then TCP is utilized for the SIP signaling.
Session Description Protocol (SDP) – While SIP is used for VoIP endpoints to exchange signaling information, SDP is employed to describe multimedia sessions. Specifically, it enables endpoints to negotiate the media type, format, and all of its associated properties, such as the codecs offered and the IP address and port on which media should be received. SDP does not carry the media itself, nor is it sent in a Transport Layer session of its own. Rather, it travels as the payload (the message body) of the SIP messages themselves.
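To make the relationship between SIP and SDP concrete, here is a minimal Python sketch that splits a SIP message into its headers and its SDP body, then reads the media details out of the SDP. The INVITE is abbreviated and entirely hypothetical (the addresses, tags, and Call-ID are made up, and headers such as Content-Length are omitted), and the function names are ours for illustration, not part of any SIP library.

```python
# Hypothetical, abbreviated SIP INVITE carrying an SDP offer as its body.
SIP_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@192.0.2.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0 8\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
)


def split_sip_message(message):
    """Return (start_line, headers, body); a blank line separates headers from body."""
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def parse_sdp_audio(body):
    """Extract the connection address, audio port, and codec map from an SDP body."""
    result = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for line in body.splitlines():
        if line.startswith("c="):
            # c=<nettype> <addrtype> <address>
            result["address"] = line.split()[-1]
        elif line.startswith("m=audio"):
            # m=<media> <port> <proto> <payload types...>
            fields = line[2:].split()
            result["port"] = int(fields[1])
            result["payload_types"] = [int(pt) for pt in fields[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            result["codecs"][int(pt)] = encoding
    return result


start_line, headers, body = split_sip_message(SIP_INVITE)
media = parse_sdp_audio(body)
print(start_line)
print(headers["content-type"], media)
```

In this sketch the caller is offering to receive audio on 192.0.2.10, port 49170, using either G.711 variant (PCMU or PCMA). A production parser would also need to handle header folding, multiple media lines, and per-media connection addresses.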
Real-time Transport Protocol (RTP) – RTP carries the actual media streams, whether voice, video or both. While SIP establishes sessions between endpoints, RTP transports the actual voice packets between the addresses and ports that were negotiated. Keep in mind that unlike SDP, which is a payload of SIP, RTP sessions run independently of and in parallel to SIP sessions, using the parameters agreed through the SIP and SDP exchange. RTP typically runs over UDP and normally functions in conjunction with RTCP.
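As an illustration of what an RTP packet looks like on the wire, the following Python sketch builds and parses the 12-byte fixed RTP header (version, marker, payload type, sequence number, timestamp, and SSRC). The helper names are hypothetical, and the sketch assumes no padding, no header extension, and no CSRC entries when building a packet.

```python
import struct

# Fixed RTP header: 2 single-byte fields, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
RTP_HEADER = struct.Struct("!BBHII")


def build_rtp_packet(payload_type, sequence, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet with version 2 and no padding, extension, or CSRCs."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = RTP_HEADER.pack(
        byte0, byte1, sequence & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF
    )
    return header + payload


def parse_rtp_packet(packet):
    """Parse the fixed header; header extensions are not handled in this sketch."""
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    csrc_count = byte0 & 0x0F
    payload_offset = RTP_HEADER.size + 4 * csrc_count
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[payload_offset:],
    }


# 20 ms of G.711 audio is 160 bytes; payload type 0 is PCMU.
packet = build_rtp_packet(0, 1, 160, 0x12345678, b"\x00" * 160)
print(len(packet), parse_rtp_packet(packet)["sequence"])
```

The sequence number lets the receiver detect loss and reordering, and the timestamp lets it play samples out at the right moment, which is why RTP can tolerate running over an unreliable transport.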
RTP Control Protocol (RTCP) – RTCP is often referred to as the sister protocol of RTP. RTCP carries out-of-band statistics and control information for RTP sessions. Out-of-band just means that the exchange of this information occurs in a separate, parallel session rather than within the RTP media stream. The purpose of RTCP is to provide feedback on the Quality of Service (QoS), including statistics such as packet counts, packet loss, jitter, and round-trip delay. This information is shared between endpoints, which can react to changing conditions by limiting packet flows or by switching to another available codec. Some IP phone models can display network statistics either within an embedded web interface on the phone itself, or on the phone’s LCD screen. Information such as codec, jitter, received packets, and lost packets can be tracked. This data is all collected using RTCP. Like RTP, RTCP typically uses UDP as its Transport Layer protocol.
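The jitter figure reported in RTCP receiver reports is a running estimate rather than a raw measurement. The Python sketch below follows the interarrival jitter formula from the RTP specification (RFC 3550): for each packet, compare its transit time with the previous packet's transit time and move the estimate one sixteenth of the way toward that difference. The function name and the sample values are ours, and both times are assumed to be expressed in RTP timestamp units.

```python
def interarrival_jitter(packets):
    """Estimate jitter from (rtp_timestamp, arrival_time) pairs.

    Both values must be in the same units (RTP timestamp units here).
    """
    jitter = 0.0
    previous_transit = None
    for rtp_timestamp, arrival_time in packets:
        transit = arrival_time - rtp_timestamp
        if previous_transit is not None:
            difference = abs(transit - previous_transit)
            # Smooth with a gain of 1/16 so a single late packet
            # does not dominate the reported value.
            jitter += (difference - jitter) / 16.0
        previous_transit = transit
    return jitter


# Evenly paced packets: transit time never changes, so jitter stays at zero.
steady = [(0, 1000), (160, 1160), (320, 1320)]
# The third packet arrives 16 units late.
delayed = [(0, 1000), (160, 1160), (320, 1336)]
print(interarrival_jitter(steady), interarrival_jitter(delayed))
```

For 8 kHz audio, one timestamp unit is 1/8000 of a second, so an estimate of 8 units corresponds to roughly 1 ms of jitter.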
Transmission Control Protocol/User Datagram Protocol (TCP/UDP) – Unless you explicitly configure them otherwise, SIP, RTP, and RTCP typically use UDP as the underlying Transport Layer protocol. UDP has lower overhead than TCP because it has no flow control, retransmission, or packet ordering mechanisms (it offers only a simple checksum), and thus is more suitable for transporting media. A steady flow of data is much more important than a correct ordering of packets or perfect packet arrival, because voice and video are very forgiving of such irregularities up to a certain level. Half a lost syllable or an incorrect pixel on a single frame is of little consequence in a voice or video conversation, whereas waiting for a retransmitted packet would stall the stream. This is why, if TCP is ever used, it is for signaling protocols and only very rarely for media-carrying protocols such as RTP. Instead, the media will be carried on UDP, even when TCP is used to carry the signaling packets.
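To put a number on the header overhead, the short Python calculation below estimates the IP-layer bandwidth of a single one-way voice stream. It assumes G.711 (64 kbps) with 20 ms of audio per packet, IPv4 without options, and the minimum header size for each protocol; the TCP figure is purely hypothetical, for comparison, and ignores any extra framing. The function name is ours.

```python
IPV4_HEADER = 20  # bytes, without options
UDP_HEADER = 8    # bytes
TCP_HEADER = 20   # bytes, minimum, without options
RTP_HEADER = 12   # bytes, fixed header without CSRCs or extensions


def stream_bandwidth_bps(codec_bitrate_bps, packet_ms, transport_header):
    """Return the IP-layer bit rate of one media stream in one direction."""
    payload_bytes = codec_bitrate_bps * packet_ms // 8000
    packets_per_second = 1000 // packet_ms
    packet_bytes = IPV4_HEADER + transport_header + RTP_HEADER + payload_bytes
    return packet_bytes * 8 * packets_per_second


over_udp = stream_bandwidth_bps(64000, 20, UDP_HEADER)
over_tcp = stream_bandwidth_bps(64000, 20, TCP_HEADER)
print(over_udp, over_tcp)  # 80000 84800
```

Under these assumptions a 64 kbps codec occupies 80 kbps on the wire over UDP, before any Layer 2 framing is added. The extra header bytes TCP would add are modest; the larger cost for real-time media is the delay introduced by retransmission and in-order delivery.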
The following diagram puts these protocols into some perspective within the framework of a VoIP voice call between two endpoints:
Conclusion
SIP is a revolutionary protocol that delivers phenomenal flexibility and capabilities to voice and video communications. Nevertheless, it does not function alone. Knowing and understanding its companion protocols is essential to many aspects of network management such as making procurement decisions, troubleshooting, or network optimization.