Jitter is one of the most common causes of substandard voice communications over IP networks. This article provides an overview of jitter and how it influences voice applications, as well as methods for resolving it on your networks.
Jitter defined
Jitter is technically described as “the deviation from true periodicity of a presumably periodic event.” For example, imagine you are asked to beat a drum every second. If you are able to beat that drum at intervals of exactly one second, then the beating of the drum is experiencing zero jitter. Now imagine that occasionally, you beat the drum several milliseconds too early, or too late. In such a case, you are “deviating from true periodicity,” or in other words, experiencing jitter. The greater the deviation of each beat from exactly one-second intervals, the greater the jitter.
Jitter within the context of voice networks
In a VoIP conversation, voice packets are exchanged between IP telephony endpoints. When using G.711, one of the most common VoIP codecs, a voice packet is sent every 20 milliseconds. Other codecs send packets out at slightly different intervals. For G.711, ideally, these voice packets should arrive at the destination device constantly and consistently, 20 milliseconds apart. However, this is rarely the case. Because each packet is routed individually and independently through the network infrastructure, a certain amount of jitter is expected in the arrival times of the packets. The following diagrams illustrate this:
When there is no jitter, the interval X between consecutive packets consistently equals 20ms. When there is jitter, however, the successive intervals X, Y, and Z differ from one another.
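The difference between the two cases can be sketched in a few lines of Python. The arrival times below are hypothetical values chosen for illustration, and `inter_arrival_gaps` is a helper name invented for this example.

```python
def inter_arrival_gaps(arrivals_ms):
    """Return the gap, in ms, between each packet and the one before it."""
    return [later - earlier for earlier, later in zip(arrivals_ms, arrivals_ms[1:])]


# Hypothetical arrival times in milliseconds (not measured data).
no_jitter = [0, 20, 40, 60]
with_jitter = [0, 23, 38, 64]

print(inter_arrival_gaps(no_jitter))    # [20, 20, 20]
print(inter_arrival_gaps(with_jitter))  # [23, 15, 26]
```

In the first list every gap is 20ms. In the second, the three gaps all differ even though the packets were sent 20ms apart.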
How jitter is measured
Jitter on packet networks is also called packet delay variation. As such, jitter is indicated in units of time, and measured as the difference between the expected arrival of a packet and its actual arrival. Jitter is gauged on a per-packet basis, but the jitter measured on any single packet is not a helpful piece of information. Useful measurements of jitter for evaluating the quality of voice include average jitter and maximum jitter over the course of a voice conversation.
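As a rough sketch of the measurement described above, the following Python computes per-packet, average, and maximum jitter from a list of arrival times. It assumes the first packet is the timing reference and that packets are sent every 20ms. The function name and sample values are hypothetical, and real monitoring tools may calculate jitter differently.

```python
from statistics import mean


def jitter_stats(arrivals_ms, interval_ms=20):
    """Summarize jitter for packets that were sent interval_ms apart.

    Assumption: the first packet is the timing reference, so each later
    packet is expected at first_arrival + n * interval_ms.
    """
    first = arrivals_ms[0]
    per_packet = [
        abs(actual - (first + n * interval_ms))
        for n, actual in enumerate(arrivals_ms)
        if n > 0  # skip the reference packet, whose deviation is always 0
    ]
    return {
        "per_packet": per_packet,
        "average": mean(per_packet),
        "maximum": max(per_packet),
    }


# Hypothetical arrival times in milliseconds (not measured data).
stats = jitter_stats([0, 23, 38, 64, 80])
print(stats["per_packet"])  # [3, 2, 4, 0]
print(stats["average"])     # 2.25
print(stats["maximum"])     # 4
```

The per-packet list on its own says little. The average and maximum are the figures worth comparing against a quality target.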
What jitter does to voice
Voice is an extremely time-sensitive application and requires a constant and consistent flow of packets to function correctly. When extensive jitter is experienced in a voice conversation, users will hear voice drop-outs and clipped words. The result is much the same as packet loss, but there are no actual lost packets. The packets are just delayed to an extent that they are too late to be incorporated into the reconstructed sound by the receiving device, thus leaving gaps in the sound.
According to most VoIP equipment vendors, an average jitter of 30ms is the maximum acceptable level for voice. Conversations generally remain understandable when average jitter is above 30ms but below 100ms, yet that level of quality is unacceptable for production voice networks.
How to minimize jitter
There are various ways to mitigate jitter on a voice network.
Solid network design – When a network is designed correctly for the expected traffic and applications that will be using the infrastructure, the phenomena that cause jitter will be minimized. Appropriate levels of bandwidth oversubscription, inherent redundancy, and the implementation of load balancing are all aspects of the design that contribute to an acceptably low level of jitter for all network applications.
Quality of Service – The importance of QoS in mitigating phenomena like jitter cannot be overstated. No matter how well a network is designed, there are times when network traffic congestion will occur. When it does, QoS mechanisms should be configured to ensure that voice packets are given the appropriate priority, thus minimizing jitter. Remember to employ QoS at both Layer 2 (switch trunks) and Layer 3 (routed) links.
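One way an application can ask for priority at Layer 3 is to mark its own packets with a DSCP value. The sketch below uses DSCP 46 (Expedited Forwarding), a marking commonly used for voice. That value is an assumption of this example rather than something the article prescribes, and the helper names are hypothetical. Marking only requests priority: switches and routers must still be configured to honor it, and some operating systems ignore or restrict the `IP_TOS` socket option.

```python
import socket

# Assumption: Expedited Forwarding (DSCP 46) is the class used for voice.
DSCP_EF = 46


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the IP TOS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets carry the given DSCP value."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(DSCP_EF)))  # 0xb8
```

On dedicated VoIP phones and gateways this marking is normally a configuration setting rather than application code.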
Employ jitter buffers – Even with solid network design and correctly employed QoS features, there is still a possibility of experiencing some level of jitter. For this reason, most VoIP endpoints employ what is known as a jitter buffer, sometimes called a de-jitter buffer. A jitter buffer is a type of memory found within the VoIP endpoint that receives, stores, and resends voice packets to the voice processor. It collects the voice packets at whatever level of jitter they arrive with and then slightly delays their forwarding so that they are sent to the voice processor at evenly spaced intervals, thus reducing jitter to zero. The following image illustrates this process:
As is the case with most things, you cannot get something for nothing. Jitter buffers will eliminate jitter, but will also introduce a delay in the playback of the voice that is sent to a device. A jitter buffer can only compensate for jitter that is less than or equal to the delay it introduces. So to eliminate jitter of up to 50ms, for example, the delay the user will hear in the incoming voice must be at least 50ms. This is in addition to the propagation delay that is already experienced, that is, the travel time of the packets from one endpoint to the other, which can be on the order of 100ms or more. Any total delay exceeding 150ms becomes noticeable and somewhat bothersome in voice conversations.
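The trade-off is simple arithmetic, sketched below using the figures quoted above. The function names are hypothetical, and the model deliberately ignores other sources of delay, such as encoding and packetization.

```python
NOTICEABLE_DELAY_MS = 150  # threshold quoted in the article


def total_delay_ms(propagation_ms, jitter_buffer_ms):
    """Delay the listener experiences: travel time plus buffering time."""
    return propagation_ms + jitter_buffer_ms


def is_noticeable(propagation_ms, jitter_buffer_ms):
    """True when the total delay exceeds the 150 ms threshold."""
    return total_delay_ms(propagation_ms, jitter_buffer_ms) > NOTICEABLE_DELAY_MS


def largest_unnoticed_buffer_ms(propagation_ms):
    """Largest jitter buffer that keeps the total at or below the threshold."""
    return max(0, NOTICEABLE_DELAY_MS - propagation_ms)


print(total_delay_ms(100, 50))           # 150
print(is_noticeable(100, 50))            # False (exactly at the threshold)
print(is_noticeable(130, 30))            # True  (160 ms in total)
print(largest_unnoticed_buffer_ms(100))  # 50
```

With 100ms of propagation delay, a 50ms buffer already uses up the entire budget, which is why larger buffers are not a free fix for a jittery network.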
Jitter buffers, whose size can be configured on most VoIP endpoints, should be set up with a delay of around 30ms, which, as stated above, is the maximum acceptable jitter you should see on any correctly designed network. Any packet whose jitter exceeds this value cannot be compensated for and is discarded, just as if it had been lost, but in most cases this affects only a small percentage of packets and is thus largely imperceptible to the user.
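The behavior of a fixed jitter buffer can be sketched as follows. This is a simplified model, not how any particular endpoint implements it: it assumes the first packet is the timing reference, that packets are listed in sequence order, and that the buffer delay never adapts. The function name and arrival times are hypothetical.

```python
def play_out(arrivals_ms, interval_ms=20, buffer_ms=30):
    """Schedule packets for evenly spaced playout through a fixed jitter buffer.

    Packet n is played at first_arrival + buffer_ms + n * interval_ms.
    A packet that arrives after its playout time is discarded.
    """
    first = arrivals_ms[0]
    played, discarded = [], []
    for seq, arrival in enumerate(arrivals_ms):
        playout_time = first + buffer_ms + seq * interval_ms
        if arrival <= playout_time:
            played.append((seq, playout_time))
        else:
            discarded.append(seq)  # too late to be included in the audio
    return played, discarded


# Hypothetical arrival times in ms; packet 3 was expected at 60 but came at 95.
played, discarded = play_out([0, 25, 38, 95, 96, 100])
print(played)     # [(0, 30), (1, 50), (2, 70), (4, 110), (5, 130)]
print(discarded)  # [3]
```

The packets that are played leave the buffer on an even 20ms grid, each 30ms later than its ideal arrival time. Packet 3 arrived 35ms late, which is more than the buffer can absorb, so it leaves a gap in the audio.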
Conclusion
Jitter is a phenomenon that can be devastating to any time-sensitive network application, including voice. With a combination of good network design, properly configured QoS features, and appropriate jitter buffer settings on end devices, virtually all instances of jitter can be eliminated from the voice network.