Market trends continue to favor cloud-based UC deployments over traditional on-site call control, but users still demand PSTN-like availability, reliability, and service quality. Many organizations that move to the cloud quickly realize that the move does not automatically guarantee better call quality: the usual culprits behind voice and video quality issues are still very much at play.
In this article, we explore the issues that lead to poor VoIP and UC communications quality in cloud-based deployments and how specific technologies and devices can be used to ensure consistently high-quality services.
Why call quality changes when UC moves to the cloud
There are many operational advantages to cloud-based UC and VoIP services, such as a high degree of user mobility, simplicity of deployment, and ease of scalability. Moving from a more traditional LAN-based communication system to a cloud-based VoIP or UCaaS solution provides benefits, but also challenges that are not immediately apparent.
End devices must register with a server that is physically much farther away. Transmissions must therefore traverse more hops, more infrastructure, and multiple providers, which introduces more points of failure and reduces end-to-end visibility. This can exacerbate the traditional culprits behind bad call quality and make troubleshooting much harder in cloud environments. Traditional troubleshooting strategies don’t scale easily in this new paradigm because traffic necessarily traverses several third-party networks.
Why voice suffers more than video
Although the cloud delivers many innovations, it is still subject to the same network phenomena that cause poor call quality. These are latency, jitter, and packet loss. That said, although these factors affect both voice and video, they have a greater impact on the intelligibility of voice communications than on the video component of UC. Video is much more forgiving for two reasons.
First, video in most communications is a “nice-to-have” component. It is not critical for getting the message across, because the spoken words carry the information. Video adds facial expressions and body language, which are valuable but not vital for understanding. This is why poor video quality is often tolerable to participants: communication can still take place as long as the voice is heard loud and clear, even with occasional choppy video or frozen frames.
Second, the technical nature of video is such that it can withstand greater network instability. Minor frame loss may go unnoticed by participants in mild cases and only becomes disruptive in more severe cases. Video can also cope with these conditions and preserve visual contact (albeit at a lower quality) by concealing or interpolating the missing information and by automatically reducing the resolution and frame rate. Voice can also trade quality for resilience, but with significantly less headroom, as speech becomes unintelligible much more quickly.
Although voice quality is much more susceptible to network degradation than video quality, video quality in UCaaS systems still matters. The difference does, however, give us a deeper understanding of how cloud-based VoIP and UCaaS systems behave and of the tolerances each medium requires. It also allows us to resolve such issues more precisely.
The three main causes of poor call quality
Let’s take a closer look at each of the three phenomena introduced above: latency, jitter, and packet loss.
Latency is simply a measure of how long it takes a bit of data to travel across the network from source to destination. In the context of UC and VoIP, it refers to the time it takes for a packet to travel from the speaker’s device to the listener’s device. Depending on the type of communication, the path travelled may be from the source, through the cloud-based server, to the destination device, or it may be directly from the source to the destination. In either case, because the source, destination, and cloud server can potentially be anywhere in the world, and the communication path typically travels over the internet as well as other local networks, latency may simply be a function of physical distance. Latency is also affected by network congestion, which may occur anywhere along the communication path.
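To get a feel for how much latency distance alone contributes, the short sketch below estimates a lower bound on one-way propagation delay. It assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and ignores routing detours, queuing, and processing time. The function name and the path lengths are illustrative, not measurements.

```python
# Rough lower bound on one-way propagation delay over fiber.
# Assumption: signals travel at roughly 200,000 km/s in optical fiber.
FIBER_SPEED_KM_PER_S = 200_000.0


def propagation_delay_ms(path_km: float) -> float:
    """Return the minimum one-way delay in milliseconds for a fiber path."""
    if path_km < 0:
        raise ValueError("path length must be non-negative")
    return path_km / FIBER_SPEED_KM_PER_S * 1000.0


if __name__ == "__main__":
    # Illustrative path lengths, not measurements.
    for km in (100, 2_000, 10_000):
        delay = propagation_delay_ms(km)
        print(f"{km:>6} km -> at least {delay:.1f} ms one way")
```

Real paths are longer than the straight-line distance and add queuing delay at every hop, so measured latency will be higher than this bound.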
Jitter is technically described as “the deviation from true periodicity of a presumably periodic event.” In networking, it describes the variation in packet arrival times when packets should be arriving at consistent intervals. Jitter is a natural phenomenon that occurs with all packetized data and is typically dealt with using buffers on the receiver device. If it becomes too pronounced and exceeds the capabilities of the buffer, it affects the sound of the voice, making it sound metallic, echoey, or choppy with gaps. For UC systems, it can also cause audio-video desynchronization.
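As an illustration of how jitter can be quantified, the sketch below follows the running interarrival jitter estimate described in RFC 3550 for RTP: each new packet nudges the estimate by one sixteenth of the difference between the latest transit-time change and the current estimate. The function name and the input format (pairs of send and arrival times in milliseconds) are assumptions made for this example.

```python
from typing import Iterable, Tuple


def interarrival_jitter_ms(packets: Iterable[Tuple[float, float]]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    Each item is (send_time_ms, arrival_time_ms) for one packet,
    listed in arrival order.
    """
    jitter = 0.0
    prev_transit = None
    for sent_ms, arrived_ms in packets:
        transit = arrived_ms - sent_ms
        if prev_transit is not None:
            # Change in transit time between consecutive packets.
            d = abs(transit - prev_transit)
            # Smooth the estimate with a gain of 1/16.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter
```

Because only differences in transit time are used, the sender and receiver clocks do not need to be synchronized; a constant offset between them cancels out.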
Packet loss occurs when voice or video packets never reach their destination, resulting in missing pieces of audio or video data. In VoIP and UC, whether on the cloud or not, this can cause speech to sound clipped, distorted, or robotic, and it may lead to missing syllables or brief silences. Even small amounts of packet loss can significantly impact call quality because real-time media has little ability to retransmit lost data without introducing significant delay.
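A simple way to estimate packet loss on the receiving side is to compare the sequence numbers that arrived with the range that should have arrived. The sketch below does this under simplifying assumptions: sequence numbers increase by one per packet, and the 16-bit wrap-around used by RTP is ignored. The function name is illustrative.

```python
from typing import Iterable


def packet_loss_percent(received_seq: Iterable[int]) -> float:
    """Estimate loss from the sequence numbers that actually arrived.

    Assumes numbers increase by one per packet and do not wrap around.
    Duplicates and reordering are tolerated because a set is used.
    """
    seen = set(received_seq)
    if not seen:
        return 0.0
    expected = max(seen) - min(seen) + 1
    lost = expected - len(seen)
    return 100.0 * lost / expected
```

For example, if a stream carries 50 packets per second (an assumed 20 ms packetization interval), a result of 1% corresponds to roughly one lost packet every two seconds.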
The following tables summarize commonly cited thresholds for these phenomena for both VoIP and UC video.
Thresholds for audio (VoIP)
| Metric | Excellent | Acceptable | Poor / noticeable issues |
| --- | --- | --- | --- |
| Latency (one-way) | < 150 ms | 150-300 ms | > 300 ms |
| Jitter | < 20 ms | 20-30 ms | > 30 ms |
| Packet loss | < 0.5% | 0.5-1% | > 1-2% |
| MOS (mean opinion score*) | 4.0-4.5 | 3.5-4.0 | < 3.5 |
* MOS (mean opinion score) rates perceived voice quality on a scale from 1 (bad) to 5 (excellent). The achievable score also depends on the codec used.
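The audio thresholds above can be turned into a simple rating check. The sketch below is one possible encoding, not a standard algorithm: it treats the lower end of each “poor” range as the cutoff (for example, packet loss above 1% counts as poor) and rates a call by its worst metric. All names are hypothetical.

```python
# Audio thresholds taken from the table above.
# metric: (upper bound for "excellent", upper bound for "acceptable")
AUDIO_THRESHOLDS = {
    "latency_ms": (150.0, 300.0),
    "jitter_ms": (20.0, 30.0),
    "packet_loss_pct": (0.5, 1.0),
}

RATING_ORDER = ["excellent", "acceptable", "poor"]


def rate_metric(metric: str, value: float) -> str:
    """Rate a single measurement against the audio thresholds."""
    excellent_below, acceptable_up_to = AUDIO_THRESHOLDS[metric]
    if value < excellent_below:
        return "excellent"
    if value <= acceptable_up_to:
        return "acceptable"
    return "poor"


def rate_call(measurements: dict) -> str:
    """Overall rating is the worst rating among the individual metrics."""
    ratings = [rate_metric(name, value) for name, value in measurements.items()]
    return max(ratings, key=RATING_ORDER.index)


if __name__ == "__main__":
    sample = {"latency_ms": 80, "jitter_ms": 25, "packet_loss_pct": 0.2}
    print(rate_call(sample))
```

Rating by the worst metric reflects the fact that a single bad dimension, such as high packet loss, is enough to ruin a call even when latency and jitter look healthy.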
Thresholds for video
| Metric | Excellent | Acceptable | Poor / noticeable issues |
| --- | --- | --- | --- |
| Latency (one-way) | < 150 ms | 150-400 ms | > 400 ms |
| Jitter | < 30 ms | 30-50 ms | > 50 ms |
| Packet loss | < 1% | 1-3% | > 3-5% |
| Frame loss | Minimal | Some tolerated | Frequent freezes |
How to diagnose voice issues in the cloud
Diagnosing voice and video quality issues in the cloud starts with one essential requirement: visibility. Cloud-based VoIP and UC communications span multiple third-party networks, providers, and geographic regions, making it harder to pinpoint where problems originate. This is why having end-to-end visibility, from the core network to the end user’s device, is more important than ever. High visibility can be achieved using a suitable network monitoring system that focuses on the aforementioned metrics that impact user experience rather than just basic uptime.
Combining synthetic monitoring with data about real users provides a more complete picture, allowing teams to detect potential issues before users are affected. Proactive testing helps prevent problems instead of reacting to them after complaints arise. Correlating network events with reported voice issues enables faster root cause analysis and more targeted problem resolution, reducing downtime and improving overall UC and VoIP system reliability.
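Correlating network events with user reports can start out as a simple time-window match. The sketch below pairs each reported voice issue with the network events logged shortly before it. The five-minute window, the function name, and the event labels are assumptions for illustration; a real monitoring system would draw on much richer data.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def correlate(
    reports: List[datetime],
    events: List[Tuple[datetime, str]],
    window: timedelta = timedelta(minutes=5),
) -> Dict[datetime, List[str]]:
    """Map each user report to the network events seen shortly before it."""
    matches: Dict[datetime, List[str]] = {}
    for reported_at in reports:
        matches[reported_at] = [
            label
            for seen_at, label in events
            # Keep events that happened at most `window` before the report.
            if timedelta(0) <= reported_at - seen_at <= window
        ]
    return matches


if __name__ == "__main__":
    # Hypothetical sample data.
    reports = [datetime(2024, 1, 1, 10, 7)]
    events = [
        (datetime(2024, 1, 1, 9, 30), "route change"),
        (datetime(2024, 1, 1, 10, 4), "WAN link congestion"),
    ]
    print(correlate(reports, events))
```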
How QoS, SBCs, and SD-WAN mitigate low quality
QoS: Protecting real-time traffic
The term “quality of service (QoS)” describes a set of mechanisms designed to prioritize traffic for real-time applications such as VoIP and UC. By employing packet marking, queuing, and prioritization methods, QoS helps ensure that packets belonging to these services experience minimal delay, jitter, and packet loss within a controlled network. However, QoS alone is often insufficient in cloud environments, as traffic frequently traverses third-party “uncontrolled” networks where QoS policies cannot be enforced end to end.
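Packet marking is the part of QoS that an application or endpoint can perform itself. As a minimal sketch, the code below requests the Expedited Forwarding class (DSCP 46), which is commonly used for voice media, on an IPv4 UDP socket. The helper names are hypothetical, and whether the marking takes effect depends on the operating system and on every network along the path: some platforms ignore or restrict the option, and networks outside your control may rewrite or disregard the value.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice media


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit TOS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing IPv4 packets request the given DSCP.

    Assumption: the platform honors IP_TOS for unprivileged sockets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Marking only expresses a request. Routers and switches still need matching queuing and prioritization policies for the marked packets to be treated differently.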
SBCs: Stability, security, and control
Session border controllers (SBCs) play a critical role in maintaining the stability and security of VoIP and UC services. Typically placed at the edge of the enterprise network, an SBC can also be deployed as a virtual machine or container inside the cloud itself.
SBCs help manage SIP signaling to prevent interoperability issues and can aid in protecting against threats such as call flooding and toll fraud. In conjunction with network monitoring systems, they can also provide visibility into call flows, signaling behavior, and media performance, which is a significant source of information for troubleshooting and allows better overall control of real-time communications.
SD-WAN: Intelligent routing for voice and video
For multi-site deployments, SD-WAN can help enhance voice and video quality by dynamically selecting the best available network path based on real-time performance metrics. It can route traffic away from congested links toward paths with lower latency, jitter, and packet loss. Many SD-WAN platforms also include mechanisms and techniques specifically designed for VoIP and UC, such as forward error correction and packet duplication, along with fast failover capabilities to minimize call disruptions.
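The path selection idea can be sketched as a scoring function over per-path measurements. The example below is a simplified illustration, not how any particular SD-WAN product works: the weights, class name, and path names are all hypothetical, and real platforms also consider link cost, policy, and hysteresis to avoid flapping between paths.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathStats:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


def path_cost(path: PathStats) -> float:
    """Lower is better. The weights are illustrative, not vendor values."""
    return path.latency_ms + 2.0 * path.jitter_ms + 50.0 * path.loss_pct


def best_path(paths: List[PathStats]) -> PathStats:
    """Pick the path with the lowest combined cost."""
    if not paths:
        raise ValueError("no paths available")
    return min(paths, key=path_cost)


if __name__ == "__main__":
    # Hypothetical measurements for three WAN links.
    candidates = [
        PathStats("link-a", latency_ms=40, jitter_ms=5, loss_pct=0.1),
        PathStats("link-b", latency_ms=25, jitter_ms=18, loss_pct=1.2),
        PathStats("link-c", latency_ms=70, jitter_ms=10, loss_pct=0.3),
    ]
    print(best_path(candidates).name)
```

Weighting packet loss heavily reflects how strongly even small amounts of loss degrade voice, so a path with slightly higher latency but negligible loss can still be the better choice.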
Conclusion
For cloud-based VoIP and UC services, latency, jitter, and packet loss remain the primary threats to call quality, with voice being especially sensitive. Combining end-to-end insight with technologies like QoS, SBCs, and SD-WAN allows organizations to protect real-time traffic, resolve issues more quickly, and provide a more consistent communications experience.