When you speak on the phone, it’s natural to believe that it’s a private conversation between you and the person on the other end of the line. However, do you know what mechanisms are employed to deliver this confidentiality? And how confident can you truly be in these security measures?
In this article, we take a detailed look at Voice over Internet Protocol (VoIP) security technologies and how they keep conversations confidential.
Anatomy of a VoIP telephone call
To understand how VoIP security works, we must first understand how VoIP conversations are implemented. Unlike most data communications, which typically consist of a single stream of data moving from source to destination, each VoIP call consists of two distinct communication streams that work together.
The first form of communication is the exchange of signaling information, which is the communication between IP phones, VoIP apps, desktop VoIP endpoints, voice gateways and IP PBXs. This signaling data deals with the initiation, maintenance and teardown of the telephone connection itself.
Signaling allows these devices to perform all the functions they were designed to perform, including making a phone ring and routing a call to the appropriate device using the touch-tone dial pad. It also enables devices to execute features such as call display, call transfer, call waiting, call hold and many others.
In most modern VoIP deployments, signaling is performed using the Session Initiation Protocol (SIP). Since it deals with signaling, SIP doesn’t carry the voice data itself. Instead, it enables VoIP endpoints, devices and servers to orchestrate a telephone call and implement all the aforementioned features.
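To make the idea of signaling more concrete, here is a minimal Python sketch of what a SIP request looks like on the wire. SIP is a text-based protocol, so a request can be built and parsed as plain text. The addresses, tag, branch value and the helper names build_invite and parse_request are illustrative assumptions, and a real INVITE would normally also carry a session description in its body.

```python
def build_invite(caller: str, callee: str, call_id: str, branch: str) -> str:
    """Build a minimal SIP INVITE request as text (no body, illustration only)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",                          # request line
        f"Via: SIP/2.0/UDP client.example.com;branch={branch}",  # hypothetical host
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=1928301774",                  # hypothetical tag
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line separates headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(message: str):
    """Split a SIP request into its method, request URI and a dict of headers."""
    head = message.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, uri, _version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, headers


invite = build_invite(
    caller="alice@example.com",
    callee="bob@example.com",
    call_id="a84b4c76e66710@client.example.com",
    branch="z9hG4bK776asdhds",
)
method, uri, headers = parse_request(invite)
print(method, uri, headers["call-id"])
```

Note that nothing in this message contains audio. It only describes who is calling whom and identifies the call, which is why signaling and voice can be attacked, and protected, separately.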
The second vital component of a VoIP telephone call is the stream of data that carries the actual voice packets. When you speak, a VoIP endpoint digitizes your voice and sends the packets containing your voice information to the recipient’s VoIP endpoint, where it is reconstructed and reproduced.
Once the signaling establishes a simple VoIP call, voice packets are exchanged directly between the VoIP endpoints themselves, primarily using the Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP).
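As a rough illustration of what a voice packet contains, the following Python sketch parses the 12-byte fixed header that begins every RTP packet. The sample packet is fabricated for the example (payload type 0 denotes G.711 mu-law audio), and the names RtpHeader and parse_rtp_header are assumptions rather than part of any particular library.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header from the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet is shorter than the 12-byte RTP fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source (SSRC) identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,          # top two bits of the first byte
        marker=bool(b1 & 0x80),   # top bit of the second byte
        payload_type=b1 & 0x7F,   # remaining seven bits
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Fabricated sample: RTP version 2, payload type 0, followed by 160 bytes
# standing in for 20 ms of G.711 audio.
sample = struct.pack("!BBHII", 0x80, 0, 4660, 160, 0xDEADBEEF) + bytes(160)
header = parse_rtp_header(sample)
print(header)
```

Everything after the header is the encoded audio itself. With plain RTP, that payload travels unencrypted.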
The following diagram illustrates the various communications streams that exist during a typical VoIP phone call between two IP phones via an IP PBX on the local network.
As you can see, voice packets are exchanged directly between the IP phones and do not go through the PBX. The IP phones exchange signaling information but may also maintain a separate SIP session with the IP PBX, depending upon the employed features.
Keep in mind that this is a simple scenario. Imagine a case where a call goes to the PSTN via a voice gateway, or there is a conference call between four IP phones and a fifth participant on the PSTN. You can immediately see how more complex communications arrangements can involve multiple sessions of both types between multiple devices.
Implications for VoIP security
Knowing how VoIP communication takes place and understanding the various components of a VoIP conversation can help you recognize the various vectors a malicious attacker may use to compromise a conversation.
When dealing with security on network services such as VoIP, an attack vector is a specific path, method or tactic that a malicious individual may use to compromise security. As you might imagine, two fundamental attack vectors exist for a VoIP conversation: those targeting the signaling information and those that go after the voice packets.
Signaling attack vector
When attacks target the signaling of a VoIP call, they attempt to disrupt the VoIP service itself. By interfering with SIP communication, they can cause current calls to fail, cause features such as conference calls or call forwarding to cease, or even prevent a VoIP endpoint from registering with its IP PBX.
Such attacks can target single end-devices but most often are used to target IP PBXs or gateways, which can potentially disrupt the service for dozens or even hundreds of users.
In addition, the signaling attack vector can also be used to conduct toll fraud, where SIP sessions are hijacked in such a way that the IP PBX believes that the attacker’s VoIP endpoint is a legitimate VoIP device. Once this occurs, the attacker can use the IP PBX’s resources to make unauthorized calls.
A worst-case scenario would be an attacker attempting to resell the hijacked resources to unsuspecting third parties as a telephony service, incurring even more charges for the victim.
Voice attack vector
The most obvious and common goal in an attack on the voice session of a VoIP conversation is eavesdropping. Indeed, packet sniffers such as Wireshark, a freely available network diagnostics and troubleshooting tool, can capture and save voice packets that traverse a network. These packets can be later reconstructed and listened to as a complete audio record of the conversation that took place.
Employing VoIP security
Fortunately, attacking VoIP is relatively difficult even if you don’t configure any security, largely because an attacker must first gain access to a network that carries the call’s signaling or voice packets. That said, the risks are there, so it makes sense to take precautions that can prevent even the most persistent attacker from disrupting your VoIP communication.
Protecting SIP sessions
The best way to prevent your VoIP services from being hijacked is to use Secure SIP (SIPS), a secured form of SIP that, like SIP itself, is defined in RFC 3261. Secure SIP carries signaling over the Transport Layer Security (TLS) cryptographic protocol. TLS is widely used in many applications, including the well-known HTTPS, used for securing web browsing.
You may have initially come across HTTPS when using a web banking service or making an online payment, but these days, most web pages leverage HTTPS. The same principle used to secure these sites keeps SIP signaling between VoIP devices safe.
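The parallel with HTTPS can be sketched in a few lines of Python using only the standard library. This is a minimal sketch of how a client might prepare a TLS-protected channel for SIP signaling, not a description of how any particular VoIP product does it. Port 5061 is the port conventionally used for SIP over TLS, and the function names are assumptions made for illustration.

```python
import socket
import ssl

SIPS_PORT = 5061  # conventional port for SIP over TLS


def make_sip_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context that verifies the server's identity."""
    # The default context requires a valid certificate and checks the hostname,
    # the same checks a web browser performs for HTTPS.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_signaling_channel(host: str, port: int = SIPS_PORT) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS before any SIP is sent."""
    raw = socket.create_connection((host, port), timeout=5)
    return make_sip_tls_context().wrap_socket(raw, server_hostname=host)


context = make_sip_tls_context()
print(context.verify_mode, context.check_hostname)
```

Any SIP message written to the socket returned by open_signaling_channel would be encrypted in transit. In practice, your IP phones and PBX handle this step internally once TLS transport is enabled.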
Protecting voice packets
In a similar fashion, protecting the confidentiality of voice packets can be achieved using secure versions of the RTP and RTCP protocols. Predictably, these are called Secure RTP (SRTP) and Secure RTCP (SRTCP), and both are defined in RFC 3711. These protocols encrypt voice packets using the Advanced Encryption Standard (AES) cipher so that they remain unintelligible to an attacker even if they are intercepted in transit.
The practicality of implementation and alternatives
In most cases, employing Secure SIP, SRTP, and SRTCP on your VoIP services does not require you to do any coding or complex configurations. The VoIP services that you use as a business, whether cloud-based, server-based, or appliance-based, may inherently include them in their implementations.
Alternatively, they may give you the option of enabling them whenever needed, usually via a checkbox or an option within a control panel. Ask your VoIP provider about these protocols, whether they support them and how you can enable them.
Remember that enabling these features will add some overhead to your servers and the network. This overhead is negligible in most cases, but sometimes you may find that it impacts performance and voice quality.
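To get a feel for the network side of that overhead, here is a back-of-the-envelope calculation in Python. It assumes a G.711 call sending one packet every 20 ms over IPv4 and SRTP using an 80-bit (10-byte) authentication tag, which is the default tag length in RFC 3711. It ignores link-layer framing and CPU cost, so treat the figures as a rough estimate rather than a measurement.

```python
IP_HEADER = 20           # bytes, IPv4 without options
UDP_HEADER = 8           # bytes
RTP_HEADER = 12          # bytes, fixed RTP header
G711_PAYLOAD = 160       # bytes, 20 ms of 8 kHz audio at 8 bits per sample
SRTP_AUTH_TAG = 10       # bytes, assumed 80-bit authentication tag
PACKETS_PER_SECOND = 50  # one packet every 20 ms


def kbps(packet_bytes: int) -> float:
    """Convert a per-packet size into one-way bandwidth in kilobits per second."""
    return packet_bytes * 8 * PACKETS_PER_SECOND / 1000


rtp_size = IP_HEADER + UDP_HEADER + RTP_HEADER + G711_PAYLOAD
srtp_size = rtp_size + SRTP_AUTH_TAG
overhead_percent = (srtp_size - rtp_size) * 100 / rtp_size

print(f"RTP:  {rtp_size} bytes per packet, {kbps(rtp_size)} kbps")
print(f"SRTP: {srtp_size} bytes per packet, {kbps(srtp_size)} kbps")
print(f"Added bandwidth: {overhead_percent}%")
```

Under these assumptions, each direction of the call grows from 80 kbps to 84 kbps, an increase of about 5 percent. Codecs with smaller payloads would see a proportionally larger increase, since the tag size stays the same.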
Some alternatives to these protocols include the use of a firewall or session border controller (SBC) to protect the internal enterprise network from external attacks. However, while this is effective against outside threats, it cannot protect against attacks initiated from inside the enterprise network, nor does it help protect remote workers.
Another alternative is to create an encrypted VPN tunnel between each endpoint and the IP PBXs to protect signaling and voice packets. This requires that the endpoints support such security features, so you should discuss this further with your VoIP vendor.
Most VoIP providers deliver services with strong security already built in. However, it is important to do due diligence and ask the right questions to ensure that you are as safe as you can be, including verifying that all security features that can be enabled are turned on. Knowing how VoIP security works will make you that much more effective when negotiating with your VoIP provider.