Unlike conventional telephony infrastructure, which was developed and tailored exclusively for voice, today’s telecom networks carry many different types of data traffic, of which VoIP is just one. Because of this, it can be challenging to adapt a network to accommodate voice.
One of the most common challenges involves a technology called Network Address Translation (NAT), which for data networks has been a godsend, but if not configured carefully, could cause problems for voice applications. In particular, NAT is a common cause of one-way and no-way audio on VoIP calls.
In this article, we dive into how NAT can impair voice sessions, cite some common symptoms that indicate NAT may be at the root of your call audio problems, and address how to resolve the issues.
Signs that NAT could be the issue
One tell-tale sign that NAT is at the root of this issue is if there is no audio on a call that has nonetheless been successfully connected (i.e., the call timer is counting as if the call is in effect, but there is no sound). Another symptom is when the one-way or no-way audio only occurs on incoming calls, but not on outgoing calls or internal calls to other extensions on the same network. Yet another clue could be if the audio works fine on internal calls within a single location, but not on internal calls with extensions at another location, even though it is managed on the same enterprise network.
What is NAT and why is it important?
It has been widely stated for more than a decade now that the IPv4 address space on the internet is being depleted. Today, there are many more internet-connected devices in the world than there are IPv4 addresses. The solution to this address depletion is the introduction of IPv6, with a much larger address space and other improvements. However, the transition from IPv4 to IPv6 is expected to take years, if not decades. June 6, 2012 was designated as World IPv6 Launch Day, the day when IPv6 was permanently enabled on the internet. At the time of this writing, over six years later, less than 25% of the internet has transitioned to IPv6.
For this reason, a more immediate and readily available solution had to be found. This is where NAT comes in. NAT is a method of remapping one IP address space onto another by modifying network address information in the IP header of packets while they are in transit across a router. NAT is capable of mapping multiple IP addresses to a single IP address, thus allowing tens or even hundreds of hosts to share the same IP address. This also allows internal IP addresses (i.e., those assigned to computers and devices within an enterprise network) to be reused multiple times by many enterprises across the world. The result is a vast savings in the number of IP addresses needed to deliver internet connectivity.
The following diagram further describes this functionality:
Notice that there are three enterprises, all of which use the same internal IP address range of 192.168.1.1 to 192.168.1.254. Each has a NAT router with a single external IP address of 220.127.116.11, .42 or .43. All internal devices in each enterprise share that single external IP address when communicating with the internet. For example, a web server that receives requests from multiple devices in Enterprise Network 1 will see all of them as if they are coming from a single IP address of 18.104.22.168.
This is phenomenal, because if each enterprise has 100 internal devices, then 300 devices are granted access to the internet using only three external IP addresses, providing a vast savings in the usage of addresses.
The internal addresses that can be used and that are reserved for this purpose are found within the following ranges:
- 10.0.0.0 to 10.255.255.255
- 172.16.0.0 to 172.31.255.255
- 192.168.0.0 to 192.168.255.255
These addresses, called private IP addresses, are defined in the Internet Engineering Task Force’s RFC 1918 document, and cannot exist on the internet. They are reserved only for internal usage on private networks.
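As a rough illustration, the three ranges above can be checked with Python's standard `ipaddress` module. This is only a sketch: the function name `is_rfc1918` and the sample addresses are chosen here for illustration and do not come from any particular product.

```python
import ipaddress

# The three private ranges listed above, written in CIDR notation.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 to 192.168.255.255
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


print(is_rfc1918("192.168.1.10"))  # True
print(is_rfc1918("172.32.0.1"))    # False (just outside 172.16.0.0/12)
```

A check like this can be handy when reading packet captures, since a private address showing up where a public one is expected is often the first hint of a NAT problem.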
Despite NAT’s incredible benefits, it does have some drawbacks, especially when it comes to voice. Troubleshooting voice over NAT is not as straightforward as troubleshooting other data communication issues involving NAT. Phone conversations may behave strangely, and unless you’re familiar with how VoIP and SIP function, these behaviors can be difficult to interpret. With that in mind, let’s take a closer look at how VoIP functions, and specifically, how the Session Initiation Protocol (SIP) operates.
The idiosyncrasies of SIP
When a VoIP phone call is initiated, SIP sends various control packets between the initiator of the call, the SIP server (which is sometimes referred to as the VoIP PBX or IP PBX) and the destination device. These control packets manage things like ringing, dial tone, DTMF tones and other signaling functions. In short, SIP’s role is to initiate, modify, and finally tear down telephone conversations.
SIP is only a signaling protocol – it doesn’t actually carry the voice of a telephone conversation. To transfer voice between VoIP endpoints, SIP works in tandem with other protocols that transmit the voice information as payload. These include Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP), both of which are User Datagram Protocol (UDP)-based protocols. This means that SIP message exchange and voice packet exchange occur over two separate sessions, or channels. They are essentially two independent communication streams between endpoints that are distinguished by the transport layer port numbers used.
Another somewhat counter-intuitive aspect of SIP, and of VoIP in general, is the fact that the actual voice traffic travels directly from endpoint to endpoint without having to physically pass through the IP PBX.
The following diagram depicts this fact in a scenario where two IP phones within the same enterprise communicate with each other.
During a conversation between the two IP phones, SIP messages are exchanged between the phones and the IP PBX that deal with call setup, call teardown, DTMF tones, as well as other call control functions. SIP messages are also exchanged between the two phones directly. These messages include functionalities such as codec choice and off-hook and on-hook messages. And finally, the actual voice packets travel between the two phones directly, without the aid of a VoIP or SIP intermediary device.
How NAT operates
For communications between end devices within an enterprise network, there are no issues concerning NAT. NAT is primarily employed at the network edge: at the location where the internal network meets the internet. NAT will only affect voice conversations that take place between an internal device and an external destination, where the actual voice traffic traverses the NAT router.
Whenever NAT is employed, communication between internal and external devices can, by default, only take place when the internal device initiates it. When this occurs, a NAT translation is created within the NAT router, and all incoming traffic that is a response to that initial communication is allowed to come back through (the router recognizes a "response" because it arrives on the same port mapping that the outbound traffic created). Any communication that is initiated from the outside will be blocked.
In the above diagram, the web server’s communication is able to pass through the NAT router because it is a response to the initial request which was initiated by the laptop on the inside of the network. The request from the internet user, however, is being denied because it has been initiated from the outside, so it has no active NAT translation allowing it through as a response.
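The behavior described above can be modeled with a toy translation table. The sketch below is not how any specific router is implemented: the class name `NatRouter`, the starting port of 40000, and the sample addresses (taken from documentation ranges) are all assumptions made for illustration, and real NAT devices differ in how strictly they match replies.

```python
from dataclasses import dataclass, field


@dataclass
class NatRouter:
    """Toy model of a NAT translation table (illustrative only)."""
    public_ip: str
    next_port: int = 40000  # arbitrary first mapped port (assumption)
    # (public_port, remote_ip, remote_port) -> (internal_ip, internal_port)
    table: dict = field(default_factory=dict)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """An internal host sends a packet out: record a translation."""
        public_port = self.next_port
        self.next_port += 1
        self.table[(public_port, dst_ip, dst_port)] = (src_ip, src_port)
        # The packet leaves with the router's public address as its source.
        return (self.public_ip, public_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """A packet arrives from outside: forward it only if it matches
        an existing translation, otherwise return None (dropped)."""
        return self.table.get((dst_port, src_ip, src_port))


nat = NatRouter(public_ip="203.0.113.41")
_, mapped_port, _, _ = nat.outbound("192.168.1.10", 51000, "198.51.100.7", 80)

# Reply from the web server: matches the translation, so it is forwarded.
print(nat.inbound("198.51.100.7", 80, mapped_port))     # ('192.168.1.10', 51000)
# Unsolicited packet from another internet host: no translation, so dropped.
print(nat.inbound("198.51.100.99", 4444, mapped_port))  # None
```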
Of course, there are methods by which communication that has been initiated on the outside can be allowed through a NAT router to reach the desired internal device. However, these methods must be carefully configured for the voice to function correctly.
How NAT affects voice
Typically, if you want to allow communication from the internet to enter into a NAT environment, you can enable a feature called port forwarding. You can specify which TCP or UDP ports permit externally initiated communications to “punch through.” By convention, SIP uses port 5060 over UDP or TCP, and port 5061 for SIP secured with TLS. So, a simple and intuitive solution would be to forward ports 5060 and 5061 to the internal SIP server.
In the following diagram, the external voice endpoint is attempting to communicate with the internal IP phone. The external endpoint will communicate with 22.214.171.124 on port 5060 and 5061, and with the appropriate configuration of port forwarding, this will be translated by the NAT router and directly forwarded internally to the SIP server.
The SIP server receives the initial communication and sets up a call control session between it and the external voice endpoint. In this call control session, the SIP server is informed of the destination endpoint. So, the SIP server communicates with the internal IP phone and causes it to ring. The user picks up and the call is connected.
So far, we have successfully connected the call, but voice packets have yet to be exchanged. When the external endpoint begins to send a stream of voice packets, this is viewed as a new session initiated from the outside, so NAT blocks it. Meanwhile, the internal IP phone begins sending voice packets of its own. These are again seen as a separate session because they use a different port from the inbound packets, and they do pass through the NAT router, since it perceives the communication as being initiated from the inside.
This results in a classic one-way voice scenario: SIP packets go through, internal-to-external voice traffic also goes through, but external-to-internal voice traffic does not. In this case, it would not be feasible to apply the same port forwarding to the voice stream as we did for the SIP session, because voice is carried by RTP, which, unlike SIP, typically uses dynamically chosen UDP port numbers in the range between 1024 and 65535. Port-forwarding all of these ports is not an option: it can cause port depletion for other applications using NAT, and it poses a severe security risk, essentially opening the door to attackers.
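To make the scenario concrete, here is a small sketch of the decision the NAT router makes for each stream. The helper names and the RTP port 16384 are hypothetical, and the model is deliberately simplified: it only checks whether an inbound port is forwarded or already opened from the inside.

```python
# Ports forwarded to the SIP server in the scenario above.
FORWARDED_PORTS = {5060, 5061}


def inbound_allowed(dst_port: int, open_mappings: set) -> bool:
    """An unsolicited inbound packet passes only if its port is forwarded
    or an inside host has already opened a mapping on that port."""
    return dst_port in FORWARDED_PORTS or dst_port in open_mappings


def audio_result(inbound_rtp: bool, outbound_rtp: bool) -> str:
    """Describe what the users on the call would experience."""
    if inbound_rtp and outbound_rtp:
        return "two-way audio"
    if inbound_rtp or outbound_rtp:
        return "one-way audio"
    return "no audio"


# Number of UDP ports in the 1024 to 65535 range cited for RTP.
RTP_PORT_COUNT = len(range(1024, 65536))

HYPOTHETICAL_RTP_PORT = 16384  # example port only, chosen for illustration

sip_ok = inbound_allowed(5060, open_mappings=set())
rtp_in_ok = inbound_allowed(HYPOTHETICAL_RTP_PORT, open_mappings=set())

print(sip_ok)                                     # True: the call connects
print(rtp_in_ok)                                  # False: inbound voice is blocked
print(audio_result(rtp_in_ok, outbound_rtp=True)) # one-way audio
print(RTP_PORT_COUNT)                             # 64512 ports would need forwarding
```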
There are variations to the above scenario where calls seem to be connected, where the session clock on the screen of the phone is indeed counting seconds, but no voice can be heard in either direction. Other situations may involve calls not being completed at all. Regardless of the variations of the potential problems and their symptoms, the behavior of the phones and the network can be more readily interpreted when the above peculiarities of VoIP, SIP and RTP are understood.
Additional NAT scenarios
Some additional scenarios to be aware of when troubleshooting NAT and VoIP issues are described below.
Some IP PBX vendors supply NAT functionality within their IP PBX device. This means that IP phones can exist behind the IP PBX using a private subnet, and the IP PBX itself performs the NAT translation without the need for an additional NAT router.
In most enterprise networks, NAT is already being performed by the edge router, a firewall or the ISP router itself. Depending on how the IP PBX has been installed in the enterprise network, this may result in NAT being performed twice. In that case, voice communications between internal telephony devices and the PSTN traverse two NAT devices before reaching the internet.
While this setup can indeed function if configured correctly, it introduces more variables and increases complexity, especially when attempting to troubleshoot voice issues over NAT. It is generally good practice, whenever possible, to eliminate one of the two NAT operations in order to simplify both the network and troubleshooting procedures. Most IP PBXs can be configured to function either with or without NAT.
Remote telephony devices
There are also scenarios where the actual telephony devices are not found within the same enterprise network as the IP PBX. This means that communication between the IP PBX and the devices it serves will have to occur over NAT. This can be an issue when devices are registering, as well as when using specialized applications such as three-way conferencing, where voice packets have to travel between several IP devices.
This scenario is often found in enterprises that have multiple physical sites, such as branch offices, where the remote telephones register to the IP PBX located offsite at corporate headquarters.
One way to bypass the issues of NAT is to create a Virtual Private Network (VPN) on the enterprise network that places all the physical sites on the same internal subnet, or a series of linked internal subnets. VPN tunnels assist in passing packets from site to site, thus reducing the involvement of NAT on the edge router.
Solutions to NAT-related VoIP issues
As we have established, a NAT router has no automatic method of determining the internal destination host for an incoming packet that has been initiated from the outside. Although port forwarding is a solution for SIP messages and for registering remote SIP endpoints, it is unfeasible for the voice portion of the call. Several solutions have been introduced as remedies for this. These include:
- UPnP Internet Gateway Device (IGD) Protocol
- Session Traversal Utilities for NAT (STUN)
- NAT hole punching
- Application Level Gateway (ALG)
The first three are better suited to smaller NAT gateways in home or small office settings. The fourth is better suited to enterprise networks and is explained here.
SIP Application Level Gateway
A SIP ALG is not a standalone device, but rather a component of network edge equipment, such as a NAT router or a firewall, that performs customized NAT traversal. It is essentially a feature set that informs the NAT-performing edge device of applications, addresses and port numbers, and opens up port mappings dynamically as required. This allows legitimate application data, such as voice packets, to pass through the security checks of the firewall or the translation rules of the NAT router.
In order to achieve this, a SIP ALG performs the following functions:
- It allows communications, such as voice packets being streamed through a NAT router, to use dynamic TCP and UDP ports to communicate with the ports used by the internal voice device, even though a NAT configuration may allow only a limited number of ports using port forwarding. In the absence of an ALG functionality, either the ports would get blocked or the network administrator would need to explicitly open up a large number of ports in the NAT router, which as we have said renders the network vulnerable to attacks.
- It converts the IP address information found inside the payload to addresses that are acceptable to the hosts (i.e., the two devices communicating with each other) on either side of the NAT router.
- It synchronizes between multiple sessions of data between two hosts. For example, a voice conversation has a SIP session and an RTP session exchanging voice packets. The ALG will recognize the connection between these two sessions and manage them accordingly.
All of these features are made possible by a process called deep packet inspection. This is a type of data processing that inspects in detail not only the headers of the IP packet, but also those of the transport layer protocol and the actual content of the data, to verify their type and determine how they will be handled.
So how does ALG perform this NAT traversal without compromising network security? Basically, a NAT router with ALG can rewrite information within the SIP messages and can hold address bindings (i.e., the NAT address translations) until the SIP session terminates. Additionally, it will dynamically open up only the required ports for the RTP sessions to take place for the duration of the conversation, and it will replace the private IP addresses and ports in the SIP packet with the mapped public address and port of the host that sends the packet. All of these functions alleviate many of the problems associated with VoIP NAT traversal.
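The address rewriting step can be sketched in a few lines. The article does not name the payload format, so this example assumes the media address and port are carried in an SDP body, as is usual for SIP calls. The function name `rewrite_sdp`, the sample SDP, and the public address and port are illustrative assumptions; a real ALG does considerably more, including updating the message length and tracking the mapping for the life of the call.

```python
def rewrite_sdp(sdp: str, public_ip: str, public_rtp_port: int) -> str:
    """Replace the private connection address and audio port in an SDP body
    with the public address and port mapped by the NAT device."""
    rewritten = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            # Connection line: the address the far end should send media to.
            line = f"c=IN IP4 {public_ip}"
        elif line.startswith("m=audio "):
            # Media line layout: m=audio <port> <transport> <formats...>
            parts = line.split(" ")
            parts[1] = str(public_rtp_port)
            line = " ".join(parts)
        rewritten.append(line)
    return "\r\n".join(rewritten) + "\r\n"


# Hypothetical SDP body offered by an internal phone at 192.168.1.10.
original = (
    "v=0\r\n"
    "o=- 2890844526 2890844526 IN IP4 192.168.1.10\r\n"
    "s=-\r\n"
    "c=IN IP4 192.168.1.10\r\n"
    "t=0 0\r\n"
    "m=audio 16384 RTP/AVP 0\r\n"
)

print(rewrite_sdp(original, public_ip="203.0.113.41", public_rtp_port=40000))
```

Note that this sketch leaves the `o=` line untouched. Whether an ALG rewrites that line, and how consistently it rewrites the rest of the message, varies between implementations.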
Issues with SIP ALG
But SIP ALG is not a panacea. Even though it is intended to assist users who have phones on private IP addresses, if it is implemented poorly, it can actually cause more problems than it solves.
If configured incorrectly, SIP ALGs can modify SIP packets in unexpected ways, corrupting them and making them unreadable, resulting in unexpected behavior from your IP telephony network. Some of these situations can include:
- A NAT device altering the address and port fields in the SIP response message, causing the message to fail SIP response verification
- Modification of the SIP headers that hides important information from the SIP server, causing registration or call initiation to fail
- Modifications to the Contact URI (Uniform Resource Identifier) that leave the SIP client unable to detect its registered status
There are many similar situations that should be taken into account, and SIP ALG configuration and implementation should be verified in order to avoid such scenarios.
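One simple way to start verifying this is to capture a SIP message on the public side of the NAT device and look for private addresses that should have been rewritten. The sketch below assumes you already have the message as text. The function name and sample message are hypothetical, and the pattern matching is intentionally crude (IPv4 only, no header parsing).

```python
import ipaddress
import re

# The private ranges defined in RFC 1918.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Loose match for dotted-quad IPv4 addresses anywhere in the text.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def private_addresses_in(message: str) -> list:
    """Return the distinct private IPv4 addresses found in a SIP message."""
    found = []
    for candidate in IPV4_PATTERN.findall(message):
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # looked like an address but is not valid (e.g. 999.1.1.1)
        if any(ip in net for net in PRIVATE_NETWORKS) and candidate not in found:
            found.append(candidate)
    return found


# Hypothetical fragment of a message captured outside the NAT device.
captured = (
    "Via: SIP/2.0/UDP 203.0.113.41:40000\r\n"
    "Contact: <sip:1001@192.168.1.10:5060>\r\n"
    "c=IN IP4 10.0.0.5\r\n"
)

print(private_addresses_in(captured))  # ['192.168.1.10', '10.0.0.5']
```

If private addresses still appear in messages seen from the outside, the far end may be trying to reply to an address it cannot reach, which is worth investigating before assuming the ALG is working as intended.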
NAT is a useful and essential solution to the problem of IPv4 address exhaustion. At the same time, NAT is an inelegant solution for some applications such as VoIP and the protocols it uses, including SIP and RTP. Although IPv6 will do away with the need for NAT in the future, the extensive use of NAT has delayed IPv4 address exhaustion for many years to come. This means that IPv4 will be with us for the foreseeable future, and so will NAT.
Therefore, it is important to understand the idiosyncrasies of VoIP as it traverses NAT in order to more correctly implement it and more readily troubleshoot it. There are various solutions that can help such as SIP ALG, which if correctly and carefully implemented, can solve many of the most common problems.