USG with site-to-site VPN drops calls after 30 seconds

New information: only a problem when calls originate from the new building.

Today I'm at the original main site that has all the phones and the "PBX" Ring-U hub, testing calls and doing packet captures. If I call from the old building to the new one, the call stays up. When calling from the new building to the old one, the calls drop at 30 seconds as noted.

I did packet captures from the phone at the old building. I may go ahead and grab packet captures from the new building phone.
 
New building?

Did you exempt that IP range from NAT correction?

*Edit* I can't seem to find those settings in my 3CX anymore, but most VoIP PBXs need to be told about new IP ranges that are "internal" so it knows to not do NAT corrections. Also, the firewall/router on the far side needs to know not to do NAT corrections for SIP too.
 
Sorry if I posted confusing information. This is not on 3CX. It's Ring-U. Ring-U is what the customer uses. I set up a home 3CX PBX to test some things out.

New building = new site for customer. They used to have one building but recently opened a new location about a mile away. Using the same cable ISP. Have a site to site VPN set up using UniFi Security Gateways and their controller. Original site is 192.168.111.0 and new site is 192.168.112.0. My understanding is that NAT would not be involved, but I may be wrong.

Ring-U is apparently Asterisk based. The login I've tried presents me with a really basic interface, so I can't find any settings like that, they're all user level type settings.
 
So the PBX is cloud hosted then?

*Edit* Googling didn't wind up in a happy place...


Looks to me like these guys intentionally nerfed the PBX into LOCAL and INTERNET, which is similar to how 3CX works.

SO, if you're riding SIP over a VPN tunnel YOU WILL HAVE PROBLEMS! Because NAT ISN'T INVOLVED, and yet the PBX is expecting it. Contact their support, see if they have an advanced mode that lets you tell Asterisk about the 192.168.112.0 IP range and treat it as a "local" IP range. Given the data in this thread, I'm all but certain that NAT automagic BS is the source of your pain. I've seen this crap before, and fixed it many times. But this was with "older" "harder to use" PBX servers that actually required an admin, not this automagic crap we have today. Don't get me wrong, I love the new toys, so much easier... but at times they drive me up a wall.
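To make the "local vs. internet" idea concrete, here is a minimal Python sketch of the kind of decision a PBX makes when it sorts peers into local and NAT'd. This is not Ring-U's or Asterisk's actual code. The /24 masks and the phone addresses are assumptions (the thread only gives the network addresses), and the function names are made up for illustration.

```python
import ipaddress

# Assumption: both sites use /24 masks; the thread only lists 192.168.111.0 and 192.168.112.0.
LOCAL_NETS = [ipaddress.ip_network("192.168.111.0/24")]


def is_local(addr, local_nets=LOCAL_NETS):
    """Return True if addr falls inside any network the PBX treats as local."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in local_nets)


def needs_nat_handling(addr, local_nets=LOCAL_NETS):
    """Hypothetical rule: peers outside the local list get NAT corrections applied."""
    return not is_local(addr, local_nets)


if __name__ == "__main__":
    # Same list with the new site's range added as "local".
    with_new_site = LOCAL_NETS + [ipaddress.ip_network("192.168.112.0/24")]
    for phone in ("192.168.111.50", "192.168.112.50"):  # hypothetical phone addresses
        print(phone, needs_nat_handling(phone), needs_nat_handling(phone, with_new_site))
```

With only the main site in the list, a phone on 192.168.112.x is treated as an internet peer even though it arrives over the tunnel un-NAT'd. Adding the second range flips that, which is what the request to support amounts to.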

P.S. Make sure you TURN OFF UniFi's SIP Conntrack module...


That's otherwise known as a SIP NAT Helper, and the PBX needs to be doing this or very bad things happen. Automagic NAT SIP insanity in two places? That's how you get grey hair... TURN THAT JUNK OFF! (Note turning this off might fix your current woes actually... if the PBX has any intelligence at all anyway)
 
So the PBX is cloud hosted then?
No, it's a box on their local network 192.168.111.0

So, this link doesn't really apply? They're mentioning site to site VPN but only giving help on a port forwarding setup.

SO, if you're riding SIP over a VPN tunnel YOU WILL HAVE PROBLEMS! Because NAT ISN'T INVOLVED, and yet the PBX is expecting it.
Makes sense. I wanted to look myself, but the web UI was super minimal, not a full blown Asterisk menu.

P.S. Make sure you TURN OFF UniFi's SIP Conntrack module...


That's otherwise known as a SIP NAT Helper, and the PBX needs to be doing this or very bad things happen. Automagic NAT SIP insanity in two places? That's how you get grey hair... TURN THAT JUNK OFF! (Note turning this off might fix your current woes actually... if the PBX has any intelligence at all anyway)
Yep, that was turned off early on in the process.
 
Hmm... Does the branch location have a static IP address? Because it seems like the easy solution is an IP limited port forward on the host side, and just don't use the tunnel. Then the PBX will adjust for NAT as expected, and things work in automagic land.
 
I called them. I asked them if they could set it so 192.168.112.0 was considered local.

They said they think the main site's USG is rejecting some of the traffic. They mentioned "SIP connection tracking" and I told them I couldn't find any such setting in the controller. I showed them the link about how to turn off SIP ALG, but they didn't seem interested in that setting.

They're going to send me some commands to run via SSH on the main site's USG.
 
SIP Connection Tracking is the SIP ALG feature. The link I aimed at above? See the mention of conntrack? Conntrack isn't some odd name, it's a feature of the Linux kernel that allows it to keep track of connections. Shocking, I know, right? Whoever named this thing must work for Microsoft now...

Anyway, having that feature ON also deals with NAT traversal, because the SIP NAT helper is a feature of the SIP Conntrack module.

So if they are trying to figure out something in the SIP Connection Tracking, it may be that they assume that module is ON. Have you tried enabling it on the host side, but leaving it disabled on the client side? You're in the weeds of tinkeritis here to find a combo that works.
 
I think that has been OFF during my testing. It may have started that way, not sure, but have only actively tested it while OFF.

Not sure what you mean by leaving it on on the host but off on the client? Which is which? Host is the USG at the main customer site and client is the USG at the new building?
 
Sorry it took me a couple of days, I have not been in the office. Here are two pcaps in Wireshark. The left is my local IP and the right is the IP of my PBX. It shows the flow of the calls back and forth. The BYE line is where the call is terminated.


This call I terminated from my desk set. You can see that on the second to last line in the call flows. This particular call used 5065 for my local SIP port and 5060 on the PBX, then used 14002 for RTP (audio stream) on my desk set and 9248 on the PBX.
Screen Shot 2022-03-09 at 5.29.48 PM.png

In this call flow you can see the SIP ports remained the same (that is a setting on the extension settings in my PBX) and the RTP ports have moved slightly to 14004 and 9252. This call was terminated from my cell phone which is why you can see the direction of the BYE command going the opposite direction.
Screen Shot 2022-03-09 at 5.30.23 PM.png

I would do a pcap from the failing phone and one from a good phone and compare the two or post them here.

Something else to note, and maybe I missed it above, but does this happen on both extension to extension calls and extension to outside world calls? We are a 3CX partner so that's what I'm most familiar with, but in the settings of 3CX, you can do a pcap from the server as well. That will show more calls as it will show the connection from the phone to the server and from the server to the SIP trunk. If this happens on internal to internal calls as well this step probably isn't necessary at this point.
 
@timeshifter You have two USGs, right? One in front of the PBX and one at a remote location. Try turning the SIP Conntrack module ON in the USG protecting the PBX. Leave the SIP Conntrack module off on the other USG.
 
does this happen on both extension to extension calls
It happens on extension to extension calls, so it looks like a problem "internally" so to speak, but across their site to site VPN.

How do you get that view in Wireshark? I've looked around but I can't get the capture to display like you did.

You have two USGs right?
Right, one at each site.
Try turning the SIP Conntrack module ON in the USG protecting the PBX. Leave the SIP Conntrack module off on the other USG.
I'll give that a shot.

Here's what they told me to do via email:

========================
Here is a list of modifications we need to confirm have been made via the GUI:

-SIP ALG: Found under firewall settings. Must be disabled.
-SPI Firewall: Found under firewall settings. Must be disabled.
-UDP Timeout: Found under firewall settings. Usually set to 30 seconds by default. Should be increased to at least 300 seconds.
-SIP Transformations: Found under firewall settings. Must be disabled.
-Consistent NAT: Found under firewall settings. Must be enabled.

Alternatively, it may be time to involve Ubiquiti in this issue and get some insight from the brand themselves. The issue appears fairly clear cut with the traffic dropping in a consistent fashion, as I mentioned on the call earlier, it's a matter of having the right buttons checked etc in the right fashion for that firewall's model in order for it to treat the traffic properly.

The system appears stable in every other way, it's an issue with the USG and their engineers may have just the answer for you.
========================
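The UDP timeout item in that list is easier to reason about with a toy model. The Python sketch below is not how the USG implements connection tracking. It only shows the general idea that a firewall forgets a UDP flow after a period of silence. The packet timestamps are hypothetical and `first_expiry` is a made-up helper name.

```python
def first_expiry(packet_times, idle_timeout):
    """Return the time at which a flow's tracking entry would first expire, or None.

    packet_times: sorted timestamps (seconds) of packets seen on one UDP flow.
    idle_timeout: seconds of silence after which the firewall forgets the flow.
    Only gaps between observed packets are checked, not silence after the last one.
    """
    for earlier, later in zip(packet_times, packet_times[1:]):
        if later - earlier > idle_timeout:
            return earlier + idle_timeout
    return None


if __name__ == "__main__":
    # Hypothetical signaling flow: a burst at call setup, then nothing for a minute.
    signaling = [0, 0.2, 0.5, 60]
    print(first_expiry(signaling, 30))   # entry expires during the quiet period
    print(first_expiry(signaling, 300))  # entry survives the quiet period
```

If a capture shows signaling going quiet for longer than the configured timeout, that would line up with support's suggestion. If packets keep flowing right up to the drop, the timeout is less likely to be the culprit.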
 
Cool, thanks. I had displayed the VoIP calls before, but it was only one entry, didn't notice the additional buttons.
 
-SIP ALG: Found under firewall settings. Must be disabled.
-SPI Firewall: Found under firewall settings. Must be disabled.
-UDP Timeout: Found under firewall settings. Usually set to 30 seconds by default. Should be increased to at least 300 seconds.
-SIP Transformations: Found under firewall settings. Must be disabled.
-Consistent NAT: Found under firewall settings. Must be enabled.

SIP ALG being off makes sense... SIP Transformations being off makes sense... UDP Timeout being extended is a little odd, but the change they suggest is tolerable.

SPI being disabled doesn't make much sense... not sure what that's referring to. There's no way to turn off Stateful Packet Inspection... nor should you ever even consider doing so. Perhaps they mean IPS/IDS? If those things are enabled, the routers get SLOW.
 
Cool, thanks. I had displayed the VoIP calls before, but it was only one entry, didn't notice the additional buttons.
Welcome. If you have a chance, pull a PCAP from the problematic phone connecting over the VPN tunnel and one from a good phone so we can compare, and if possible pull one from the server too.

Is the PBX installed on a Windows box, or is it its own standalone unit they provide? If it's Windows you can install Wireshark there and capture the Ethernet traffic as well, to give us the other side of the connection. We should be able to see where the SIP request to terminate the call is coming from, or what exactly is happening when the connection drops.
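Once the SIP messages are exported from a capture, finding out who ended the call is mechanical. Here is a rough Python sketch of that check. The `SipMsg` record, the `find_bye` helper, and every address and timestamp in `SAMPLE` are hypothetical placeholders, not data from this thread.

```python
from typing import NamedTuple


class SipMsg(NamedTuple):
    time: float   # seconds since capture start
    src: str      # sender IP
    dst: str      # receiver IP
    summary: str  # SIP request line or status line


def find_bye(msgs):
    """Return (sender, seconds after the first INVITE) for the first BYE, or None."""
    invite_time = None
    for m in msgs:
        method = m.summary.split(" ", 1)[0]
        if method == "INVITE" and invite_time is None:
            invite_time = m.time
        elif method == "BYE" and invite_time is not None:
            return m.src, m.time - invite_time
    return None


# Hypothetical call: a phone at the new site calls through a PBX at the main site.
SAMPLE = [
    SipMsg(0.0, "192.168.112.50", "192.168.111.10", "INVITE sip:101@192.168.111.10 SIP/2.0"),
    SipMsg(0.1, "192.168.111.10", "192.168.112.50", "SIP/2.0 200 OK"),
    SipMsg(30.0, "192.168.111.10", "192.168.112.50", "BYE sip:102@192.168.112.50 SIP/2.0"),
]

if __name__ == "__main__":
    print(find_bye(SAMPLE))
```

Running the same check on a capture from each end shows whether the BYE comes from the PBX, from the phone, or never shows up on one side at all, which points at the device in between.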
 
There's no way to turn off Stateful Packet Inspection... nor should you ever even consider doing so.
That one kinda bugged me. They want me to turn off the firewall altogether? Really? Maybe if there was a way to disable the firewall on just the incoming traffic from the other site, but not sure you can do that.
If you have a chance, pull a PCAP from the problematic phone connecting over the VPN tunnel and one from a good phone so we can compare, and if possible pull one from the server too.
Will do, next chance I get. The PBX is in its own standalone little box, not Windows.
 