Wednesday, September 19, 2007

PAT/NAPT Firewalls and Quake III Servers

Full PAT (Port Address Translation), also known as NAPT, is commonly used on higher-end firewalls and will wreak havoc on any attempt to run a public Quake III server behind one. This also affects the many open source Quake III derivatives, such as Open Arena, and many commercial games based on the Quake III engine. The problem is not PAT itself but the way master server communication works: a Quake III server run in public mode (+set dedicated 2) will be unable to list itself properly on the master server list.

Listing a server in the master list takes three steps.
First, the Quake III server sends a UDP heartbeat packet to the master server to announce that it is there and can be listed. The master server then sends a query packet to the Quake server, asking for its game information, such as the server name, the current number of players, and the maximum number of players. Finally, the Quake server sends that information back to the master server.
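
To make the three packets concrete, here is a rough Python sketch of what they look like on the wire. It assumes the usual Quake III connectionless packet layout (four 0xFF bytes followed by a text command) and the "heartbeat", "getinfo" and "infoResponse" commands as I understand them. The function names and the info keys in the usage note are my own illustration, not taken from the engine source.

```python
# Out-of-band marker assumed to prefix Quake III connectionless packets.
OOB_PREFIX = b"\xff\xff\xff\xff"


def build_heartbeat(game: str = "QuakeArena-1") -> bytes:
    """Step 1: game server -> master. Note that no port is carried in the payload."""
    return OOB_PREFIX + f"heartbeat {game}\n".encode("ascii")


def build_getinfo(challenge: str) -> bytes:
    """Step 2: master -> game server, sent to the address the heartbeat came from."""
    return OOB_PREFIX + f"getinfo {challenge}".encode("ascii")


def build_info_response(info: dict) -> bytes:
    """Step 3: game server -> master, as a backslash-delimited key/value string."""
    body = "".join(f"\\{key}\\{value}" for key, value in info.items())
    return OOB_PREFIX + b"infoResponse\n" + body.encode("ascii")


def parse_info_response(packet: bytes) -> dict:
    """Master side: turn the key/value string back into a dictionary."""
    header = OOB_PREFIX + b"infoResponse\n"
    if not packet.startswith(header):
        raise ValueError("not an infoResponse packet")
    # The body starts with a backslash, so the first split element is empty.
    fields = packet[len(header):].decode("ascii").split("\\")[1:]
    return dict(zip(fields[0::2], fields[1::2]))
```

For example, parse_info_response(build_info_response({"hostname": "My Server", "clients": "3"})) gives back the same dictionary. The detail that matters for the rest of this post is that the heartbeat payload says nothing about which port the server listens on.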

Even when the NAT device has the proper forwarding rules in place, the problem starts when the heartbeat packet is sent. Let's say the server is running on port 27960, and the master server's IP address is 1.1.1.1 and it is listening on port 22000. The heartbeat packet is sent from source port 27960 to the master server (1.1.1.1:22000). The PAT device will then change the source port to a random port, let's say 40000. The master server sees the source port of the heartbeat packet as 40000 and thus assumes the server is listening on that port. When the master server sends its query packet to port 40000, the PAT device knows that traffic from 1.1.1.1:22000 to port 40000 should be sent to port 27960 on the Quake server. This part works just fine: the heartbeat reaches the master, the master's query packet reaches the server, and the query response reaches the master.

The problem shows itself when the game clients grab the master list. After receiving the list, the clients make a quick connection to each server in it to measure their latency to that server. The clients think the server is running on port 40000, which is not the case. The PAT device drops the packets from the clients, since it only knows to translate traffic on that port for the master server's IP. The server never shows up in the in-game list, or is listed as unresponsive.
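
The whole sequence can be played out with a toy model of the PAT device. This is only a sketch of the behaviour described above, not of any particular firewall. The PatDevice class, the 192.168.0.10, 203.0.113.5 and 198.51.100.7 addresses, and the sequential port allocation starting at 40000 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


@dataclass
class PatDevice:
    public_ip: str
    # Static forwarding rules: external port -> internal address.
    forwards: Dict[int, Addr] = field(default_factory=dict)
    # Dynamic entries: external port -> (internal address, remote peer it was opened for).
    mappings: Dict[int, Tuple[Addr, Addr]] = field(default_factory=dict)
    next_port: int = 40000  # a real device would pick a less predictable port

    def outbound(self, src: Addr, dst: Addr) -> Addr:
        """Rewrite the source of an outgoing packet; return what the remote peer sees."""
        for ext_port, (inside, peer) in self.mappings.items():
            if inside == src and peer == dst:
                return (self.public_ip, ext_port)
        ext_port = self.next_port
        self.next_port += 1
        self.mappings[ext_port] = (src, dst)
        return (self.public_ip, ext_port)

    def inbound(self, src: Addr, ext_port: int) -> Optional[Addr]:
        """Return the internal destination of an incoming packet, or None if dropped."""
        if ext_port in self.forwards:
            return self.forwards[ext_port]
        entry = self.mappings.get(ext_port)
        if entry is None:
            return None
        inside, peer = entry
        # The dynamic entry only matches traffic from the peer it was created for.
        return inside if src == peer else None


server = ("192.168.0.10", 27960)   # hypothetical internal address
master = ("1.1.1.1", 22000)
client = ("198.51.100.7", 27961)   # hypothetical game client
pat = PatDevice(public_ip="203.0.113.5", forwards={27960: server})

listed_as = pat.outbound(server, master)           # heartbeat leaves from port 40000
query_target = pat.inbound(master, listed_as[1])   # master's query reaches the server
ping_target = pat.inbound(client, listed_as[1])    # client's ping is dropped (None)
manual_target = pat.inbound(client, 27960)         # forwarded port reaches the server
```

Here listed_as ends up with port 40000, the master's query is delivered to the server, and the client's packet to port 40000 comes back as None, which is the drop described above.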

None of this is an issue if a client connects to the server manually using the proper port.

A simple fix would be for the heartbeat packet to contain the port that the server is running on locally (i.e. the Quake III net_port variable), instead of the master server relying on the source port of the heartbeat packet. The additional bandwidth and processing overhead introduced by this should be negligible. The required modification to the Quake III engine is a one-line fix; the master server code, however, at least in the case of dpmaster, will need more extensive modification.
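
As a rough illustration of the idea, here is what both sides might look like in Python. The extended "heartbeat <game> <port>" format is purely hypothetical. It is not something Quake III or dpmaster understands today, and the function names are mine. It also assumes the firewall forwards the same external port number to the server, as in the example above.

```python
from typing import Tuple

OOB_PREFIX = b"\xff\xff\xff\xff"  # out-of-band marker assumed for connectionless packets


def build_heartbeat(game: str, net_port: int) -> bytes:
    """Server side: hypothetical heartbeat that also advertises the listening port."""
    return OOB_PREFIX + f"heartbeat {game} {net_port}\n".encode("ascii")


def listing_address(packet: bytes, observed_src: Tuple[str, int]) -> Tuple[str, int]:
    """Master side: choose the address to put in the server list.

    Prefers the port advertised in the heartbeat and falls back to the
    observed source port for old-style heartbeats that carry no port.
    """
    ip, observed_port = observed_src
    if not packet.startswith(OOB_PREFIX):
        raise ValueError("not an out-of-band packet")
    tokens = packet[len(OOB_PREFIX):].decode("ascii", errors="replace").split()
    if not tokens or tokens[0] != "heartbeat":
        raise ValueError("not a heartbeat packet")
    if len(tokens) >= 3 and len(tokens[2]) <= 5 and tokens[2].isdigit():
        port = int(tokens[2])
        if 0 < port <= 65535:
            return (ip, port)
    return (ip, observed_port)
```

With this, a heartbeat built with net_port 27960 that arrives from 203.0.113.5:40000 would be listed as 203.0.113.5:27960, while an old-style heartbeat would still be listed on the observed source port. The IP address is still taken from the packet itself, since the server has no reliable way of knowing its public address.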
