
Linux Server - randomly kicking all players


ilganna


Hello,

 

I have an annoying issue with my Linux dedicated server (LGSM): it randomly kicks all the players (including me, and I play over LAN), and I'm not sure of the cause. The issue occurs 2-3 times per day (the server restarts every 6 hours).

 

More background information:

 

The server runs on a VM with CentOS 7.4 (LGSM), with 4 CPUs assigned, 8GB RAM and plenty of SSD disk space.

 

The host OS is ESXi 6.5 running on a Supermicro SYS-5028D-TN4T: 8-core/16-thread Intel Xeon CPU D-1541 @ 2.10GHz, 64GB of RAM, 4 NICs (2x Gbit, 2x 10 Gbit), 4TB RAID 10 SATA HDD (RAID card LSI MegaRAID 9341-8i) and a 1TB Samsung SSD dedicated to game server VMs.

 

Also, the server is on a business-class G.fast DSL line (500Mbit down / 100Mbit up), and the issue is not related to possible WAN packet drops, since in my case I connect over LAN (other people connect via the Internet). Also, I have other VMs running and they are not disconnected from the Internet.

 

I have uploaded yesterday's logs here:

 

https://pastebin.com/UfBY2VRh

 

And as you can see from line 3700 onwards, the disconnects are happening.

 

Note that the server is running in vanilla mode, no mods or XML modifications of any kind.

 

World is RWG, seed is xa2019 and map size is 4096.

 

Thanks for any tips!

 

ilganna


Well, one odd thing is that you're running CentOS 7.4, and the client reads it as unknown 3.7. That seems pretty odd to me, but then again, I've always had issues running it on CentOS. (To be fair though, the last time I played with CentOS was like A15. I pretty much stick to Ubuntu and Debian.) Not sure if that is an indication of an issue, or just the client not reading the OS data correctly.

 

When the server is starting up, it does get a bind exception when assigning ports. You'll want to verify that nothing else is using those ports already. (Could be as simple as a previously running client that crashed, and the process is still up.)

 

Depending on what port it is, the issue may not appear immediately. Sadly, the log doesn't say which port is giving the error.
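
One way to check whether a port is already taken is to try binding to it while the game server is stopped. A minimal Python sketch follows; `port_is_free` is a hypothetical helper, and the port numbers in the loop are placeholders to be replaced with the values from your own server config.

```python
import socket

def port_is_free(port, proto="udp", host="0.0.0.0"):
    """Return True if a socket can be bound to (host, port) right now."""
    kind = socket.SOCK_DGRAM if proto == "udp" else socket.SOCK_STREAM
    with socket.socket(socket.AF_INET, kind) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use", or "permission denied"
            # for a privileged port without sufficient rights.
            return False
    return True

if __name__ == "__main__":
    # Placeholder ports; substitute the ones from your server config.
    for port, proto in [(26900, "udp"), (26900, "tcp")]:
        state = "free" if port_is_free(port, proto) else "in use"
        print(f"{port}/{proto}: {state}")
```

Run it with the game server stopped; a port reported as "in use" at that point belongs to some other process (or a leftover one).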

 

If you fix the port issue, and are still having the problem, disable LineNetLib. I really don't know why, but it seems that each new network protocol Unity comes up with tends to cause issues for some people. The latest isn't nearly as bad as Unet was, but often resolving connection issues is as simple as disabling this protocol in the config.

 

You will want to remove SteamNetworking from the disabled protocols in the server config, as leaving it disabled will prevent people from connecting to the server, and from seeing it in the server list.


Hello SylenThunder,

 

First of all thanks for your kind reply! :smile-new:

 

Regarding the OS version, that is right, as what gets reported is the kernel version (my mistake, it is CentOS 7.6; 7.4 is what I use at work):

 

[root@gs02 ~]# cat /etc/centos-release

CentOS Linux release 7.6.1810 (Core)

[root@gs02 ~]# uname -r

3.10.0-957.21.2.el7.x86_64

 

 

I have checked the server config and I'm using the following ports:

 

<property name="ServerPort" value="26900"/>

<property name="ControlPanelPort" value="8082"/>

<property name="TelnetPort" value="8083"/>

 

Firewall config:

[root@gs02 ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens192
  sources:
  services: ssh dhcpv6-client
  ports: 8082-8083/tcp 26900/udp 26900/tcp 26901-26902/udp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

 

On the server I see the following used ports:

 

[root@gs02 ~]# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:8082            0.0.0.0:*               LISTEN      27820/./7DaysToDieS
tcp        0      0 0.0.0.0:8083            0.0.0.0:*               LISTEN      27820/./7DaysToDieS
tcp        0      0 0.0.0.0:26900           0.0.0.0:*               LISTEN      27820/./7DaysToDieS
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6948/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      7438/master
tcp6       0      0 :::22                   :::*                    LISTEN      6948/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      7438/master
udp        0      0 127.0.0.1:323           0.0.0.0:*                           5807/chronyd
udp        0      0 0.0.0.0:26900           0.0.0.0:*                           27820/./7DaysToDieS
udp        0      0 0.0.0.0:26902           0.0.0.0:*                           27820/./7DaysToDieS
udp6       0      0 ::1:323                 :::*                                5807/chronyd

 

So this does not show me any colliding ports; I checked this morning's log at 6:00 AM, and nobody was connected before the restart, yet the bind exception was still there :upset:
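
For anyone repeating this check, the comparison between the firewall rules and the listening sockets can also be done mechanically. The sketch below is only an illustration: `expand_ports` is a hypothetical helper, the values are copied by hand from the firewall-cmd and netstat output above, and only the sockets owned by the game server process (PID 27820) are included.

```python
def expand_ports(spec):
    """Expand a firewalld port list such as '8082-8083/tcp 26900/udp'."""
    opened = set()
    for item in spec.split():
        ports, proto = item.split("/")
        first, _, last = ports.partition("-")
        for port in range(int(first), int(last or first) + 1):
            opened.add((port, proto))
    return opened

# Copied from the firewall-cmd and netstat output in this post.
FIREWALL = "8082-8083/tcp 26900/udp 26900/tcp 26901-26902/udp"
LISTENING = {(8082, "tcp"), (8083, "tcp"), (26900, "tcp"),
             (26900, "udp"), (26902, "udp")}

opened = expand_ports(FIREWALL)
blocked = sorted(LISTENING - opened)  # listening, but not allowed through
unused = sorted(opened - LISTENING)   # allowed through, nothing listening
print("blocked:", blocked)  # blocked: []
print("unused:", unused)    # unused: [(26901, 'udp')]
```

So every socket the server listens on is covered by the firewall, and 26901/udp is open without a listener at the moment of the capture. Whether that last point matters is not something this output alone can tell.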

 

I will remove the SteamNetworking entry from the disabled protocols straight away; I forgot that line... (even though my friends were able to see the server in the server list and connect to it successfully)

 

I will check for a couple of days if this issue comes back and in that case I'll try to disable LineNetLib.

 

PS: if you're courageous enough to try CentOS again, my advice is to disable SELinux straight after a fresh install of the OS; it causes unnecessary trouble and unexpected behaviour on the system you're administering :chuncky:

 

[root@gs02 ~]# sestatus

SELinux status: disabled

 

 

Thanks again,

 

ilganna


Update:

 

Removed SteamNetworking from disabled protocols - problem still occurs.

Added LineNetLib to disabled protocols - connecting to the server and logging into the world works, but then the game becomes unplayable: FPS drops to single digits, and the game freezes for seconds, unfreezes, and then the freeze/unfreeze cycle starts over.

 

One thing to note is that at the beginning of our adventures (clean server, just rolled out), the disconnection issue was not present; it started when we went heavy on digging tunnels, underground bases and traps.

 

Thanks

 

ilganna


Update:

 

Made progress on the bind exception issue: I found out that IPv6 was the culprit, so I disabled it on the Debian VM and the error went away (I have a dual-stack IPv4 & IPv6 Internet connection, but I have not yet enabled IPv6 on my main router; I just want to be sure everything is fine with IPv4 first).

 

Here are freshly uploaded logs on pastebin:

 

https://pastebin.com/6s8nTk7S

 

In order to find the issue, I had to check the console logging; on the CentOS VM I had this entry (which I didn't bother to follow up on):

 

Console logging disabled: Bug in tmux 1.8 breaks logging

https://linuxgsm.com/tmux-upgrade

Currently installed: tmux 1.8

 

While on the Debian VM I have:

 

eac_server.so [x64] :: OnLoad()

System.Net.Sockets.SocketOptionName 0x1b is not supported at IPv6 level

[S_API FAIL] SteamAPI_Init() failed; SteamAPI_IsSteamRunning() failed.

Setting breakpad minidump AppID = 251570
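
For reference, 0x1b is decimal 27, which is the value .NET assigns to `SocketOptionName.IPv6Only`. The sketch below is one hedged way to probe, from Python, whether the host will create an IPv6 socket with that option cleared (that is, a dual-stack socket). It does not reproduce the game's networking code, and the idea that the server was attempting a dual-stack socket is an assumption; `dual_stack_supported` is a hypothetical helper.

```python
import socket

def dual_stack_supported():
    """Return True if an IPv6 socket can be created with IPV6_V6ONLY cleared."""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as probe:
            probe.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            probe.bind(("::", 0))  # port 0: let the OS pick any free port
    except (OSError, AttributeError):
        # OSError: the kernel refused one of the steps, for example because
        # it has no IPv6 support. AttributeError: this Python build does not
        # expose the IPV6_V6ONLY constant.
        return False
    return True

print("dual-stack sockets available:", dual_stack_supported())
```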

 

Now I will have to wait for people to connect to the server and see if disconnect is still happening...

 

Thanks,

 

ilganna


Update 2:

 

Unfortunately the Debian server still disconnects the players... At this point, the only possible cause I can think of is that we somehow overloaded the server with all the defense systems, electrical devices, electrical traps, underground buildings, tunnels, etc.

 

Anyhow, I will see if the issue also comes back on an empty/fresh server.

 

Thanks,

 

ilganna


I think your problem is a networking issue, and not related to having too much going on on the server (defenses, electrical, underground builds). It could be a protocol problem or a ping timeout maybe, or firewall, or ???.

 

The reason I say this is that, in my experience, the symptoms of server overload or corruption are rubberbanding, lag, zombie AI misbehaving, teleporting vehicles, or even a server crash... You haven't described any of these symptoms, so I don't think that's the issue.

 

Another thing to note is that, because you're using a VM, you have two network setups that need to be right.

 

First is the network port of the VM to the Host, and second is the network port of host to the world.

 

Both networks - ports, protocols, firewalls, nats, etc. - have to be right.

 

Does everyone disconnect at the same time? Does the game server crash?

 

From the looks of your machine (4 network ports), you might have a number of VLANs, hardware firewalls/routers, port aggregation, etc. to look at.


Hi Beelzybub,

 

As troubleshooting steps, keeping networking in mind as you pointed out, I did the following:

 

I reset my router/bridge to factory defaults and reconfigured it manually, swapped all Ethernet cables, and tested all day with pings over the Internet (pinging Cloudflare's DNS - 1.1.1.1 - because I don't like Google :p) and to the local router: no packet loss.

I tested the game, though not yet with many players online, and it seems no disconnects were experienced. (My issue was only disconnection from the game; the dedicated server was still running on the Linux VM.)

 

Just to clarify, here's a simple summary of my setup:

 

ESXi Host (4x nic in teaming) -> Cisco LAN Switch -> Router -> xDSL Bridge -> Internet

 

Since the Cisco LAN switch had only one cable connecting it to the router, it could be that the cable was faulty or, if the issue appears again, that the Cisco switch itself is faulty. I did some tests from my machine by continuously pinging the router (my machine is on the router's built-in switch) and had no issues at all.

 

I'm keeping my fingers crossed that this is now solved once and for all :)

 

Thanks,

 

ilganna


I just noticed that, in this thread, you guys have been misspelling "litenetlib" as "linenetlib". So that would be something to check in your disabled protocol line.
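
A quick way to catch this kind of typo is to compare the entries in the config against the exact protocol names. The following is a minimal sketch under stated assumptions: the property name `ServerDisabledNetworkProtocols`, the `ServerSettings` root element and the comma-separated value format are assumed rather than confirmed in this thread, `unknown_protocols` is a hypothetical helper, and the known names are just the two discussed here.

```python
import xml.etree.ElementTree as ET

# Protocol names as spelled in this thread; extend if your version has more.
KNOWN_PROTOCOLS = {"LiteNetLib", "SteamNetworking"}

def unknown_protocols(config_xml, prop="ServerDisabledNetworkProtocols"):
    """Return disabled-protocol entries that match no known name exactly."""
    root = ET.fromstring(config_xml)
    for node in root.iter("property"):
        if node.get("name") == prop:
            entries = [p.strip() for p in node.get("value", "").split(",")]
            return [p for p in entries if p and p not in KNOWN_PROTOCOLS]
    return []

# Illustrative config fragment containing the misspelling from this thread.
sample = """<ServerSettings>
  <property name="ServerDisabledNetworkProtocols" value="LineNetLib"/>
</ServerSettings>"""
print(unknown_protocols(sample))  # ['LineNetLib']
```

The comparison is case-sensitive on purpose; whether the game itself is that strict is not known from this thread.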

 

Also @SylenThunder, can you disable both SteamNetworking and LiteNetLib? What protocol would get used?


Sure, you can disable both. It would just use the internal Unity protocol I believe. I don't recall off the top of my head which others are currently loaded. RAKNet might still be in there too. Disabling SteamNetworking will prevent the server from showing in the list, so anyone connecting will need to use IP and Port. As for LiteNetLib being typoed, if it was me, it was just a typo. Hell I did it when typing it just now, and the only reason I can think of for doing so is that I'm old. When editing my config, I usually copy-paste the protocols. :p


Update:

 

I finally found the root issue: after days of pings, troubleshooting, hardware relocation, monitoring 3-4 VMs with pings to the WAN, Wireshark captures, etc., I found out that the link between my server router and the Cisco core switch was dropping packets randomly (around 5-10 per day).

I figured it out because I was also pinging the same gateway from my client network (I have another router and Internet access (cable ISP) connected with an Ethernet cable to the business Internet connection), and it was not dropping a single packet. So again, this pointed to the link between the business router and the Cisco switch. I swapped the Cisco switch for another one I had (different brand) and boom, problem solved: no more randomly dropped packets, and the server is rock solid now :smile-new:
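
With only 5-10 drops per day, losses are easy to miss in a scrolling ping window. A hedged sketch of one way to find them afterwards: save the output of a long-running `ping` to a file and list the sequence numbers that never got a reply. `missing_sequences` is a hypothetical helper, and the sample lines are illustrative, in the format Linux `ping` prints, not taken from the captures described here.

```python
import re

# Only count real replies ("64 bytes from ..."), not error lines.
REPLY = re.compile(r"bytes from .*icmp_seq=(\d+)")

def missing_sequences(ping_lines):
    """Return the icmp_seq numbers between the first and last reply
    that never got a reply."""
    answered = set()
    for line in ping_lines:
        match = REPLY.search(line)
        if match:
            answered.add(int(match.group(1)))
    if not answered:
        return []
    return [n for n in range(min(answered), max(answered) + 1)
            if n not in answered]

sample = [
    "64 bytes from 1.1.1.1: icmp_seq=1 ttl=57 time=9.10 ms",
    "64 bytes from 1.1.1.1: icmp_seq=2 ttl=57 time=9.32 ms",
    "64 bytes from 1.1.1.1: icmp_seq=5 ttl=57 time=9.05 ms",
]
print(missing_sequences(sample))  # [3, 4]
```

Running the same capture against the same gateway from two different network segments, as described above, narrows down which link is losing packets. Two limits of this sketch: it cannot see losses before the first or after the last reply, and it assumes the capture is short enough that the sequence counter does not wrap around.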

 

Thanks very much SylenThunder & Beelzybub for your help, really appreciated!

 

I have one last, non-important question: is there a way to stop the server from advertising itself over LAN? For example, if I connect my PC directly to the business router's LAN, I see the private IP of the server in the server browser instead of the public one.

 

Thanks,

 

ilganna


Yeah, both will typically show. It's only going to happen on your private network though. The only real way around it would be to DMZ the server, and that's one of the worst things you could do. (Although, depending on the router, even DMZ may not make a difference since it will still translate an internal IP in the NAT. I'm looking at you, AT&T/Motorola-Arris.)


Hi,

 

I can easily remove the LAN advertisement by:

 

1. modifying the firewall on the game server machine (replace xxx.xxx.xxx.xxx/xx with your LAN network):

firewall-cmd --add-rich-rule='rule family=ipv4 source address=xxx.xxx.xxx.xxx/xx port port=26900-26902 protocol=udp reject' --permanent

 

Since I have two Internet connections (one business line for the server network and one for home use), this would force me to play over the Internet.

 

2. Or as a more elegant solution, you can use VLANs to separate server network from the client one.
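
For option 1, a small sketch of how the command could be generated with the network validated first, so that a mistyped CIDR does not end up in the firewall. `lan_block_command` is a hypothetical helper, and 192.168.1.0/24 is only an example network, not one from this thread.

```python
import ipaddress
import shlex

def lan_block_command(cidr, ports="26900-26902", proto="udp"):
    """Build the firewall-cmd call that rejects game traffic from one network."""
    # strict=True raises ValueError if host bits are set (e.g. 192.168.1.5/24).
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4:
        raise ValueError("this rule uses family=ipv4")
    rule = (f"rule family=ipv4 source address={network} "
            f"port port={ports} protocol={proto} reject")
    return "firewall-cmd --add-rich-rule=" + shlex.quote(rule) + " --permanent"

print(lan_block_command("192.168.1.0/24"))
```

Note that rules added with --permanent only become active after `firewall-cmd --reload`.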

 

Thanks,

 

ilganna


Archived

This topic is now archived and is closed to further replies.
