Server Crash every 5 minutes

cookyman · January 5, 2020

These were my values on an out-of-the-box RedHat/CentOS 7 server, before any changes:
$ cat /proc/sys/net/ipv4/tcp_syn_retries
6
$ cat /proc/sys/net/ipv4/tcp_synack_retries
5

Which I changed to:
$ echo 3 > /proc/sys/net/ipv4/tcp_syn_retries
$ echo 3 > /proc/sys/net/ipv4/tcp_synack_retries

How or where can I change this, or see what it is currently set to?

RheaAyase · January 5, 2020

I had moved the server Linux VM to another disk and lowered the memory from 16 GB to 8 GB, and ran into the same problem. After thinking it was due to file corruption and swapping everything around, it turned out to be as simple as giving it back its initial memory.

I think this null pointer problem is simply due to a lack of memory to allocate something... maybe? Bottom line: increase the memory of the server and it might (fingers crossed) solve the problem.

No idea why that solved the problem for you, but that's neither the issue nor the solution (I have about 100 GB of free memory). There is a message saying "mono_fdhandle_insert: duplicate File fd 0".

In other words, it's trying to open a stream that's already open (whether it's a memory stream, file, connection, whatever).

We might be observing two different issues here (depending on whether we run Linux or Windows?).
But since I reduced the SYN retries on my servers, I've not seen this issue anymore (with EAC enabled).
I'd assume the developers would be grateful if we could pinpoint this issue to a soft timeout/cleanup they have to look into.

These were my values on an out-of-the-box RedHat/CentOS 7 server, before any changes:
$ cat /proc/sys/net/ipv4/tcp_syn_retries
6
$ cat /proc/sys/net/ipv4/tcp_synack_retries
5

Which I changed to:
$ echo 3 > /proc/sys/net/ipv4/tcp_syn_retries
$ echo 3 > /proc/sys/net/ipv4/tcp_synack_retries

Don't worry, those settings will not be persistent (they'll be restored at a server reboot) unless you define them in your sysctl configuration (e.g. /etc/sysctl.conf).

This basically changes the total SYN timeout from about 180 seconds to 40 seconds, which seems sufficient to circumvent the bug.
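The arithmetic behind such timeout figures can be sketched as follows, assuming the kernel doubles its wait after every unanswered SYN and gives up once the last retry has also timed out. The initial wait is an assumption: 1 second is what current kernel documentation describes, while the figures quoted above are closer to what an older 3-second initial timeout would give. `syn_timeout` is only an illustrative helper name.

```shell
# Total seconds before a connect attempt is abandoned.
# This is the sum of all doubled waits: initial * (2^(retries + 1) - 1).
# Arguments: $1 = retries, $2 = initial timeout in seconds (assumed, default 1).
syn_timeout() {
    retries=$1
    rto=${2:-1}
    total=0
    i=0
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))   # wait for this attempt
        rto=$((rto * 2))         # next wait is twice as long
        i=$((i + 1))
    done
    echo "$total"
}

syn_timeout 6 1   # 6 retries, 1 s initial timeout -> 127
syn_timeout 3 1   # 3 retries, 1 s initial timeout -> 15
syn_timeout 5 3   # 5 retries, 3 s initial timeout -> 189
syn_timeout 3 3   # 3 retries, 3 s initial timeout -> 45
```

Under the 3-second assumption the result (189 versus 45 seconds) is in the same range as the 180 and 40 quoted above; under the 1-second assumption the drop would be from 127 to 15 seconds.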

Solved for me as well. (Fedora Linux = also RedHat type.)
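For anyone who does want the reduced values to survive a reboot, one option is a drop-in file under /etc/sysctl.d/. A rough sketch, assuming a distribution that reads that directory at boot; the function name and the file name used below are arbitrary choices for illustration, not something taken from the game or the distro.

```shell
# Print sysctl.d-style settings for the two retry counters.
# Arguments: $1 = SYN retries, $2 = SYN-ACK retries.
render_syn_conf() {
    printf 'net.ipv4.tcp_syn_retries = %s\n' "$1"
    printf 'net.ipv4.tcp_synack_retries = %s\n' "$2"
}

# Preview only; nothing is written to disk here.
render_syn_conf 3 3
```

To actually install it, the output could be piped into `sudo tee /etc/sysctl.d/90-tcp-syn-retries.conf` (hypothetical file name) and loaded with `sudo sysctl --system`. Deleting the file and rebooting should bring back the defaults.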

KrazyKrampus · January 6, 2020

Even if adjusting SYN/ACK retries in the network stack fixes the problem, it's not an acceptable workaround IMO. That setting affects things system-wide, and if you have other things running on that server it will impact them too. The issue here, I believe, is the 7DTD code not handling an exception when it can't reach the EAC servers while EAC is enabled. This may be intentional, as someone would be able to circumvent the integrity of the server if it can't communicate with the global ban list, but I think a more "graceful" way of stopping the server would be a better approach. E.g. a system-wide message informing users that EAC is unavailable and that, if it can't be reached within, say, a minute or two, the server will gracefully shut down.

Also, this seems like EAC is experiencing issues on their end, as I've seen other games have this problem (and gracefully inform the user/admins). Essentially I believe this is just TFP forgetting to implement some safety code on their side to prevent the server from blatantly exiting when it can't open a socket to the central hydra host (in AWS) ELB.

BadPlayer · January 6, 2020

Neither do I.

This is a configuration setting in the Ubuntu server OS.

Is that something I can do in Allocs server scripts?

funtimes · January 7, 2020

Sadly the only current solution for linux dedicated servers that I've found that works is disabling EAC. But why should we have to disable something that is supposed to help protect us?

**SylenThunder** · January 7, 2020

piLON's suggestion seems to have taken care of it on the server I was testing this error on. EAC is still enabled on it.

piLON · January 7, 2020

Yeah. Personally I felt that disabling EAC was the worst work-around I could think of, but I merely used it to prove my hypothesis (comment #4 in this thread).

The problem is obviously in the code or libraries of 7daystodie and will hopefully be fixed, but reducing the SYN retries (as I suggested) will (in 99.9% of the cases) not affect anything else nowadays.

But sure, to be able to apply those changes, you need to have root access to the linux server/shell, or ask the owner of the server to apply them.
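A minimal sketch of how the values can be inspected and changed from a shell, assuming a Linux host with the usual /proc/sys layout. The helper names and the `PROC_NET` variable are made up for illustration. The entries under /proc/sys are kernel interfaces rather than regular files, so they are normally read with `cat` and written with a redirect or `tee` instead of being opened in an editor. Note that `sudo echo 3 > file` generally fails, because the redirection is done by the unprivileged shell; piping into `sudo tee` avoids that.

```shell
# Directory holding the IPv4 tunables. PROC_NET is a hypothetical override
# so the helpers can be tried against a scratch directory first.
PROC_NET="${PROC_NET:-/proc/sys/net/ipv4}"

# Print the current value of one tunable, e.g. tcp_syn_retries.
show_tunable() {
    cat "$PROC_NET/$1"
}

# Write a new value (needs root on a real system).
# The change only lasts until the next reboot.
set_tunable() {
    echo "$2" | sudo tee "$PROC_NET/$1" > /dev/null
}
```

For example, `show_tunable tcp_syn_retries` prints the current value and `set_tunable tcp_syn_retries 3` lowers it. The `sysctl` tool does the same job: `sysctl net.ipv4.tcp_syn_retries` to read and `sudo sysctl -w net.ipv4.tcp_syn_retries=3` to write.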

Btw, my server is The World's End, hence I felt quite eager to find the best work-around to the problem.

BadPlayer · January 7, 2020

I switched to LGSM and it has been running smooth as silk.

piLON · January 8, 2020

Okay, you win. I'm just someone who's been in the Linux open source community for 23 years and just wanted to help my fellow players. :-/

BadPlayer · January 8, 2020

That wasn't directed towards you. I was just saying that I switched and now the problem is gone.

KrazyKrampus · January 8, 2020

The SYN workaround is fine if your server is dedicated to 7 Days or you have root, yeah. I run my own server on an Ubuntu bionic 18.04.3 release. EAC is important for running a public server, so disabling it is a bad idea if it can be avoided. My comment was more that I am really surprised TFP haven't released some sort of hotfix sooner for this issue, since I am sure many are being afflicted by it if it is indeed related to some EasyAntiCheat availability problems.

estiaan1234 · January 8, 2020

piLON, how do I reduce SYN retries on Ubuntu 16? I tried following the path you suggested, but the file is empty when opened with nano or vi.

staalelor · January 9, 2020

Same problem on my Debian 10 linux container (lxc).

Have not tried the SYN-hack, but I had to turn EAC off.

Are the developers even aware of this issue at all? Are they reading this?

**SylenThunder** · January 9, 2020

I've been running the SYN hack for several days. The test server falls over less, but it still falls over. A side effect was that, during heavy use, users were experiencing some decent latency. Have returned to normal operations.

To point though, my servers aren't falling over every 5 minutes. We only encountered this about once or twice a day with a 12-hour restart cycle.
