
Server Crash every 5 minutes


estiaan1234


These were my values (on an out-of-the-box RedHat/CentOS 7 server) before any changes:

$ cat /proc/sys/net/ipv4/tcp_syn_retries

6

$ cat /proc/sys/net/ipv4/tcp_synack_retries

5

 

Which I changed to:

$ echo 3 > /proc/sys/net/ipv4/tcp_syn_retries

$ echo 3 > /proc/sys/net/ipv4/tcp_synack_retries

 

 

How or where can I change this, or check what it is currently set to?


I had moved the Linux server VM to another disk and lowered its memory from 16 GB to 8 GB, and ran into the same problem. After suspecting file corruption and swapping everything around, the fix turned out to be as simple as giving it back its initial memory.

 

I think this null pointer problem is simply due to a lack of memory to allocate something... maybe? Bottom line: increase the server's memory and it might (fingers crossed) solve the problem.

 

No idea why that solved the problem for you, but that's neither the issue nor the solution (I have about 100 GB of free memory). There is a message saying "mono_fdhandle_insert: duplicate File fd 0".

 

In other words, it's trying to open a stream that's already open (whether it's a memory stream, a file, a connection, whatever).

 

We might be observing two different issues here (depending on whether we run Linux or Windows?).

But since I reduced the SYN retries on my servers, I've not seen this issue anymore (with EAC enabled).

I'd assume the developers would be grateful if we could pinpoint this issue to a soft timeout/cleanup they have to look into.

 

These were my values (on an out-of-the-box RedHat/CentOS 7 server) before any changes:

$ cat /proc/sys/net/ipv4/tcp_syn_retries

6

$ cat /proc/sys/net/ipv4/tcp_synack_retries

5

 

Which I changed to:

$ echo 3 > /proc/sys/net/ipv4/tcp_syn_retries

$ echo 3 > /proc/sys/net/ipv4/tcp_synack_retries

 

Don't worry, those settings are not persistent (they'll be reset at the next server reboot) unless you also define them in sysctl.
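
For anyone wondering how to check these values or make them survive a reboot, here is a minimal sketch, assuming a system that reads drop-in files from /etc/sysctl.d and a sysctl that supports --system (CentOS 7 does both). The helper name write_syn_retries_conf and the file name 99-tcp-syn-retries.conf are made up for illustration; the sysctl keys are the same two settings as the /proc paths above.

```shell
# Hypothetical helper: writes both settings, in sysctl.conf syntax,
# to the file given as $1. The retry count is $2 (defaults to 3).
write_syn_retries_conf() {
  conf_path=$1
  retries=${2:-3}
  printf 'net.ipv4.tcp_syn_retries = %s\nnet.ipv4.tcp_synack_retries = %s\n' \
    "$retries" "$retries" > "$conf_path"
}

# Show the values currently in effect (no root needed):
#   sysctl net.ipv4.tcp_syn_retries net.ipv4.tcp_synack_retries
#
# Persist and apply them (as root; the file name is an assumption):
#   write_syn_retries_conf /etc/sysctl.d/99-tcp-syn-retries.conf 3
#   sysctl --system
```

To undo the change, delete the drop-in file and reboot, or write the old values back with the same helper.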

 

This basically changes the total SYN timeout from about 180 seconds to 40 seconds, which seems sufficient to circumvent the bug.
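
To sanity-check the numbers, here is a rough sketch of the arithmetic, assuming a simplified model: the first SYN waits one initial RTO, every retransmission doubles the wait, and each wait is capped at 120 seconds. The function name syn_timeout and its parameters are made up for illustration. The kernel documentation gives 127 seconds for the default of 6 retries with a 1-second initial RTO, which this model reproduces. The 180/40 figures above are closer to what the model gives for a 3-second initial RTO with 5 and 3 retries, so the real timeout depends on your kernel.

```shell
# syn_timeout RETRIES [INITIAL_RTO] [MAX_RTO]
# Prints the total seconds before a connect() attempt gives up,
# under the simplified doubling model described above.
syn_timeout() {
  retries=$1
  rto=${2:-1}        # initial retransmission timeout in seconds (assumed)
  max_rto=${3:-120}  # cap on a single wait in seconds (assumed)
  total=0
  i=0
  # One wait after the original SYN plus one after each retransmission.
  while [ "$i" -le "$retries" ]; do
    total=$((total + rto))
    rto=$((rto * 2))
    if [ "$rto" -gt "$max_rto" ]; then
      rto=$max_rto
    fi
    i=$((i + 1))
  done
  echo "$total"
}

syn_timeout 6 1   # 6 retries, 1 s initial RTO
syn_timeout 3 1   # 3 retries, 1 s initial RTO
syn_timeout 5 3   # 5 retries, 3 s initial RTO
syn_timeout 3 3   # 3 retries, 3 s initial RTO
```

Under these assumptions the four calls print 127, 15, 189 and 45. Either way, the reduction is large.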

 

Solved for me as well (Fedora Linux, which is also in the Red Hat family).


How or where can I change this, or check what it is currently set to?

 

Even if adjusting the SYN/ACK retries in the network stack fixes the problem, it's not an acceptable workaround IMO. That setting applies system-wide, so it will also affect anything else running on that server. The issue here, I believe, is the 7DTD code not handling an exception when it can't reach the EAC servers while EAC is enabled. This may be intentional, since someone could circumvent the integrity of the server if it can't communicate with the global ban list, but I think a more "graceful" way of stopping the server would be a better approach. E.g. a system-wide message informing users that EAC is unavailable and that, if it can't be reached within, say, a minute or two, the server will shut down gracefully.

 

Also, this looks like EAC is experiencing issues on their end, as I've seen other games have this problem (and gracefully inform the users/admins). Essentially, I believe this is just TFP forgetting to implement some safety code on their side to prevent the server from abruptly exiting when it can't open a socket to the central hydra host (an AWS ELB).


Sadly, the only solution I've found so far that works for Linux dedicated servers is disabling EAC. But why should we have to disable something that is supposed to help protect us?

piLON's suggestion seems to have taken care of it on the server I was testing this error on. EAC is still enabled on it.


Yeah. Personally I felt that disabling EAC was the worst work-around I could think of, but I merely used it to prove my hypothesis (comment #4 in this thread).

The problem is obviously in the code or libraries of 7daystodie and will hopefully be fixed, but reducing the SYN retries (as I suggested) will (in 99.9% of the cases) not affect anything else nowadays.

But sure, to apply those changes you need root access to the Linux server/shell, or you have to ask the owner of the server to apply them.

 

Btw, my server is The World's End, hence I felt quite eager to find the best work-around to the problem.


Yeah, the SYN workaround is fine if your server is dedicated to 7 Days or you have root. I run my own server on Ubuntu 18.04.3 (bionic). EAC is important for running a public server, so disabling it is a bad idea if it can be avoided. My comment was more about being really surprised that TFP haven't released some sort of hotfix for this sooner, since I am sure many are affected by it if it is indeed related to EasyAntiCheat availability problems.


I've been running the SYN hack for several days. The test server falls over less, but it still falls over. A side effect was that, during heavy use, users experienced noticeable latency. I have returned to normal operations.

 

To be clear, though, my servers aren't falling over every 5 minutes. We only encountered this about once or twice a day with a 12-hour restart cycle.


Archived

This topic is now archived and is closed to further replies.
