Server Broken and Then Unbroken

Posted Apr 29, 2011 at 7:20 pm

The following chronology of events is brought to you with limited commercial interruption by Death Bat 4, in theaters some time in the late 70’s.

Sunday 4/24

It all started Sunday when Dog noticed the Left4DoD server hosted on my server was lagging horribly. I checked the graphs on server stats and saw nothing unusual. There were no increases in load or network usage anywhere. Very curious…

Monday 4/25

I noticed the same game server lagging horribly again. Curiously, no other services were affected, including other game servers. It’s also doubtful the Left4DoD plugin is to blame since other servers with the same plugin were fine.

Tuesday 4/26

The entire server locked up for about an hour. All processes were being blocked. The graphs showed a huge spike in load, memory, processes, and everything else immediately before the freeze. It kind of looked like a DoS attack. I put in a ticket with the datacenter asking them to investigate, but they didn’t find anything.

Wednesday 4/27

While poking around the server, I realized it was running Fedora 11, which is very old. So I decided it was time to upgrade to Fedora 15! The process of remotely upgrading Fedora via yum is messy, to say the least. So after hours of updating, resolving dependencies, and removing old junk, the upgrade was complete!

Later that night, Dog and I noticed the Left4DoD server lagging again. Restarting and disabling the plugin had no effect, so we started suspecting a DoS attack. After changing the port, the lag immediately stopped. That all but confirmed our suspicion and we went to work hardening the server against the various DoS vulnerabilities in srcds. Dog technologied an anti-DoS plugin and I scienced together some iptables rules, and the problem hasn’t occurred since.
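For anyone curious what this kind of hardening boils down to, here is a minimal Python sketch of per-source rate limiting using a token bucket. It is not the actual plugin or the iptables rules described above; the class name, rate, and burst values are all hypothetical and only illustrate the general idea. iptables can do similar per-source limiting in the kernel with its hashlimit match.

```python
import time


class PerSourceRateLimiter:
    """Token-bucket limiter keyed by source address (illustrative sketch only)."""

    def __init__(self, rate_per_sec, burst):
        self.rate = float(rate_per_sec)  # tokens refilled per second
        self.burst = float(burst)        # maximum tokens a source can hold
        self.buckets = {}                # addr -> (tokens, last_seen_time)

    def allow(self, addr, now=None):
        """Return True if a packet from addr should be accepted."""
        now = time.monotonic() if now is None else now
        # A source seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(addr, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[addr] = (tokens - 1.0, now)
            return True
        self.buckets[addr] = (tokens, now)
        return False


# Hypothetical numbers: 10 packets/sec sustained, bursts of up to 5.
limiter = PerSourceRateLimiter(rate_per_sec=10, burst=5)
flood = [limiter.allow("203.0.113.7", now=0.0) for _ in range(8)]
```

In this sketch a source that floods the port burns through its burst allowance and gets dropped, while other sources keep their own separate buckets and are unaffected.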

Thursday 4/28

I noticed no emails had been getting through the server since the upgrade to Fedora 15. The email services were all managed by ISPConfig, so I thought I’d update that. Well, that didn’t go well, and I’d been dissatisfied with it before, so I decided to replace it with Webmin. Installing it was easy.

I set up my main websites and copied the files to the new directories. Most systems were fully operational. However, the email situation was only made worse by these changes.

Friday 4/29

I spent most of the day fixing the email situation. Everything relating to emails seemed to be failing. First I fixed postfix somehow (I forget; something about the hostname) to allow the server to send and receive mail. Meanwhile, I got the rest of the website stuff set up again.

The next step was to get Dovecot working again so I could get my mail out of the server. It was not accepting my credentials. After figuring out how to fix the logging, I was able to see that Dovecot couldn’t find where the mail was kept. A small change in the configuration solved that and everything was happy again in email land.

However, there seemed to be some DNS issues. I think everything is right in that area after lots of BIND tweaking, but I’m not sure. To be continued…

Conclusion

Comment if you notice any problems accessing this site.

In closing:

2 Comments

  • Dog

    You did a great job, as ever! I didn’t realize you had a whole pile more work on top of just the srcds hardening. I hope it all gets fixed soon!
    Oh and the anti-DoS stuff is working a treat!

    • StevoTVR

      Yeah, everything should be working now.

