Author Topic: Downtime July 2011 – A Second Statement  (Read 1795 times)

Offline Suzy Scott

« on: July 15, 2011, 06:21:33 PM »
Downtime July 2011 – A Second Statement

Further to the statement I issued on Wednesday 13th July 2011 at http://angliaandthamesvalleybusforum.com/index.php?topic=5102.msg45533, a lot has happened in 48 hours.

Firstly, I would like to formally thank you all for all the support – here, by email, text, PMs, Facebook and the rest. I will ensure that I reply personally to them all by the end of the weekend. I do appreciate everything, but I wanted to bring you all up to speed with some important finds and developments since then.

On Thursday 14th July, a complaint (similar to what was here, but with pronouns in the right direction, and adding extra detail) was faxed to Heart Internet. Within a few hours, one of the directors passed our case on to their Customer Services Manager, who reassured me she was on the case. Firstly, the £25 + VAT = £30 that I paid last Saturday for a restore (that did not work!) has been refunded to the debit card I used, which is tied into our joint account. (It’s not back yet, but it is clearly on its way.)

Subsequently, on the evening of Thursday 14th July, Alison (CSM at Heart) said that someone had apparently logged in legitimately via FTP on the morning when we experienced downtime, i.e. Thursday 7th, and emptied Settings.php – i.e. wiped the file, which brought the house down.

As you can perhaps imagine, this had me concerned. Currently, none of the other Administrators have FTP access, and I was asleep on the coach at the time quoted. So, naturally, I asked for an IP address.

Today, there were emails back and forth. It now transpires that we were *not* hacked… or indeed logged into (eh?). The Settings.php file simply self-combusted, for want of a better word.

Long Explanation
 “There's no evidence of a manual update via FTP, however googling the behaviour comes up with:
http://www.simplemachines.org/community/index.php?topic=210487.0
So apparently SMF tries to write the latest error message to Settings.php whenever one happens. If two happen at once, it gets blanked. This is called a "race condition", I'll explain - if two different requests are "A" and "B":
A1 Read Settings.php into a cache
A2 Open Settings.php for writing (Settings.php is empty)
B1 Read Settings.php into a cache (empty!)
A3 Write cache to Settings.php (Settings.php has content)
B2 Open Settings.php for writing (Settings.php is empty)
B3 Write cache to Settings.php (Settings.php is empty)
It's the kind of mistake that even experienced programmers can make, and it's one that does get exacerbated when the machine is under high load:
if each rewrite takes 0.2s, it's "very unlikely" to get two at once, but if each rewrite takes 1s, it's just "unlikely".
I'd recommend just updating to the latest SMF (2.0), hoping that will resolve the issue, and just being aware that it can happen in future. If you really want to, you could in theory set up a cron job to fix Settings.php if it is empty, but given the rarity of the issue that would be of little value.”
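For the technically curious, the sequence Heart describes can be replayed in a few lines. This is only a sketch in Python (SMF itself is written in PHP, and this is not SMF's code). `replay_interleaving` runs steps A1 to B3 in the order quoted, and `atomic_write` shows one common way to avoid the problem, by writing a temporary file and renaming it over the original. The function names, the file contents and the `$example_setting` variable are all made up for illustration.

```python
import os
import tempfile


def replay_interleaving(path):
    """Replay steps A1..B3 from the quoted explanation, in that exact order."""
    with open(path) as f:          # A1: request A reads the file into its cache
        cache_a = f.read()
    handle_a = open(path, "w")     # A2: opening for writing truncates the file
    with open(path) as f:          # B1: request B reads the now-empty file
        cache_b = f.read()
    handle_a.write(cache_a)        # A3: A writes its cache back (file has content)
    handle_a.close()
    handle_b = open(path, "w")     # B2: B truncates the file again
    handle_b.write(cache_b)        # B3: B writes its empty cache (file stays empty)
    handle_b.close()
    with open(path) as f:
        return f.read()


def atomic_write(path, text):
    """Write to a temporary file in the same folder, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(text)
    # The rename swaps the whole file in one step, so the original is never
    # left truncated while the new contents are still being written.
    os.replace(tmp_path, path)


def demo():
    """Return the file contents after the replayed race and after an atomic write."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "Settings.php")
        with open(path, "w") as f:
            f.write("<?php $example_setting = 1; ?>")  # hypothetical contents
        after_race = replay_interleaving(path)
        atomic_write(path, "<?php $example_setting = 2; ?>")
        with open(path) as f:
            after_atomic = f.read()
    return after_race, after_atomic
```

Calling `demo()` gives back the contents after the replayed race (blank, as in our outage) and after the atomic write. Whether SMF 2.0 actually takes this approach is not something the explanation above confirms.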

Apparently Heart think it’s not too big a problem, but informal communication with the original Forum host suggests he sees this all too often.
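The cron-job idea mentioned in Heart's explanation could look roughly like the sketch below. It is an assumption about how such a check might be written, not something Heart or SMF supplied: it presumes a known-good backup copy of Settings.php is kept somewhere, and the function name and both paths are hypothetical.

```python
import os
import shutil
import tempfile


def restore_if_empty(settings_path, backup_path):
    """Copy the backup over the settings file if it is missing or zero bytes.

    Returns True if a restore happened, False if the file looked fine.
    """
    if os.path.exists(settings_path) and os.path.getsize(settings_path) > 0:
        return False
    shutil.copyfile(backup_path, settings_path)
    return True


def demo():
    """Blank a throwaway settings file, then check that it gets restored once."""
    with tempfile.TemporaryDirectory() as folder:
        settings = os.path.join(folder, "Settings.php")          # hypothetical path
        backup = os.path.join(folder, "Settings_backup.php")     # hypothetical path
        with open(backup, "w") as f:
            f.write("<?php $example_setting = 1; ?>")            # hypothetical contents
        open(settings, "w").close()                              # simulate the blanked file
        first_run = restore_if_empty(settings, backup)
        second_run = restore_if_empty(settings, backup)
        with open(settings) as f:
            restored = f.read()
    return first_run, second_run, restored
```

Run from cron every few minutes, a check like this would only touch the file when it is blank or missing. It restores whatever is in the backup, so any setting changed since the backup was taken would be lost.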

Short explanation - WE WERE NOT HACKED. WE WERE NOT DELETING THE FORUM. It broke on the morning of a service outage that just happened to be the same morning that I went away up north. Your data remains safe! I have given my Administrators permission to install updates on the day they see them – in future – to draw the strings even tighter. This may result in 5-10 seconds of downtime, which may generate a “try again” error, a few times a year (2-3). I had a problem with an early update, but they have all been rock-solid since!

So, the Forum has two months free of charge, and I have said we will give it three months. This will hopefully give me enough time to make the right choice, and not a rushed one. But, if we get service like this again, I’ll be off quicker than you can imagine.

New hosts – a few words to say. I will come back to you re the various suggestions. We originally took A&TVBF (or ABF as it was) away for financial reasons. The free package was no longer provided, but I decided to stay with Heart as well as I could. So far, aside from Caroline’s Camino and personal websites, I’ve only got two websites moved across to Heart – this Forum, and dundeebuses.info (which was a useful yardstick when wondering what was happening with this Forum!). A&TVBF has not been hacked yet, whereas Dundee Area Bus Forum was skewered once by someone taking advantage of server permissions – something the host has now removed, presumably to prevent anyone else from doing the same! This was a good 5/6 years ago, though!

I will certainly be looking at every alternative. The options vary from moving everything onto a new host completely (longest time) to moving the two websites (plus C’s stuff) back to Evo, paying for multiple packages if needed. I also found another webhost that might be able to give us a chance, but everything would need to be moved then.

In any case, moving away may cost less than what we are paying now. With Heart I had to take out reseller packages, to get multiple domains. Even adding more space with the old host will only be a pound or two less. However, if Forum Aid becomes necessary, I will be making an announcement here first.

Also, one of my fellow Admins has suggested we start with a new SMF 2.0 database – which might well be good as part of a move. What does this mean to the average non-geek? Nothing. No extra downtime (five minutes more maximum), but a lot smoother service in the background.

Do have a good weekend! I have left this thread open too, so feel free to reply or comment!

Suzy (Forum Admin)