After several sleepless nights, we now appear to be on top of our downtime crisis, so I can take some time to explain what happened.
Clear Books is a startup – it’s exciting and we are adapting all the time but sometimes we make mistakes. We’re learning from our mistakes, improving all the time and maturing as a business.
Everyone at Clear Books is extremely sorry for the recent downtime. Our reputation and livelihood depend on providing an excellent service, so when there is no service it’s a heart-stopping moment for our entire team.
Thank you to all our customers who have supported us during this difficult period. Your rallying comments really help raise our spirits.
I will begin with a quick summary of how we have changed in our short three-year history.
90 paying customers.
I was a one-man band working from my lounge in my pyjamas.
Any kind of enterprise-level hosting solution was far from my mind at this very early stage. Instead, the focus was on developing a basic application.
800 paying customers.
In an office now. Got some staff.
Then we suffered a major hardware failure on our single database server. Relying on a single database server was a mistake, although such a severe hardware failure also felt like bad luck (but then you have to expect the worst). Customers working over the bank holiday weekend lost that weekend’s data because we had to restore from offsite backups. We reacted immediately by introducing real-time replication of our database server (master/slave) so that data would be replicated and safe in the future.
2,500 paying customers.
Bigger office. More staff.
And now we have just suffered a serious accessibility issue at the web tier. There was a single point of failure: our NFS server. For the techies amongst you, full details are provided here by Mark Sutton, Senior System Architect at CatN. CatN has also apologised to its customers and outlined the changes it is making here, in a post by CatN’s commercial director, Joe Gardiner.
What is CatN?
CatN is Fubra Limited’s cluster hosting solution. Customers should be familiar with Fubra because you sign into Clear Books with your Fubra Passport and you make payments through the Fubra Payments system. It’s widely documented that Fubra Limited is a 50% shareholder in Clear Books too. Clear Books is hosted on the CatN platform.
CatN is in beta. This means CatN is planning to launch to the general public as a fully redundant system in January 2012. At the current time, CatN has no redundancy for its NFS server.
Clear Books Backup Cluster
Clear Books is working with CatN to ensure that Clear Books has full redundancy before January 2012. We have already made significant progress: we were put to the test on Tuesday morning when the NFS server failed again. We were able to change our DNS record and restore access to Clear Books from a backup web server, which kept Clear Books accessible. Shortly, we will be moving to managed DNS so that the switchover will be seamless.
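For the techies amongst you, the decision behind that kind of switchover can be sketched roughly as below. This is only an illustration under stated assumptions, not our actual tooling: the hostnames, the `is_healthy` check and the `choose_target` function are all hypothetical, and the step that actually updates the DNS record is left out because it depends entirely on the DNS provider.

```python
from http.client import HTTPException
from urllib.request import urlopen

# Hypothetical hostnames, used for illustration only.
PRIMARY = "primary.example.com"
BACKUP = "backup.example.com"


def is_healthy(host, timeout=5.0):
    """Return True if the host answers a plain HTTP request with a 2xx status."""
    try:
        with urlopen(f"http://{host}/", timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, HTTPException):
        # Timeouts, refused connections and HTTP error responses all
        # count as "unhealthy" in this sketch.
        return False


def choose_target(primary_ok, backup_ok):
    """Pick the server the DNS record should point at.

    Prefer the primary; fall back to the backup only when the primary
    is down; return None when neither server is reachable.
    """
    if primary_ok:
        return PRIMARY
    if backup_ok:
        return BACKUP
    return None
```

A monitoring job could call `choose_target(is_healthy(PRIMARY), is_healthy(BACKUP))` every minute or so and only touch the DNS record when the answer changes. A managed DNS service would typically run this kind of check on our behalf.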
It’s really difficult to set an expectation for the resolution time of a server issue. By “hoping things will be resolved within the hour” we simply set customers up for potential disappointment and more frustration. Therefore, if we suffer downtime in the future, we won’t put a time estimate on a solution. Instead we will:
ask you to register your email address on our status page (see below)
alert you via email and Twitter when the system is back up
This way, customers can focus on other tasks in the meantime, and we will inform you as soon as the system is back up and running.
We are also exploring incorporating a credit system into our subscriptions such that compensation will be applied to all customer accounts for any significant unscheduled downtime.
Please take note of our Clear Books Status page, which collates tweets from Clear Books and CatN to provide real-time updates. If you cannot access Clear Books, search Google for “Clear Books status” to find this externally hosted website.
In the past we made a mistake with database redundancy and we addressed it. Now we have made a mistake with accessibility, and we are addressing it. We are learning and improving all the time. Judge us on how we respond and we will continue to work hard to build you and your business a bigger and better online accounting system.