Clear Books
Blog
03 Nov 11

Downtime explained

After several sleepless nights we now appear to be on top of our downtime crisis, so I can dedicate some time to explaining what happened.

Clear Books is a startup – it’s exciting and we are adapting all the time, but sometimes we make mistakes. We’re learning from our mistakes, improving all the time and maturing as a business.

Everyone at Clear Books is extremely sorry for the recent downtime. Our reputation and livelihood depend on providing an excellent service, so when there is no service it’s a heart-stopping moment for our entire team and not much fun.

Thank you to all our customers who have supported us during this difficult period. Your rallying comments really help raise our spirits.

I will begin with a quick summary of how we have changed in our short three-year history.

Year 1

  • 90 paying customers.

  • I was a one-man band working from my lounge in my pyjamas.

Any kind of enterprise-level hosting solution was far from my mind at this very early stage. Instead the focus was on developing a basic application.

Year 2

  • 800 paying customers.

  • In an office now. Got some staff.

Then we suffered a major hardware failure on our single database server. Having a single database server was a mistake, although such a severe hardware failure felt like bad luck (but then you have to expect the worst). Customers working over the bank holiday weekend lost their weekend’s worth of data, as we had to resort to restoring from offsite backups. We reacted immediately by introducing real-time replication of our database server (master/slave) to ensure data would be replicated and safe in the future.

Year 3

  • 2,500 paying customers.

  • Bigger office. More staff.

And now we have just suffered a serious accessibility issue at the web tier. There was a single point of failure with our NFS server. For the techies amongst you, full details are provided here by Senior System Architect at CatN, Mark Sutton. CatN has also apologised to its customers and outlined the changes they are making here in a post by CatN’s commercial director, Joe Gardiner.

What is CatN?

CatN is Fubra Limited’s cluster hosting solution. Customers should be familiar with Fubra because you sign into Clear Books with your Fubra Passport and you make payments through the Fubra Payments system. It’s widely documented that Fubra Limited is a 50% shareholder in Clear Books too. Clear Books is hosted on the CatN platform.

CatN is in beta. What this means is that CatN is planning to launch to the general public as a fully redundant system in January 2012. At the current time CatN has no redundancy for its NFS server.

Clear Books Backup Cluster

Clear Books is working with CatN to ensure that Clear Books has full redundancy before January 2012. We have already made significant progress with this: we were put to the test on Tuesday morning when the NFS server failed again. We were able to change our DNS record and restore access to Clear Books from a backup web server. This ensured Clear Books remained accessible. Shortly, we will be moving to managed DNS so that the switchover will be seamless.
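For the techies, the failover decision described above can be sketched roughly as follows. This is a minimal illustration, not our actual setup: the hostnames, the health-check URL and the function names are all assumptions made up for the example, and the step that actually updates the DNS record is left out because it depends entirely on the DNS provider.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Sequence


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout or HTTP error status.
        return False


def pick_active_server(
    servers: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first server, in priority order, that passes its health check.

    Returns None when every server is down.
    """
    for server in servers:
        if is_healthy(server):
            return server
    return None


if __name__ == "__main__":
    # Hypothetical hostnames, listed in priority order: primary first, backup second.
    candidates = ["web-primary.example.com", "web-backup.example.com"]
    active = pick_active_server(
        candidates,
        lambda host: http_ok(f"https://{host}/health"),
    )
    # A managed DNS service would point the public record at `active` here.
    print("Serve traffic from:", active)
```

Keeping the health check as a function passed in makes the decision logic easy to test without touching the network.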

Communicating

It’s really difficult to set an expectation for the resolution time of a server issue. By “hoping things will be resolved within the hour” we are simply setting customers up for potential disappointment and more frustration. Therefore, if we suffer downtime in the future, we won’t put a time estimate on a solution. Instead we will:

  • ask you to register your email address on our status page (see below)

  • alert you via email & Twitter when the system is back up

This way customers can spend time focusing on other tasks. When the system is back up and running we will inform you immediately.
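As a rough illustration of the "tell me when it's back" idea above, here is a minimal sketch that only sends a message on the transition from down to up. The class name, the `send` callback and the message text are assumptions for the example; it does not describe how our status page is actually built.

```python
from typing import Callable, Set


class StatusNotifier:
    """Track up/down state and alert registered addresses on recovery."""

    def __init__(self, send: Callable[[str, str], None]) -> None:
        self._send = send              # e.g. a function that sends an email
        self._subscribers: Set[str] = set()
        self._was_up = True            # assume the system starts healthy

    def register(self, email: str) -> None:
        """Add an address to be alerted when service is restored."""
        self._subscribers.add(email)

    def record_check(self, is_up: bool) -> int:
        """Record a health-check result; return how many alerts were sent."""
        recovered = is_up and not self._was_up
        self._was_up = is_up
        if not recovered:
            return 0
        for email in sorted(self._subscribers):
            self._send(email, "Clear Books is back up and running.")
        return len(self._subscribers)


if __name__ == "__main__":
    notifier = StatusNotifier(lambda to, msg: print(f"to {to}: {msg}"))
    notifier.register("customer@example.com")  # hypothetical address
    notifier.record_check(False)  # outage starts: nothing is sent
    notifier.record_check(True)   # service restored: one alert goes out
```

Alerting only on the down-to-up transition means nobody receives repeated messages while the system stays healthy.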

Compensation

We are also exploring incorporating a credit system into our subscriptions such that compensation will be applied to all customer accounts for any significant unscheduled downtime.
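We have not settled on how such a credit would be calculated. Purely as an illustration, one simple option is a pro-rata credit: refund the share of the monthly fee that matches the hours lost, once downtime passes some threshold. Every number and name below (the fee, the one-hour threshold, the function itself) is a hypothetical assumption, not a policy.

```python
def pro_rata_credit(
    monthly_fee: float,
    downtime_hours: float,
    days_in_month: int,
    threshold_hours: float = 1.0,
) -> float:
    """Return a credit proportional to the share of the month lost to downtime.

    Downtime below the threshold is treated as not significant and earns no credit.
    """
    if downtime_hours < threshold_hours:
        return 0.0
    hours_in_month = days_in_month * 24
    # Cap at the full month so the credit never exceeds the fee.
    lost_hours = min(downtime_hours, hours_in_month)
    return round(monthly_fee * lost_hours / hours_in_month, 2)


if __name__ == "__main__":
    # Hypothetical example: a 20.00 monthly fee and 36 hours down in a 30-day month.
    print(pro_rata_credit(20.0, 36, 30))
```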

Status Page

Please take note of our Clear Books Status page, which collates tweets from Clear Books and CatN to provide real-time updates. If you cannot access Clear Books, google “Clear Books status” to find this externally hosted website.

What Next?

In the past we made a mistake with database redundancy and we addressed it. Now we have made a mistake with accessibility, and we are addressing it. We are learning and improving all the time. Judge us on how we respond and we will continue to work hard to build you and your business a bigger and better online accounting system.

  • http://www.lucidrep.com Hat Margolies

    Hi Tim,

    Thanks for explaining so fully what happened, but I do think that there were failings in the fact that you couldn’t contact customers and let them know that there was a problem via email and that it was only through twitter that you could find out what was going on. I also found that when I had problems with the system again on Sunday there was a slightly incredulous air to the response, as if it was my fault, rather than that the system had failed again – as it had…
    Once the system was up again, I’m really surprised you haven’t emailed all your customers to apologise and even to offer some sort of compensation for the stress it caused, coming at the end of the month.
    I hope that the problems are solved now, and that you will have a more efficient way of letting people know there is a problem if it occurs on this scale again.

  • Paul

    Thank you for the apology, the informative explanation and most importantly the well explained way forward. Good communication is the keystone to any relationship!

  • Robert

    Many thanks for your full explanation Tim – it is good to know you guys have addressed the issues now – it has put my mind at rest.

  • http://www.clearbooks.co.uk Tim Fouracre

    Thanks for your positive comments.

    I’ve blogged again giving some more details about the next steps we are taking.

    http://www.clearbooks.co.uk/2011/11/04/next-steps-to-a-stable-platform/

  • Adam

    Sorry had to chuckle at this line:

    “CatN is in beta. What this means is CatN is planning to launch as a fully redundant system in January 2012 to the general public. At the current time CatN has no redundancy for its NFS server.”

    Well they’ve got a couple of months then :p

  • http://www.turnkeyit.co.uk Mike Turner

    Tim,

    “CatN is in beta. What this means is CatN is planning to launch as a fully redundant system in January 2012 to the general public. At the current time CatN has no redundancy for its NFS server.”

    I’m quite surprised that Clearbooks is using a hosting system which is in beta and not fully operational or fully supported.

    • http://www.clearbooks.co.uk Tim Fouracre

      @Mike CatN is not a beta company – they have blue chip clients on their private clusters. Their vCluster hosting solution and control panel, which we are currently on, is in private beta and is being used by a select few businesses.

      Clear Books recently recruited a full time sys admin who is working with the CatN engineers and our developers to build our own private Cluster bespoke hosting solution.
