Sites outage

time to read 1 min | 182 words

We have an outage that appears to have taken roughly 12 hours.

The reason it took so long to fix, it was after business hours, and while we have production support for our clients, we never hooked up our own websites to our own system. A typical case of the barefoot shoemaker.

The reason for the outage? Also pretty typical:

image

The reason for that? We had a remote backup process that put some temp files and didn’t clean them up properly. The growth rate was about 3-6 MB a day, so no one really noticed.

The fix:

image

All is working now, I sorry for the delay in fixing this. We’ll be having some discussion here to see how we can avoid repeat issues like that.