Forum OpenACS Q&A: Response to Database errors after power fails

Posted by Rodger Donaldson on
Of course, sometimes UPSes don't help.  I've worked in a number of places where there have been outages because:

1/ The UPS fails when grid power disappears (never tested regularly, probably never worked).

2/ Electricians get involved (nothing like a fumble-fingered electrician to take down your power network).

3/ UPS faults take down the system even though grid power is present (which is why dual power supply systems are best set up with one UPS rail and one grid rail, rather than both on UPS).

4/ Dummies have plugged so much non-essential equipment into the UPS that there isn't enough power for a clean shutdown.

Battery backing your disk arrays and controllers so they can last the few seconds needed to flush to disk in an emergency is another level of protection (NetApps do this, for example), but this is subject to the same potential issues as a UPS.

The only thing you can really do is bite the bullet and pay the penalty of disabling write caching on your drives (or at least the drives that DB writes happen on).  You'll just need to make up the speed with a good system design.
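
To show where the drive's write cache sits in the picture, here is a rough Python sketch of what a "durable" write looks like from the application side. The function name durable_write and the file name in the usage note are made up for illustration; this is not how any particular database does it. The point is that os.fsync() only asks the OS to push the data to the device. If the drive acknowledges the write while the data is still sitting in its own volatile cache, the call returns successfully and the data can still be lost when the power goes, which is why the cache setting matters.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to the storage device.

    Hypothetical helper for illustration only. fsync() covers the OS page
    cache; it cannot help if the drive itself reports completion while the
    data is still in a volatile on-drive write cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Block until the OS has handed the data to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
```

You would call it as durable_write("commit.log", b"commit 42\n"). With write caching disabled on the drive, a successful return is a much stronger promise than with it enabled, at the cost of the speed penalty mentioned above.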