On Monday, January 9, 2012, at around 11:30am ET, a scheduled script started running to clean up some space on the NFS partition. During that clean-up, the partition on the NFS server that serves the CSS and part of our API ran out of disk space.
We detected this error at around 12:15pm ET and put this partition on read-only status until we freed up some space. We started the clean-up process immediately and had the partition back online at 12:30pm ET.
At that point we found that some files on this partition were still corrupted or missing, so we decided to rebuild the partition from scratch and restore a backup onto it.
The restore took about 4 hours to complete, and we were back in action at around 4:30pm ET.
What we will do to prevent this in the future:
- Restore time - An upgrade of our backup solution will be scheduled in our next maintenance window.
- Increase the SAN space reserved for real-time snapshots before running any scripts.
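A related safeguard is to check free space before a maintenance script is allowed to start. The following is only a minimal sketch of that idea, not the tooling actually used here and not the SAN snapshot reservation itself: the function names, the 20% threshold, and the mount path in the usage note are all assumptions made for illustration.

```python
import shutil


def free_fraction(path):
    """Return the fraction of free space (0.0 to 1.0) on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero.
    return usage.free / usage.total if usage.total else 0.0


def run_guarded(job, path, min_free=0.20, probe=free_fraction):
    """Run `job` only if at least `min_free` of the filesystem is free.

    `min_free` (20% here) is an illustrative threshold, not a recommendation.
    `probe` can be swapped out, which makes the guard easy to test.
    Returns True if the job ran, False if it was skipped.
    """
    if probe(path) < min_free:
        return False
    job()
    return True
```

A scheduled job could then be wrapped as `run_guarded(cleanup, "/mnt/nfs")`, where `cleanup` and `/mnt/nfs` are hypothetical placeholders, and raise an alert whenever the call returns `False`.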
We apologize for any inconvenience this may have caused you,
- The CakeMail Team
We are currently experiencing an issue with CSS for our client interface. The team is working on the issue.
Update 2:42pm ET - We are still working on the issue. Affected areas: CSS customization. (- AS)
Update 4:17pm ET - We are at 90% recovery! The process should be completed within the next few hours. (- AS)
Update 4:35pm ET - The issue has now been resolved and we are at 100% recovery! (- AS)