Sunday, December 04, 2011
POP/IMAP/Webmail Servers
The file server responsible for storing mail data has crashed three times in the last three days.
Investigations have shown one of the drives to be generating IO errors.
At midnight the failed drive will be replaced and resilvered.
UPDATE 9am Monday
The drive replacement / resilvering is 50% completed. The file server is operating in a degraded state, so it is running with reduced performance.
We expect the process to be completed by 6pm.
Until that time the mail servers will run with a higher than normal load and may be less responsive than usual.
UPDATE 1pm Monday
Work to bring the file server up to speed with a full disk set should now be completed by 4pm.
We have identified during the work that one of the POP cluster servers is suffering from a faulty network connection. Further investigation will continue this afternoon.
UPDATE 3pm Monday
The process to resilver the missing drive is due to complete at 4.30pm.
While monitoring the server loads, it has become increasingly apparent that a number of customers with POP accounts are storing mail on the server for considerable periods of time. POP is not well suited to storing mail for long periods, and it is apparent that the loads on the POP servers are being caused by this inappropriate use of POP accounts.
We will be performing a review of POP usage over the next week and advising customers where their usage needs to be modified.
UPDATE 8pm Monday
The file system completed the rebuild at 4.30pm. Server loads all returned to normal within a few minutes of the work being completed.
We will monitor the situation over the next 24 hours to ensure there are no further issues.