Tuesday, June 10, 2008
Webclusters 150 and 156/212
It has become apparent that today's work on the database server has not resolved all of the issues on the above clusters. While the service is working better than it was earlier, the process list on the web servers continues to climb to the point that new connections are not permitted.
At this point we have eliminated the possibility that the fault is within the LVS load balancer (which was replaced this morning) or the SQL server (replaced at 1pm with a new server). We have also eliminated the network as a possible source of the issue, as virtual servers and the mail service are working without issue.
The only other element of the service now in question is the NFS file server. While it is producing no obvious errors, we feel that it is the only remaining possible cause of the issues. A new file server is in the rack and we have just begun transferring data from the old server to the new. We expect that to be substantially completed within the next 4 hours.
UPDATE 8PM
The transfer of files to the new file server is underway and proceeding without issues. The web servers have been pointed to the new file server. Files are being restored from a to z, so sites starting with a and b have already been migrated. Judging by the first hour of transfer, we expect the process to complete in the early hours of the morning.
We would like to thank customers for their patience during this time.
UPDATE 1AM
We are approaching the halfway point in the transfer of sites from the old file server to the new. We expect the remainder of the process to be completed between 6 and 8am.
Webmail services have been restored and are working without issue.
FTP access to the new file server will be suspended until mid-morning Wednesday.
UPDATE 7AM
The file transfer is still running, albeit very slowly, with about 75% of sites completed. Clients whose sites are still not available can email our support address with a list of the affected sites so that we can push them by hand. Priority will be given to business sites.
UPDATE 11 AM
All of the remaining sites should be restored within the next 2 hours. Once that is complete, FTP access will be made available to the new file server. Customers will not need to change any settings in their FTP clients.
UPDATE 2PM
All transfers are complete and there appears to be stability at last. A few bugs with sites cropped up during the process but they have been ironed out. If you are aware of any site which is not working correctly, please raise the issue with our support mail address and we will investigate it.
In summary,
MySQL has moved to a new server, no changes required from customers.
File server is on new hardware, no customer changes needed.
FTP service is up and working, no change to FTP settings needed.
Webmail up and working.
We would again like to thank our customers for their patience during this issue.