Saturday, February 10, 2007

 

Service events

Virtual Server Host 48 was power cycled today after it ceased to respond. Remote power cycling failed to bring it back up. After a short delay, on-site operatives gave the unit an attended reboot, bringing it back into full service.

The host supplying egress filtering for the web clusters failed in a similar way, but did not return from a power cycle - we suspect a PSU failure. As a result of this incident we have lost a primary resolving domain name server (not ns1 or ns2), which adds some latency to internal system and ADSL customer resolutions. This has resulted in a high load on cluster machines requesting data over HTTP from external sources.

A byproduct of this will be a lack of logging information for the immediate future. We don't envisage a loss of logging data - this is simply a loss of service delivery, not of data.

Changes will be effected across the clusters to remove this issue. In the interim, pages that are reliant on external data will run slowly while the request times out, or appear not to run at all.
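For customers whose pages fetch external data, one way to limit the impact of a slow or failed outbound request is to set a short timeout and fall back to a default value. The following is only a minimal sketch in Python, not a description of how the clusters are configured; the function name `fetch_external`, the 3 second timeout, and the empty fallback are all illustrative assumptions.

```python
import urllib.request


def fetch_external(url, timeout=3.0, fallback=b"", opener=urllib.request.urlopen):
    """Return the body of `url`, or `fallback` if the request fails or times out.

    `fetch_external`, the 3 second default, and the `opener` parameter are
    hypothetical choices for this sketch; adjust them to suit the page.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # urllib.error.URLError and socket timeouts are both subclasses of
        # OSError, so this covers refused, unreachable, and timed-out requests.
        return fallback
```

A page could then call `fetch_external("http://example.com/feed.xml", fallback=cached_copy)`, where the URL and `cached_copy` are placeholders, so that it renders promptly with older data rather than waiting on a request that cannot complete.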

[1400 Update] Changes have been made to allow the cluster services to route their outbound HTTP requests via another egress filter, marking a return of service.

[Tuesday Update]
We are on site working on the failed egress server.




