Re: Recovery via Unrecovery
- From: David Gallatin <dgallatin@xxxxxxxxxxxxx>
- Date: 4 Sep 2007 19:33:34 GMT
On Sun, 02 Sep 2007 18:00:15 +0000, David Gersic wrote:
> The chief problem here seems to be a lack of testing.
Testing. Testing! *froths at the mouth for a few minutes*
The inability of some people to understand the essential need for thorough
testing would astound me, if my ability to feel surprise at the stupidity
some people display hadn't already been burnt out from continuous overload.
*sets the not-so-way-back machine to last week*
A couple of my co-workers and I had spent the last month or
so preparing to roll out some updates to our main web servers. A shiny new
service, some bug fixes, a couple of other minor tweaks. A goodly portion of
this time was spent testing everything as best as we could to make certain
that nothing would unexpectedly fail come the big switch. Many minor
changes were made and potential problems were nipped in the bud. However,
we knew that the tendrils of long-standing systems were many, and that
the updates might well affect older functions that we did not know of and
therefore could not test. So we asked the cow-orkers who regularly spoke
with the customers, and whose job it was to know what services and aspects
of the server were being used, to test everything and report the results.
Come the day of the switch, we had green lights reported from everyone. A
few more bugs had been found and fixed, and the commands were given. The
new, improved servers were placed into position and the old ones were
pulled back.
And everything worked perfectly, right? Of course not. Clients called.
Cow-orkers complained. Management asked pointed questions. We pulled the
new servers right back out again and began to track down what had
happened. Every single error would have shown up with the bare minimum of
testing. A full three-quarters of the errors could be tracked down to one
specific cow-orker's clients. Her response when asked why she hadn't
reported the problem? 'Oh, was that today?' She'd done no testing at all,
but rather had rubber-stamped the go-ahead.
I've added the areas that generated errors to my testing list, so
those particular items won't cause a problem again, but that doesn't
remedy the core problem. I really don't want to get to the point where I
*have* to test everything myself, but if I don't do it myself I distrust
the result.
I need a drink.