Re: Don't Fix It if it is Not Broken (was Looking at Macs...)



In article <FGKOe.49191$rp.27881@xxxxxxxxxxxxxxxxxxxxxx>, TheLetterK
<theletterk@xxxxxxxxxxxxxxxxx> wrote:

> > Our other G5 has 8 GBs of RAM, 16 times more RAM, therefore that RAM is
> > 16 times more likely to develop soft failures when it is heavily used.
>
> The chances of failure do not accumulate. If there's a .05% chance of
> data corruption (absurdly high) with 1GiB of RAM, there will also be a
> .05% chance of data corruption with 8GiB. Assuming the physical modules
> are (relatively) identical, of course.

Gadd, you are as stubborn and hard to convince as I am. <g>

Reputable electronics labs do not agree with your conclusion about the
quantity of RAM and the frequency of RAM failure.

A quote, for your enlightenment:

"Statistically soft errors scale linearly with memory size, so 256MB is
twice as likely to see a problem as 128MB."

...see item 4 of the following page:

<http://www.anandtech.com/guides/viewfaq.aspx?i=3>




Their conclusions vary from one soft error every month to one soft
error every six months, depending on which lab is doing the testing.

IBM for example claims one error every month for a 128MB PC100 SDRAM
module. That would translate to 64 soft errors per month for 8GB of
RAM. If the computer were located in Denver with its mile high
altitude, the error rate would be 640 soft errors per month, or
approximately one soft error per hour of steady operation.

That is just for the RAM modules themselves; it does not take into
account the additional soft failures associated with RAM support
circuitry like the memory manager module, or the soft errors associated
with the CPU transistors and all the CPU support registers.

Nor does it take into account the deterioration of the RAM modules with
age; according to IBM, they have about 7 years of reliable life under
steady use.


Don't even try operating the same computer in an aircraft at 60,000
feet, because the soft error rate increases to 6,400 per month, or
approximately one soft error every 7 minutes of steady operation.
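
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes soft errors scale linearly with memory size (as the
quote says), a 30-day month, and altitude multipliers of 10x (Denver) and
100x (60,000 feet), which are simply what the 640 and 6,400 figures
imply. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope soft error scaling, using the figures quoted above.
# Assumptions: linear scaling with RAM size, a 30-day month, and altitude
# factors of 10x and 100x inferred from the 640 and 6,400 figures.

BASE_MODULE_MB = 128           # IBM figure: 1 error/month per 128MB module
BASE_ERRORS_PER_MONTH = 1.0
MINUTES_PER_MONTH = 30 * 24 * 60


def errors_per_month(ram_mb, altitude_factor=1.0):
    """Expected soft errors per month for ram_mb of RAM."""
    return BASE_ERRORS_PER_MONTH * (ram_mb / BASE_MODULE_MB) * altitude_factor


def minutes_between_errors(rate_per_month):
    """Average gap between soft errors, in minutes."""
    return MINUTES_PER_MONTH / rate_per_month


ram_mb = 8 * 1024  # 8GB
for label, factor in [("sea level", 1), ("Denver", 10), ("60,000 ft", 100)]:
    rate = errors_per_month(ram_mb, factor)
    gap = minutes_between_errors(rate)
    print(f"{label:>10}: {rate:7.0f} errors/month, one every {gap:.2f} minutes")
```

With a 30-day month the three gaps come out at 675, 67.5 and 6.75
minutes, so "about one per hour" in Denver is on the generous side.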



> > Bottom line on this lightly loaded Mac, little RAM to fail, and the
> > CPU is idling with a lot of its internal circuits not being used, so
> > few soft errors will happen with this Mac.
> The frequency of errors shouldn't change dramatically with
> the amount of load the system is under.

Incorrect. Most of the RAM in a lightly loaded Mac is not even being
used, so any soft failures in the RAM which is not being used will not
cause corruption.

Put another way, from single-user mode it is easy to "cut-out" most of
your RAM, such that only one-sixteenth of your 8GBs of RAM is being
used, therefore you will get only one-sixteenth of the soft RAM
failures on a lightly loaded G5 computer.

It does not really matter whether you manually cut-out the RAM or not,
because if most of the RAM is not being used, it is not being used.

Same as if it were never installed in the first place.

And remember, the more RAM actively being used, the higher the failure
rate, according to IBM (and others).



> > Therefore I can not agree that a whole bunch of 3rd party app's trying
> > to run at once do not heavily contribute to file corruption caused by
> > 'soft' failures.
>
> Can you backup your claims with some hard evidence?

Yes, plenty of references regarding various labs' testing conditions
are readily available to you from that same website.




> > Bottom line, extra maintenance can often benefit a heavily loaded Mac,
> > or a Mac in a hot environment, or a Mac with old RAM and old
> > components.
>
> No, no it can't. OS X will almost always do a better job of disk
> maintanence than it's administrator will. The only time an
> administrator should be involved is when *** hits the fan.

...locking the barn door after the horse has been killed by a mountain
lion does not make sense to me.

If minor corruption of files is not fixed by periodic maintenance, such
corruption can cause additional user files to become corrupted.

The "damage" an administrator causes by "unnecessary testing" is often
far outweighed by heading off little problems before they become big
problems.

File damage occurs over time; that is an easily provable fact.

If that damage is allowed to accumulate, because some misguided
administrator thinks nothing is wrong, the damage can build up to the
point where it causes severe problems, which could have been prevented.



Time for a war story - - -

I had slightly flaky RAM on an old Mac, nothing I could really put my
finger on. Everything appeared to be running normally on that old
Mac.

I could have let the situation go, until some nasty mountain lion
gobbled up my Mac and turned everything to ***.

But I didn't. I kept poking and prodding that Mac, also doing stuff
like reseating RAM modules, etc.

I was convinced in my own mind that the RAM was going south, even
though there were no overt indications, other than the results of my
testing.

Even memtest said the RAM was okay, but we all know that memtest does
not always catch bad RAM, don't we?

Well, to make this long story longer, I removed all the RAM, replaced
it with fresh new RAM.

Voila! - - - all my poking and prodding now revealed that nothing was
wrong with the new RAM. That old Mac never skipped a beat, never did
corrupt any files, and never will, thanks to my "preventative"
maintenance.

Now for the punch line. I placed the old RAM into a Mac I did not care
about, and after a few short weeks of operation that old RAM failed
miserably.

Did I do good in preventing a failure of my online Mac? You bet I did.

My only indication of impending failure - - - was the fact that areas
of the disk partition that should have been all hex zeroes sometimes
had non-zeroes (because the flaky RAM occasionally caused the
OS/file-system to write to areas of disk that it should not be writing
to).

With the new RAM, that anomaly did not occur any more.
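
For anyone who wants to try a similar check, here is a minimal sketch in
Python. It is not the tool used in the story above, just one way to scan a
byte range for non-zero data. Which ranges "should" be all zeroes depends
on the file system and on how the partition was prepared, so the caller
has to supply the start offset and length; the function name and the
scratch file in the demo are hypothetical.

```python
import os
import tempfile


def find_nonzero_offsets(path, start, length, chunk_size=1 << 20, limit=16):
    """Return up to `limit` absolute offsets in [start, start + length)
    whose byte is not zero. `path` can be a disk image or a device node
    you have permission to read."""
    hits = []
    with open(path, "rb") as f:
        f.seek(start)
        pos = start
        remaining = length
        while remaining > 0 and len(hits) < limit:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:          # hit end of file early
                break
            if any(chunk):         # skip chunks that are entirely zero
                for i, byte in enumerate(chunk):
                    if byte:
                        hits.append(pos + i)
                        if len(hits) >= limit:
                            break
            pos += len(chunk)
            remaining -= len(chunk)
    return hits


if __name__ == "__main__":
    # Demo on a scratch file standing in for a disk image: 8KB of zeroes
    # with a single stray byte planted at offset 5000.
    data = bytearray(8192)
    data[5000] = 0xFF
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        print(find_nonzero_offsets(tmp.name, 4096, 4096))
    finally:
        os.unlink(tmp.name)
```

An empty list means the range was clean on that pass. Since the symptom
was intermittent, one clean pass would not prove much; the check would
have to be repeated over time.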

End of War Story - - -

Mark-