Re: Don't Fix It if it is Not Broken (was Looking at Macs...)



In article <BF3669AF.34973%OhNoSPAM@xxxxxxxxxxx>, MR_ED_of_Course
<OhNoSPAM@xxxxxxxxxxx> wrote:

> The dispute is how often radiation causes errors for consumers, and then
> from that what action consumers should need to take.
>
> Where most of us are disagreeing with you Mark is that we see far less:
> Radiation -> RAM -> common every day file corruption
>
> Than we do:
> Causes X,Y,Z -> rare file corruption
>
> Where causes are most likely something like:
> X = drive failure
> Y = software error
> Z= user error


Okay, I can live with that line of thinking.

How many of us have been dead certain that we knew the exact cause of
a particular file being corrupted, only to find out later that we were
wrong, and that some other totally unrelated "thing" caused the
corruption?



> TechTool still does a memory check by writing the following patterns
> throughout the memory matrix and then reading and verifying each
> pattern:
> 10101010
> 01010101
> 11110000
> 00001111
> 11111111

Yep, MMMT (Mickey Mouse Memory Tests). Not much better than the POST
tests that OS X does at cold startup.
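
For readers who have not seen one, a fixed-pattern test of the kind
quoted above boils down to the following. This is only a minimal
Python sketch run against a simulated memory, not TechTool's actual
code. The names StuckBitMemory and pattern_test, and the stuck-bit
fault itself, are assumptions made up for illustration.

```python
# Simulated fixed-pattern memory test (illustrative only, not TechTool's code).
PATTERNS = [0b10101010, 0b01010101, 0b11110000, 0b00001111, 0b11111111]


class StuckBitMemory:
    """Simulated RAM; optionally one bit of one byte is stuck at 0 (hypothetical fault)."""

    def __init__(self, size, bad_addr=None, bad_bit=0):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.bad_mask = 1 << bad_bit

    def __len__(self):
        return len(self.cells)

    def write(self, addr, value):
        if addr == self.bad_addr:
            value &= ~self.bad_mask & 0xFF  # the stuck bit always stores 0
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def pattern_test(mem, patterns=PATTERNS):
    """Fill memory with each pattern, read it back, return failing addresses."""
    failures = set()
    for pattern in patterns:
        for addr in range(len(mem)):
            mem.write(addr, pattern)
        for addr in range(len(mem)):
            if mem.read(addr) != pattern:
                failures.add(addr)
    return sorted(failures)
```

In this simulation a healthy memory returns an empty list, while
StuckBitMemory(64, bad_addr=10, bad_bit=3) is reported as [10]. Every
cell holds the same value during each pass, so the test says little
about faults that depend on what neighbouring cells are doing.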



> I don't recall the memory test in previous versions of TechTool doing
> anything more substantial...other than having a very sexy voice.

Believe me, they are *much* more comprehensive in the older version,
doing much more than the utility "memtest" for example. The old
version can even do the "Major March" RAM test, which is ordinarily
only done on the RAM of super-computers. If run on a Lombard
PowerBook, the estimated time to complete Major March is 662 days, or
nearly 2 years.

I normally only do the hour long tests, like "Rotational", "Web",
"Leap", "Arpeggio", which take four hours total.



> Though again, none of this should be considered "maintenance".

It's maintenance in the sense that it prevents additional file
damage from occurring, which would otherwise happen if your RAM
suddenly went bad and you did not test it.




> I think you're confusing neutron bombs with EMP devices.

Probably - - - I did not read enough of that dry scientific article
from that website to see whether the guy tied in the neutron
bombardment to computer failure.

For certain, massive neutron particle bombardment would not do a
computer's RAM any good.

A lot depends on the speed of those neutrons.

If they are up near the speed of light, they have a tremendous
penetrating power, no "shielding" is going to stop them.





We could make a very long list indeed about what might cause file
corruption, when using consumer-grade computers.

1) Vibration
2) Extreme heat or cold
3) Fine dust
4) High humidity
5) Jarring the computer
6) Intermittent connections of all sorts
7) Flaky parts due to manufacturing defects
8) Radiation and high-speed particle bombardment
9) Deterioration of parts at the molecular level
10) Low voltages caused by any number of things,
causing the RAM cells to not be "refreshed" properly,
and some weaker cells therefore being interpreted as "0"
when they should be "1"


....and of course, the failure of older disks due to media damage, head
crashes, "stiction", worn and misaligned drive mechanisms.

....buggy software code can cause file corruption

....user accidentally corrupting a file, without noticing it

...."flipping" of RAM cells due to "crosstalk" from adjacent RAM cells
and support circuits

(I ran into a classic case of this happening, and the only RAM test
that detected it was the "Arpeggio" pattern on the old OS9 version
of TechTool Pro)
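
To illustrate why a plain fill-and-verify test can miss this kind of
fault, here is a minimal Python sketch on a simulated one-bit-wide
memory. It assumes the textbook "inversion coupling" fault model (a
0-to-1 write to one cell flips a neighbouring cell) and uses the
textbook March C- algorithm as the stronger test. How "Arpeggio" works
internally is not stated here, so this is not a reconstruction of it.
CoupledBits, fill_and_verify and march_c_minus are hypothetical names.

```python
class CoupledBits:
    """Simulated 1-bit-wide RAM with one hypothetical coupling fault:
    a 0 -> 1 write to `aggressor` flips the bit stored at `victim`."""

    def __init__(self, size, aggressor=None, victim=None):
        self.bits = [0] * size
        self.aggressor = aggressor
        self.victim = victim

    def __len__(self):
        return len(self.bits)

    def read(self, addr):
        return self.bits[addr]

    def write(self, addr, value):
        rising = self.bits[addr] == 0 and value == 1
        self.bits[addr] = value
        if rising and addr == self.aggressor:
            self.bits[self.victim] ^= 1  # crosstalk disturbs the neighbour


def fill_and_verify(mem):
    """Fixed-pattern test: fill with 0s, verify, fill with 1s, verify."""
    for value in (0, 1):
        for addr in range(len(mem)):
            mem.write(addr, value)
        if any(mem.read(addr) != value for addr in range(len(mem))):
            return False
    return True


def march_c_minus(mem):
    """Textbook March C-: read-then-write sweeps in both address orders."""
    n = len(mem)
    up, down = range(n), range(n - 1, -1, -1)
    for addr in up:
        mem.write(addr, 0)
    # Each element: (address order, value expected on read, value written back)
    for order, expect, new in ((up, 0, 1), (up, 1, 0), (down, 0, 1), (down, 1, 0)):
        for addr in order:
            if mem.read(addr) != expect:
                return False
            mem.write(addr, new)
    return all(mem.read(addr) == 0 for addr in up)
```

With the simulated fault CoupledBits(16, aggressor=5, victim=6), the
fill-and-verify test passes, because the victim is overwritten right
after it is disturbed. The march test reads each cell before writing
it, so it catches the flipped bit.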

A flaky memory-manager chip on the CPU board caused intermittent
freezes of my Lombard 6 years ago. It took Apple 3 months and 3
turn-arounds to fix it, despite my telling them that TechTool Pro's
report specifically stated that the problem could be due to a bad
memory-manager chip.

Apple eventually replaced the chip, and the freezes no longer occurred.

With the new chip in my computer, it passed TechTool's "Arpeggio"
pattern RAM test.



None of the present RAM tests available for OS X test specifically
for crosstalk-induced failures - - - "memtest" can't, for
example.

According to Micromat, PC users still have access to this kind of RAM
testing. Micromat claims they dropped the test from the OS X version
of TechTool because it "took too long" to run those tests.

I never found the time oppressive as long as I stuck to doing the
reasonable hour-long tests. Oh well...



> In your test, you're ignoring all possibility of any kind
> of software error.

Not entirely. I realize that these millions of lines of code could
hide mistakes that might well cause file corruption.

On the other hand, using all these apps and utilities on a daily basis
tends to make one trust them somewhat.
(assuming they are not _too_ buggy)

Everyone marches to their own drummer. I have had more bad luck with
RAM than most users, so I tend to be more suspicious of RAM than
most...

All we can say for certain about software errors is that they are rare
enough to make apps usable, but not so rare that we would trust them
with our lives. (except the space shuttle guys, of course, and they
have been brainwashed to believe that software is reliable)<g>






As regards the _major_ causes of file corruption, I guess we are all
left with our own opinions, which may or may not reflect reality.

It would be interesting to hear if someone has already approached this
subject in a scientific manner, and nailed down the real causes.



BTW, after a week I finally nailed down _which_ file had the
corruption that caused my error message:

"Beta tree node has invalid value"

It was the actual backup file I have been using regularly for at least
two months, without any problems.

Recently, that file became corrupt, for some reason. I went through
my entire computer, doing many hours of disk scans, many hours of RAM
testing, everything I could think of that might have corrupted that
file.

All the tests checked out okay, so I am at a loss as to what originally
caused the corruption.

Mark-