Re: Weird kernel errors
- From: Chris Croughton <chris@xxxxxxxxxxxx>
- Date: Sat, 31 Dec 2005 13:32:20 +0000
On Thu, 01 Dec 2005 18:53:55 +0000, Alex Butcher
<alex.butcher.news1005@xxxxxxxxxxxxxx> wrote:
> On Thu, 01 Dec 2005 13:24:58 +0000, Chris Croughton wrote:
>
>> I'm using a Gentoo system, with kernel 2.6.10-gentoo-r6, on an Athlon
>> 1800XP system (Jetway motherboard). I'm getting a couple of oddities,
>> both annoying:
>>
>> The time is horribly variable. On occasions it suddenly loses 4-5
>> seconds. This may be related to the kernel error message:
>>
>> [kernel] Hangcheck: hangcheck value past margin!
>>
>> which as far as I can see means that a kernel timer has 'hung' for a
>> long time (when it occurs it seems to repeat about every 3-4 minutes).
>> This is worrying...
>
> I guess that could be down to buggy TSC implementation on your CPU (which
> is what hangcheck uses) or perhaps something is causing the machine to
> pause for >=180 seconds more than hangcheck is expecting.
>
> Is the BIOS flashed to the most recent version? (I'm thinking of ACPI bugs
> here)
>
> Is power management enabled in the BIOS?
Yes to the BIOS, and I can't see a power management option. But oddly the behaviour seems to have
disappeared, no sign of the error message for several weeks now. And I
didn't do anything. Chrony is now happily keeping the time synchronised
to within a few milliseconds of the remote server (it was jumping about
all over the place).
>> The second is that my hard disks won't accept DMA enabling, so are slow.
>
> Some drivers/controllers won't let you enable DMA using hdparm, but will
> happily continue using it if it has already been setup by the BIOS before
> the kernel loads.
I've found it, eventually. It was setting up my new AMD64 machine that
found it: I ran hdparm -Tt when I was installing and it reported nice
speeds. Yesterday I put in a 250GB drive and ran hdparm on it -- no
DMA. No DMA on any of the drives. I wondered whether I'd broken
something. Then I thought to retry with the boot/install CD (not
something I can generally do with the fileserver) -- and DMA was enabled
and speeds were high again! So I looked at dmesg from both the LiveCD
and the normal boot, and found that although both said that they were
using the generic IDE drivers the LiveCD one indicated that it was a VIA
82* controller and my one didn't. So I rebuilt the kernel with the VIA
82C* driver and it all enables DMA happily.
So then I rebuilt the fileserver kernel with the same change, and
rebooted. Having forgotten to rerun LILO (the AMD64 uses GRUB), it then
hung and I had to get a LiveCD for the x86 to run LILO. And then it
happily enabled DMA and I get around 60MB/s.
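A check like the one described above (is DMA on for a given drive?) can be scripted. The following is only a minimal sketch: it assumes that `hdparm -d <device>` prints a line of the form `using_dma = 1 (on)`, and the function name `dma_enabled` and the sample text are made up for illustration, not taken from the original post.

```python
import re
from typing import Optional

def dma_enabled(hdparm_output: str) -> Optional[bool]:
    """Return True/False for the DMA flag, or None if no flag line is found.

    Assumes the text comes from `hdparm -d <device>` and contains a line
    such as ' using_dma    =  1 (on)'.
    """
    match = re.search(r"using_dma\s*=\s*(\d+)", hdparm_output)
    if match is None:
        return None  # output did not contain a using_dma line
    return match.group(1) != "0"

# Hypothetical sample of what a drive without DMA might report.
SAMPLE = """\
/dev/hda:
 using_dma    =  0 (off)
"""

if __name__ == "__main__":
    print(dma_enabled(SAMPLE))
```

If this reports no DMA on every drive, comparing dmesg between a working boot and the failing one, as was done here, is a reasonable next step to spot a missing chipset driver.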
(Weirdness: I set the acoustic management to "slow and quiet" (-M128)
and the drive seems to be faster!)
> What values does 'hdparm -tT' report? Have you tried booting the kernel
> with 'elevator=deadline'? What effect does that have on the figures
> reported by 'hdparm -tT'?
It was getting around 5MB/s, around 7MB/s if I forced 32 bit transfers
(-c1). Now I'm getting 30MB/s on the 30GB drive, 50MB/s on the 80GB
drive and 60MB/s on the 250GB drive. Which is rather better, and
actually saturates the NFS throughput rather than being the
bottleneck...
> So CurCHS/CurSects seems to be the 8GByte ATA limit. This doesn't matter,
> as Linux is using the LBA48 geometry to access the entire disc. LBAsects
> must reflect the 137.4GByte LBA28 limit.
Yes, doing -I gives
CHS current addressable sectors: 4128705
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 488397168
device size with M = 1024*1024: 238475 MBytes
device size with M = 1000*1000: 250059 MBytes (250 GB)
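The sector counts above can be turned back into sizes by hand. This is a small sketch of that arithmetic, assuming the usual 512-byte sector (the constant and helper names are illustrative only): the LBA28 figure is 2**28 - 1 sectors, which is where the 137.4GByte limit comes from, and the LBA48 figure reproduces the two device sizes hdparm prints.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte ATA sectors

LBA28_SECTORS = 268435455  # "LBA user addressable sectors" from hdparm -I
LBA48_SECTORS = 488397168  # "LBA48 user addressable sectors" from hdparm -I

def size_bytes(sectors: int) -> int:
    """Total capacity in bytes for a given sector count."""
    return sectors * SECTOR_BYTES

def size_mib(sectors: int) -> int:
    """Capacity with M = 1024*1024, truncated to a whole number."""
    return size_bytes(sectors) // (1024 * 1024)

def size_mb(sectors: int) -> int:
    """Capacity with M = 1000*1000, truncated to a whole number."""
    return size_bytes(sectors) // (1000 * 1000)

if __name__ == "__main__":
    # LBA28 can address at most 2**28 - 1 sectors, about 137.4 GB.
    print(LBA28_SECTORS == 2**28 - 1, size_bytes(LBA28_SECTORS) / 1e9)
    # The LBA48 count gives the full drive size in both conventions.
    print(size_mib(LBA48_SECTORS), size_mb(LBA48_SECTORS))
```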
> Various BIOS LBA48 implementations, chipsets, kernels and partitioning
> tools have various bugs that mean that sometimes partitions aren't created
> on the boundaries that other partitioning tools (and, worse,
> filesystems/block device layers) think they should have. Some of these
> bugs can be fatal. Yes, it sucks, particularly if you're running more than
> one OS (e.g. Windows, which has been reported to write outside its
> allotted partitions if they aren't "right").
I've seen that. If I actually need to dual-boot I use Partition Magic
(I know, it's non-free (as in beer as well as freedom) and runs on
'doze, but it's the best I've found) to create the partitions and Linux
or 'doze to format them as appropriate, that way everyone is happy.
> Read more about LBA48 at <http://www.48bitlba.com/>
Thanks. "No one will ever need more than a 8/32/137GB disk..." I have
over two terabytes of disk now in my house (that's frightening!), but
most of it is in USB drives and the largest drive I have so far is
250GB. It may take me a while to reach the LBA48 limit...
Chris C