Re: micro-optimization



Dave Hart wrote:
On Mar 11, 8:55 am, Martin Burnicki <martin.burni...@xxxxxxxxxxx> wrote:

Dave Hart wrote:
I have released a new test version of 4.2.4p6 with numerous Windows-
specific improvements compared to the baseline 4.2.4p6.  Since my last
release, the most significant change is to read the processor cycle
counter using the RDTSC instruction directly when it is equivalent to
QueryPerformanceCounter.  When it is not equivalent, ntpd is allowed
to roam freely across all logical processors once again.

As a consequence of the above, I'd say using RDTSC directly instead of QPC
is a step in the wrong direction. From what I've seen, adding the
/usepmtimer switch should fix problems on systems where the TSC is used even
though it is not reliable.

This should also make it unnecessary to nail down all threads to a single
CPU, as suggested in bug
#1124: https://support.ntp.org/bugs/show_bug.cgi?id=1124

Or am I missing something?

I think so. This version only uses RDTSC if QueryPerformanceCounter
is using RDTSC. If the HAL is using a different timer for QPC, this
version uses QPC.

The question about "missing something" was meant to refer to the way QPC
works under certain HALs/CPUs/mainboards. ;-))

Regarding bug 1124: with this version, threads are not nailed down to a
particular processor except when RDTSC underlies QPC.
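
The selection rule described above can be sketched roughly as follows. This
is only an illustration, not the code from the test release: the names
(counter_policy, choose_counter, freq_matches) are hypothetical, and the
idea of detecting "QPC is backed by RDTSC" by comparing the QPC frequency
with the nominal CPU clock within a 5% tolerance is an assumption made for
the example. On Windows the two inputs would come from something like
QueryPerformanceFrequency and the CPU speed reported by the system; here
they are plain parameters so the logic stands on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the ntpd sources. */
typedef enum { SRC_QPC, SRC_RDTSC } counter_source;

typedef struct {
    counter_source source;   /* which counter to read */
    bool pin_to_one_cpu;     /* whether threads get pinned to one processor */
} counter_policy;

/* Returns true if qpc_hz lies within tolerance_pct percent of cpu_hz. */
static bool freq_matches(uint64_t qpc_hz, uint64_t cpu_hz,
                         unsigned tolerance_pct)
{
    uint64_t diff = qpc_hz > cpu_hz ? qpc_hz - cpu_hz : cpu_hz - qpc_hz;
    return cpu_hz != 0 && diff * 100 <= cpu_hz * tolerance_pct;
}

/*
 * Assumed heuristic: if QPC ticks at about the CPU clock rate, treat it as
 * RDTSC-backed, read RDTSC directly, and pin threads to one processor.
 * Otherwise keep calling QPC and leave the threads free to roam.
 */
static counter_policy choose_counter(uint64_t qpc_hz, uint64_t cpu_hz)
{
    counter_policy p;
    if (freq_matches(qpc_hz, cpu_hz, 5)) {
        p.source = SRC_RDTSC;
        p.pin_to_one_cpu = true;
    } else {
        p.source = SRC_QPC;
        p.pin_to_one_cpu = false;
    }
    return p;
}
```

For example, a QPC frequency of 3,579,545 Hz (the ACPI PM timer rate) on a
2.4 GHz machine would select QPC with no pinning, while a QPC frequency
equal to the CPU clock would select RDTSC with pinning.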

Anyway, this sounds pretty reasonable. However, you may still have to make
sure RDTSC calls are not messed up by other effects (clock frequency
changes, ...) which QPC seems to try to correct.

Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany