Re: Is a RISC chip more expensive?



"Wilco Dijkstra" <Wilco_dot_Dijkstra@xxxxxxxxxxxx> wrote in message
news:XWiCi.47131$ie3.21895@xxxxxxxxxxxxxxxxxxxxxxx

> I thought the GEM compilers were pretty good at the time. I do recall
> the instruction grouping rules on the 21164 were complicated. What
> you're saying sounds like a broken instruction scheduler (or one
> defaulting to a different core). Schedulers work in a similar way as
> described, but they don't have the intuition humans have in quickly
> finding a good solution. Trying out all possibilities is too slow, so they
> use heuristics, and these are only as good as the compiler writer...
> Humans are also good at bypassing typical constraints a compiler
> would have like reordering memory accesses or moving instructions
> past branches.

The GEM compilers were good enough to win the SPEC sweepstakes at
the time, but good enough doesn't mean they were good. If you
looked at the generated code you would see a lot of garbage in
there. Complex arithmetic wouldn't get even simple optimizations:
mixed complex-real arithmetic would convert the real number to
complex first, then carry out the full complex operation. Shifts
are problematic in Fortran, as the ISHFT intrinsic can shift either
way. As I recall, the compiler would generate left shifts but not
right shifts.
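
To make the two complaints above concrete, here is a small source-level
sketch. It is not GEM output; the variable names are made up for
illustration. The first part spells out by hand what "promote the real
to complex, then multiply" costs compared with scaling the two
components directly. The second part shows why ISHFT is awkward: the
sign of the shift count selects the direction.

```fortran
program mixed_and_shift
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  complex(dp) :: z, promoted, direct
  real(dp) :: r, re_part, im_part
  integer :: k

  z = (3.0_dp, 4.0_dp)
  r = 2.0_dp

  ! Naive lowering, written out by hand: treat r as (r, 0) and do a
  ! full complex multiply (4 real multiplies, 2 real add/subtracts).
  re_part = real(z, dp)*r - aimag(z)*0.0_dp
  im_part = real(z, dp)*0.0_dp + aimag(z)*r
  promoted = cmplx(re_part, im_part, dp)

  ! The simple optimization: scale each component (2 real multiplies).
  direct = cmplx(real(z, dp)*r, aimag(z)*r, dp)

  ! For these finite values both routes agree.
  if (promoted /= direct) error stop 'mixed multiply mismatch'
  print *, 'z*r =', direct

  ! ISHFT: positive count shifts left, negative count shifts right
  ! (logical shift, zeros shifted in).
  k = 8
  if (ishft(k, 2) /= 32) error stop 'left shift mismatch'
  if (ishft(k, -2) /= 2) error stop 'right shift mismatch'
  print *, 'ishft(8, 2) =', ishft(k, 2), ' ishft(8,-2) =', ishft(k, -2)
end program mixed_and_shift
```

With a shift count that is not a compile-time constant, the compiler
has to cope with both directions at run time, which is presumably
where the trouble starts.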

What makes you think there even was a scheduler? When you write
out some assembly code, you normally calculate or estimate the
latency of the code sequence and compare it with other code
sequences and perhaps some ideal. It's not clear that GEM or
its successor at Intel would do such a thing. It's not a
question of trying out all possibilities; just working out the
latencies of a couple of possible candidates via emulation would
have been a good thing had it ever happened.
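
As a rough sketch of the kind of estimate meant here, the following
compares two candidate sequences by the length of their longest
dependence chain. Everything in it is an assumption for illustration:
the module and function names, the 4-cycle add latency, and the
simplification that issue width is unlimited so only dependences
matter. A real 21164 model would also need the grouping rules.

```fortran
module latency_model
  implicit none
contains
  ! Longest dependence chain through a straight-line sequence.
  ! lat(i)    : latency of instruction i, in cycles
  ! dep(:, i) : indices of earlier instructions whose results
  !             instruction i needs (0 means no dependence)
  integer function critical_path(lat, dep) result(total)
    integer, intent(in) :: lat(:), dep(:, :)
    integer :: ready(size(lat)), i, j, start
    total = 0
    do i = 1, size(lat)
      start = 0
      do j = 1, size(dep, 1)
        if (dep(j, i) > 0) start = max(start, ready(dep(j, i)))
      end do
      ready(i) = start + lat(i)      ! cycle at which result i is ready
      total = max(total, ready(i))
    end do
  end function critical_path
end module latency_model

program compare_candidates
  use latency_model
  implicit none
  integer, parameter :: add_lat = 4  ! hypothetical latency, in cycles
  integer :: serial, tree

  ! Candidate 1: ((a+b)+c)+d, every add waits for the previous one.
  serial = critical_path([add_lat, add_lat, add_lat], &
                         reshape([0,0, 1,0, 2,0], [2, 3]))

  ! Candidate 2: (a+b)+(c+d), the first two adds are independent.
  tree = critical_path([add_lat, add_lat, add_lat], &
                       reshape([0,0, 0,0, 1,2], [2, 3]))

  print *, 'serial chain:', serial, ' tree:', tree
  if (serial /= 12 .or. tree /= 8) error stop 'unexpected latency'
end program compare_candidates
```

Under these assumptions the serial chain comes to 12 cycles and the
tree to 8, which is the sort of comparison you would make by hand
before picking a sequence.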

Moving instructions past branches or out of loops is a double-edged
sword. Consider a loop like:

a = iand(a,not(b))

In a similar situation the GEM compiler moved the logical negation
of the scalar b outside the loop that modified the vector a. The
problem with this is Alpha already has an instruction that ANDs
an element of a with the logical negation of b, so moving the
negation outside the loop created an extra unnecessary step! You
may think that's not a problem because the extra work happens
outside the inner loop, but the code was being run in a context
where this loop was used only up to the point where its setup
time exceeded that of alternative code. The extra setup work
moved the crossover point and increased the overall execution
time appreciably. Alphas could knock out the inner loops so
efficiently that optimizing middle loops made a difference, and
GEM didn't optimize middle loops very well.
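
For reference, the two forms of that loop look like this at the source
level. This is only a sketch with made-up names and data; it shows that
the forms compute the same thing and where the extra setup operation
sits, not what any particular compiler emits for them.

```fortran
program andnot_forms
  implicit none
  integer, parameter :: n = 8
  integer :: a1(n), a2(n), b, nb, i

  b = 12
  a1 = [(i, i = 1, n)]
  a2 = a1

  ! Form 1: negation hoisted out of the loop, as described above.
  ! One extra setup operation, then a plain AND per element.
  nb = not(b)
  do i = 1, n
    a1(i) = iand(a1(i), nb)
  end do

  ! Form 2: negation left inside the loop. On a machine with an
  ! and-with-complement instruction this can be a single operation
  ! per element with no setup at all.
  do i = 1, n
    a2(i) = iand(a2(i), not(b))
  end do

  if (any(a1 /= a2)) error stop 'forms disagree'
  print *, a1
end program andnot_forms
```

When the loop is short and entered many times from a middle loop, the
one extra setup operation in form 1 is paid on every entry, which is
the cost being complained about.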

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

