Re: primitives vs cleverness vs readability



Anton Ertl <anton@xxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
Andrew Haley <andrew29@xxxxxxxxxxxxxxxxxxxxxxx> writes:
Bernd Paysan <bernd.paysan@xxxxxx> wrote:
Processors have changed, and ITC on 32 bit processors already uses 4
bytes per instruction - on 64 bits, native code size doesn't change
much, but ITC doubles size again.

Well, hold on a minute. On a 64-bit processor you can use 32-bit
addressing for code, as long as you have less than 4 gigathings of
code. This is a common optimization used by many programming
languages. It's the default for gcc on AMD-64, for example.

It is not common to use 16-bit code addressing on a 32-bit
processor because no-one wants to be limited to 64 kilothings of
code, but the same reasoning doesn't apply to 64-bit processors.
There's no reason at all to use 64-bit threading on a 64-bit
processor.

There are a number of reasons:

* Nothing at all guarantees that the code is all in the lower 4G of
the address space (and on at least one platform it isn't), and the
gcc maintainers and others have a tendency to break the gcc
behaviour that we rely on;

Of course, this isn't something gcc maintainers have any control over.
I don't know who "others" may be!

e.g., we used to rely on the code being in the lower 32M on PowerPC,
and there is no reason for it not to be there, and it used to be
there, and then one day it just was no longer there. I then did a
linker script to put it there, but that stopped working a little
later (and on reporting this as a bug I learned that one should not
use linker scripts or somesuch).

* It's just simpler to have a uniform cell size that also covers the
threaded code, especially since we need to do this for other 64-bit
platforms anyway.

* It would buy very little to support 32-bit threaded code on 64-bit
platforms. Threaded-code size does not consume much memory there
(compared to what's available), and it also does not cause many
cache misses.
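
To make the first bullet concrete: a system that stores code addresses in 32-bit cells would have to verify that each address actually fits, since nothing guarantees it. A minimal sketch in C, assuming a flat address space; the names fits_in_32_bits and to_cell are invented for this illustration and come from no existing Forth system:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr can be stored in a 32-bit cell without loss, else 0. */
int fits_in_32_bits(uintptr_t addr) {
    return addr <= UINT32_MAX;
}

/* Narrow an address to a 32-bit cell, refusing addresses above 4G.
   With NDEBUG defined the check disappears and the value is truncated. */
uint32_t to_cell(uintptr_t addr) {
    assert(fits_in_32_bits(addr));
    return (uint32_t)addr;
}
```

A real system would presumably run such a check once at startup on the addresses of its primitives, e.g. to_cell((uintptr_t)&some_primitive), where some_primitive is a placeholder name and the function-pointer cast is implementation-defined.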

Fair enough. The last two, which are more or less "I can't be
bothered to change it, and it doesn't matter anyway" are rather weak,
but OK, there may be some legitimate reasons.

The claim was that ITC doubles size from 32-bit to 64-bit processors.
It doesn't need to be that way: you might choose to do it that way,
and on some operating systems you might even be forced to do it that
way, but it ain't necessarily so.
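
As a rough illustration of threaded code that does not grow with the pointer size, here is a minimal sketch in C of indirect threading with 32-bit cells. It is a variation on the idea, not anyone's actual implementation: the cells hold indexes into a table of code fields rather than raw addresses, which sidesteps the question of where the code is loaded. All names (vm, prim_fn, run, demo) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct vm vm;
typedef void (*prim_fn)(vm *);

struct vm {
    int64_t stack[16];
    int sp;               /* number of items on the stack */
    const uint32_t *ip;   /* next 32-bit cell of the thread */
    int running;
};

static void prim_lit(vm *m)  { m->stack[m->sp++] = (int64_t)*m->ip++; }
static void prim_add(vm *m)  { m->sp--; m->stack[m->sp - 1] += m->stack[m->sp]; }
static void prim_halt(vm *m) { m->running = 0; }

/* Code fields: one full-width pointer per word, stored once. */
static const prim_fn code_field[] = { prim_lit, prim_add, prim_halt };
enum { W_LIT, W_ADD, W_HALT };

/* Inner interpreter: fetch a 32-bit cell, go through the code field. */
int64_t run(const uint32_t *thread) {
    vm m = { .sp = 0, .ip = thread, .running = 1 };
    while (m.running) {
        uint32_t w = *m.ip++;
        assert(w < sizeof code_field / sizeof code_field[0]);
        code_field[w](&m);
    }
    return m.stack[m.sp - 1];
}

/* Thread for "2 40 +": six cells of four bytes each. */
int64_t demo(void) {
    static const uint32_t thread[] = { W_LIT, 2, W_LIT, 40, W_ADD, W_HALT };
    return run(thread);
}
```

The thread in demo takes 6 * 4 = 24 bytes; the same thread built from 8-byte cells would take 48. The cost is one extra index calculation per dispatch.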

BTW, I just looked at the switch tables generated by gcc, and they
use 64-bit entries on AMD64, while according to you they could use
32-bit entries. If, as you say, there's no reason to use 64-bit
entries, why do they use them?

I don't know, but:

It may be a bug.

The compiler may always generate switch tables as arrays of pointers,
so perhaps it's a side-effect of using generic code to generate them.

Maybe a suitable 32-bit reloc type doesn't exist for AMD-64, but I
doubt that.

Maybe a simple jump indirect instruction is used, and that instruction
always uses a 64-bit pointer in memory.

.... etc.
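
For what it's worth, a jump table with 32-bit entries can be written by hand in GNU C using the labels-as-values extension, storing label differences instead of pointers. This is only a sketch of the idea, not what gcc emits for a switch; the function name dispatch and the labels are invented for the example, and it needs gcc or a compatible compiler.

```c
#include <assert.h>

/* GNU C only: &&label and computed goto are extensions. */
int dispatch(int i) {
    /* 32-bit offsets relative to op_zero, not 64-bit pointers. */
    static const int table[] = {
        &&op_zero - &&op_zero,
        &&op_one  - &&op_zero,
        &&op_two  - &&op_zero,
    };
    assert(i >= 0 && i < 3);
    goto *(&&op_zero + table[i]);

op_zero: return 10;
op_one:  return 20;
op_two:  return 30;
}
```

The dispatch needs an add before the indirect jump, so the smaller table is paid for with an extra instruction.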

Andrew.


