Re: new IL: C (sort of...).




"Marco van de Voort" <marcov@xxxxxxxx> wrote in message
news:slrnh3d354.1o3m.marcov@xxxxxxxxxxxxxxxxxx
On 2009-06-13, cr88192 <cr88192@xxxxxxxxxxx> wrote:
yeah, far pointers, ...

And 16-bit structure and array limitations.


there were ways around this...
not like 64k issues are such a big deal when one only has maybe a few
hundred kB available...


what I remember of Pascal at least, was from a compiler I was using in
the late 90s, and with my reference material being books from the
80s...

Which compiler? Afaik the only other option is GNU Pascal and that
supports arithmetic too (wouldn't be surprised if it did since the
eighties)


I remember I used FreePascal / FPK Pascal back then, and a few others,
only I don't remember which it was...

I can't remember that this didn't work, and my memory goes back to about
summer 1997, with some shorter dalliances before. I'll ask Florian


yeah...

well, my memory may be wrong as well, it was long ago...


I think the issue is that the language GNU Fortran accepts is too much
newer, and so F77 is no longer accepted...

We emulated some 16-bitisms for TP's sake in a separate dialect mode. Fun
part was that most of the real code ran faster in FPC's 32-bit with a bit
of realmode emulation than in 16-bit mode :-)


interesting...


Speaking about archaic.....

yep, I guess this is what higher-end computers were using at the time...

(PC's then, had floppy drives, and a FAT-style filesystem, or at least in
the late 80s...).

Some of mine still do. Floppy and FAT that is. The floppy is only obsolete
after Windows XP. (XP still requires it as only option to postload drivers
for storage systems, short of slipstreaming them in)


yep...

I used to really like floppies because I could boot to them, so they were
useful for OS dev and testing (back when I did this...).

now, we have USB, and computers can generally boot to USB instead (never
mind that one will likely need to retain a real-mode interface to the BIOS
to do so...).

but, alas, I don't do OS-dev anymore (not since I came to the realization
that, in the face of Windows and Linux, custom OS dev is kind of
pointless...).


C doesn't need a string type...

Well, I don't agree. At least not anymore, since the era that compilers
have enough memory.


but, it has 'char *', which can do strings fairly nicely (nevermind UTF-8
and wchar_t strings...).

I never saw "char *" as nice. Even the fact that it could do as lowest
common denominator type in low level routines is offset by the disastrous
choice to make it nul terminated.


in practice the terminator is rarely a problem...
granted, with a little care, it can be handled in UTF-8 (embedded overlong
nul's...).
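
For illustration, a minimal sketch of the "embedded overlong nul" trick
mentioned above: each 0x00 byte in the data is written as the two-byte
overlong sequence 0xC0 0x80, so the only real nul left is the terminator.
The helper name encode_embedded_nul is made up for this example. Note that
the overlong form is not valid strict UTF-8 (it is the convention used by
Java's "modified UTF-8"), so a strict decoder would reject it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `len` bytes from `src` into `dst`, writing every
 * 0x00 byte as the overlong pair 0xC0 0x80. The output is nul-terminated and
 * contains no other 0x00 byte. `dst` must hold at least 2*len + 1 bytes.
 * Returns the number of bytes written, not counting the terminator. */
size_t encode_embedded_nul(const unsigned char *src, size_t len,
                           unsigned char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        if (src[i] == 0x00) {
            dst[out++] = 0xC0;   /* overlong lead byte */
            dst[out++] = 0x80;   /* continuation byte carrying value 0 */
        } else {
            dst[out++] = src[i];
        }
    }
    dst[out] = 0x00;             /* the one real terminator */
    return out;
}
```

With this, ordinary 'char *' routines such as strlen() see the whole
string, at the cost of one extra byte per embedded nul.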


in my compiler, I made wchar_t a builtin type (in most cases, aliased to
unsigned short, FWIW).

In Delphi/FPC it is too, widechar. Delphi already made it default
(D2009), FPC is catching up (but that will probably take another year.
Part-time slowness :_)


can't be "the default" as this would break C compatibility.
it is used mostly with non-C frontends.

from what I remember, it was "inherited" from the IA-64 name-mangling
scheme, which was the template for my original signature system. the JVM was
a later influence...


the C typesystem, and the change to a more language neutral form (AKA:
integer types are based on size, ...) left some anomalies (such as, apart
from making the frontend mildly CPU-specific, I can either not support
LP64 on x86-64, or force LP64 on x86, both of which would raise
issues...).

Brr. That's bad. You need ILP32 on x86 and both LP64 and LLP64 on x86_64.
Maybe even ILP64 for AIX iirc. For non x86(_64) you most definitely need
ILP64.
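
As a rough sketch of what these data-model names mean in terms of
sizeof(int), sizeof(long) and pointer size, assuming 8-bit bytes. The
function names are invented for this example and are not from either
poster's compiler:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name the data model for the given sizes in bytes. */
const char *data_model_name(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 4) return "ILP32";
    if (int_sz == 4 && long_sz == 8 && ptr_sz == 8) return "LP64";
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 8) return "LLP64";
    if (int_sz == 8 && long_sz == 8 && ptr_sz == 8) return "ILP64";
    return "other";
}

/* Data model of whatever target this file is compiled for. */
const char *current_data_model(void)
{
    return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}

/* Returns 1 if the current target uses the named model, 0 otherwise. */
int is_current_data_model(const char *name)
{
    return strcmp(current_data_model(), name) == 0;
}
```

On typical toolchains, 32-bit x86 comes out as ILP32, x86-64 Linux as LP64
and 64-bit Windows as LLP64, which is why a frontend that hard-codes one
size for 'long' runs into the trouble described above.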


yep...

I am now thinking fixed-sized types may not be ideal for the upper-end IL.
it still needs some flexibility, and the frontend compiler needs to be able
to not really care what the arch is...

so, I will probably adopt a convention (for the upper IL):
byte/sbyte
short/ushort
int/uint
long/ulong
....

char/wchar: C-style char, wide-char

I may also drop the "__" everywhere, and instead make it so that any names
which may clash with a keyword are prefixed with '$' (like in C#...).

it is that or "__byte", "__sbyte", ...


this is because 'sizeof(long)' depends on arch, and previously the sig
handling code for 'long' had made it variable-sized, but now I have
designated it as being a fixed 64 bits, which otherwise creates a problem
for x86 (where much code on x86 may end up assuming a 32-bit long...).
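
A small sketch of the mismatch, assuming C#-style sizes for the IL types
(8/16/32/64 bits). The il_ prefix and the helper name are made up for the
example, since 'short', 'int' and 'long' are already C keywords:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical fixed-size IL types, mapped onto <stdint.h>. */
typedef int8_t   il_sbyte;
typedef uint8_t  il_byte;
typedef int16_t  il_short;
typedef uint16_t il_ushort;
typedef int32_t  il_int;
typedef uint32_t il_uint;
typedef int64_t  il_long;
typedef uint64_t il_ulong;

/* The IL 'long' is fixed at 64 bits regardless of the target (C11). */
_Static_assert(sizeof(il_long) * CHAR_BIT == 64, "il_long must be 64 bits");

/* Returns 1 if the target's C 'long' has the same size as the IL 'long',
 * 0 if C code assuming its own 'long' would disagree with the IL. */
int c_long_matches_il_long(void)
{
    return sizeof(long) == sizeof(il_long);
}
```

On an ILP32 or LLP64 target c_long_matches_il_long() returns 0, which is
the case where a C frontend cannot simply map 'long' onto the IL's long.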

I typically do not map headers to base types. IOW I map per header the
used types to the basetypes, sometimes OS-dependent.


yes, ok.


wonder, since it was designed by Delphi's author. It is roughly Delphi
syntax with curly braces and a few C operators.

odd, I had thought C# had grabbed C's pointer system as-is, but granted I
had not looked at this aspect much...

Afaik you can't even make a linked list. And for other things, while
supported, you have to manually lock types in the VM, which means that if
you change an existing method to use pointers, it can turn "unsafe", and
then potentially "safe" callers have to mark objects that they send as
locked.

An implementation detail that changes the calling conventions is evil
IMHO. Having very limited pointers is a fact of life in a VM language
though, and not really a problem. But it makes drawing analogies from C#
to C a bit moot. One could only confuse C#'s pointers with C pointers if
one is very confused.


maybe...

afaik, their power is unleashed mostly in "unsafe" sections...


so, you are saying, they are close enough so one could probably just do
an alternative parser and compile it along with these other C-family
languages?...

No idea. I just reacted on
1) vague claims about Pascal's pointer support. There is truth in there,
   but only in versions dead for twenty years, give or take a leftover
   mainframe or two.
2) analogies between C# and C with respect to pointers, which are IMHO
   false.

If your IL can do C, you can do Pascal pretty much. The hurdle will be
more the module/unit concept to combine pieces of code into main programs
(contrary to C having inclusion as its single instrument)


I need something very similar to this to get C# working, so the mechanisms
are already largely in place...

if modules can be mapped more-or-less to C# style namespaces, it should
work...
actually, it is my intent to handle Java packages with more or less the same
machinery...


I use them for memory copying as well...

but they are used for many other tasks:
floating point;
vectors and quats;
128 bit integers and floats;
128 bit pointers;
'long long' (x86);
...

Yup. But not blind and general purpose. Usually in specialized code.
Because if you run the average numerical code over SSE2 instead of the
FPU, you'll get into trouble wrt precision and IEEE compliance (exception
handling)
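
One way to see which behaviour a given build has is the standard C99 macro
FLT_EVAL_METHOD from <float.h>. A minimal sketch follows; the helper name
is invented, and the "typical for" remarks describe common x86 toolchains
rather than a guarantee:

```c
#include <float.h>

/* Hypothetical helper: describe how the compiler evaluates floating-point
 * expressions, based on the standard macro FLT_EVAL_METHOD. */
const char *float_eval_description(void)
{
#if !defined(FLT_EVAL_METHOD)
    return "unknown (FLT_EVAL_METHOD not defined)";
#elif FLT_EVAL_METHOD == 0
    return "each type evaluated in its own precision (typical for SSE2)";
#elif FLT_EVAL_METHOD == 1
    return "float evaluated as double";
#elif FLT_EVAL_METHOD == 2
    return "everything evaluated as long double (typical for x87)";
#else
    return "indeterminate or implementation-defined";
#endif
}
```

Code tuned for the 80-bit intermediates of the x87 FPU can give slightly
different results when the same expressions are evaluated in plain double
on SSE2, which is one source of the precision trouble mentioned above.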


ok.

my FPU support is essentially broken at present.
if re-implemented, the FPU would likely be handled with trickery to make it
look like registers...

I hadn't really taken into account precision and compliance issues...


recently, I have added SSE half-register support, which could be useful
mainly for long-long (x86), and possible register spillover (x86 and
x86-64...).

Doing such things might bring you into trouble with the system's ABI, and
slow things down due to the ABI requiring preservation of those registers.
Make sure you can turn it on and off (or maybe only for leaf-routines)


not really, since these are not callee-preserve, the ABI need not care much
what I do with them...
(I never said I would be preserving them...).


mostly this is for internal operations, and for cases where GPRs run out;
when done, everything typically flattens back out into actual variables.
typically everything is re-synchronized on calls (regs flushed out to
stack, ...).


in the long-long case, it would likely be primarily beneficial as LL will
only take about 1/2 the register space (presently, LL uses an entire SSE
register). I am a little less certain though, as certain LL operations
may be made less efficient (since the LL may be in the upper-half of the
reg, and otherwise has to "cooperate" with whatever other value it may be
sharing the register with, meaning I can't just do full-register
operations in these cases...).

On x86 how many algorithms are dependent on 64-bit integer support? Only
AES I guess?


LL shows up occasionally...
luckily it is fairly rare...


I guess if one wants to regard a string as an array, then UTF-8 is
awkward since characters may be different sizes, and so it is felt the
closer-to-uniform indexing is justified (since indexing by char in a
UTF-8 string requires scanning through the string).
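
To make the scanning cost concrete, a minimal sketch of indexing by
character in UTF-8. It assumes well-formed input and counts code points by
skipping continuation bytes (those of the form 10xxxxxx); the function name
is made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the byte offset of code point number `index`
 * (0-based) in a nul-terminated UTF-8 string, or (size_t)-1 if the string
 * has fewer code points. This is O(n): every lookup walks from the start. */
size_t utf8_offset_of(const char *s, size_t index)
{
    size_t cp = 0;

    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Any byte that is not 10xxxxxx starts a new code point. */
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (cp == index)
                return i;
            cp++;
        }
    }
    return (size_t)-1;
}
```

For "a\xC3\xA9z" (a, e-acute, z), code point 2 is at byte offset 3, not 2,
so s[i] style indexing only works on bytes, not on characters.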

UTF-16 has surrogate pairs too. I never really found out if those are used
a lot. Some say the Chinese pages above the BMP are in active use, and
some not.


yes, but typically the surrogate pairs are assumed to be pairs of
codepoints...
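
For reference, a short sketch of how a surrogate pair maps to a single
code point above the BMP, using the standard UTF-16 arithmetic. The
function and macro names are invented for the example:

```c
#include <stdint.h>

/* Sentinel returned for something that is not a valid surrogate pair. */
#define UTF16_INVALID 0xFFFFFFFFu

/* Hypothetical helper: combine a high surrogate (0xD800..0xDBFF) and a low
 * surrogate (0xDC00..0xDFFF) into one code point in 0x10000..0x10FFFF. */
uint32_t utf16_combine(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return UTF16_INVALID;
    return 0x10000u
         + (((uint32_t)hi - 0xD800u) << 10)
         + ((uint32_t)lo - 0xDC00u);
}
```

For example, the pair 0xD840 0xDC00 gives U+20000, the first code point of
CJK Extension B. Code that indexes a UTF-16 string by 16-bit unit sees
that character as two elements.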


It's both, just like C++ also includes C.

yes, but C is not C++...

That's a matter of definition.

as I see it, Pascal is the older definition, which may include all the
stuff, and varieties, generally accepted to exist to begin with.

Delphi and newer Delphi-alikes may be a superset of Pascal, and far more
normalized than the rest of Pascal land, however, there is much in Pascal
land which is not Delphi, such as Ada, many older varieties, ...

Ada is not Pascal, and never was. And nearly all varieties are dead.


Ada sure looks like Pascal though...
other stuff I had read had classified Ada as "a Pascal...".


it is like, how C++ and Objective-C are both OO-based C supersets, but
each is a very different beast from the other...

I think if C had been dead, people would routinely refer to the procedural
parts of C++ as C (and in fact they do now)


I disagree...

C is a language in its own right...


the C camp has actually put much emphasis on standardization, such that
nearly any compiler can be written "to the standards", and have code
generally work for it (source works between compilers, often binary code
will link between compilers, ...). one is safe so long as they don't rely
too heavily on compiler extensions...

That has been the case with Pascal too. The original subset is mostly
supported by all, though slightly less workable than C's. However the
Borland ones branched off in the early eighties. I have worked with it
since 1990, and can't remember any other way.


ok.


Pascal has much weaker standards, and so people resort to an alternative
means: each implementation clones other popular implementations...

That's not true. That is a picture C people like to paint to boost their
own prejudices and superiority feelings.

Fact is that since the early nineties, Pascal is way more homogeneous
around the dominant Borland versions and clones than modern C ever was. If
you forget about the odd mainframe pascal user.

While C has a bit of standardization, the standardized part is so small
that any non-trivial app is non-portable. That is pretty normal for
standards (the ISO Pascal standards are the same), but it leads to a false
sense of portability.


potentially the case...

sadly, most of the common de-facto C libraries (such as OpenGL, ...), are
not parts of "the standard", nonetheless, GL is standardized in its own
right.


this would be much the same as if, in C land, as opposed to people
writing compilers following the standards, everyone just started using gcc
and MSVC as their reference implementations...

Which partially happens. But it is very difficult to compare these matters
black and white.

ok.

FWIW, my effort is not really a gcc or MSVC clone...
I do my own extensions, and many gcc'isms are not supported in my case...

FWIW, core language extensions are typically rare in practice, and most of
the variation is in the libraries...

thus, a compiler written to the standard can generally handle most code just
fine.



