Re: new IL: C (sort of...).
- From: "cr88192" <cr88192@xxxxxxxxxxx>
- Date: Fri, 12 Jun 2009 23:48:38 -0700
"Marco van de Voort" <marcov@xxxxxxxx> wrote in message
news:slrnh35i7c.2ifl.marcov@xxxxxxxxxxxxxxxxxx
On 2009-06-12, Rod Pemberton <do_not_have@xxxxxxxxxxxxx> wrote:
What exactly is C specific in this point, what doesn't go for say Pascal,
Modula-2, Ada etc?
The implementation of arrays and strings and pointers are different. Pascal
is severely lacking in low-level features, and was lacking pointers.
I've no idea what you mean here. Even bitpacking is part of nearly every
dialect's syntax (though there is no requirement to implement it).
The past (the long forgotten standards, 16-bit segmented models and the
virtual machine experiments) had other rules, but every 32-bit Pascal I ever
used (read: the Borland dialects used by 99.9% of the users) implements the C
subset, except some weirdness such as the ? operator and post- and
preincrement as expressions (a similar feature as a statement exists). And of
course the preprocessor is not mandatory.
only for "recent" Pascals (read: past 10 years), which could almost more
correctly be called Delphis than Pascals...
what I remember of Pascal at least, was from a compiler I was using in the
late 90s, and with my reference material being books from the 80s...
(at the time, I had also partly learned Fortran77, but noted that F77 style
code didn't apparently work in GNU Fortran...).
similarly, I originally learned ASM from an 80s era book, which talked of
nothing newer than the 8088 and 8086.
but, alas, the book I had from the 80s gave no real mention of pointers...
I think it did give mention though of "record-based magnetic disk storage"
or somesuch...
Over things like ? and ++ one could discuss if they are baroque remnants of
ancient architectures or not, but let's skip that discussion by stating that,
whatever your opinion on them is, they are by no means crucial for lowlevel
programming.
++ is convenient, but yes, not crucial...
C effectively has no string type, and its library emulation of one is 1:1
convertible to Pascal. No difference there.
C doesn't need a string type...
ok, ok, granted for other reasons much of my compiler stuff internally uses
a string type, just it happens to nicely map to 'char *'...
actually, by default my compiler stuff also varies a little from what C is
"officially" supposed to do:
I don't really support locales and code-pages...
pretty much everything is UTF-8...
As far as I know Pascal has even had a slight upper hand, performancewise,
because it didn't have pointer aliasing rules to hinder the compiler (which
afaik you can only escape in C since C99)
not sure what is meant here...
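possibly this refers to the 'restrict' qualifier added in C99: without it, the compiler has to assume that two pointers of compatible type may refer to the same memory. a minimal sketch of the difference, assuming that reading is right (the function names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Without 'restrict' the compiler must assume dst and src may overlap,
   so it has to be conservative about reordering loads and stores. */
void scale_may_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* C99 'restrict' is a promise made by the caller that dst and src do
   not overlap; breaking that promise is undefined behaviour. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    assert(dst != src || n == 0);   /* cheap sanity check on the promise */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

both functions compute the same thing; the second only gives the optimizer more freedom, and whether that pays off depends on the compiler and target.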
Basically, you're trapped to programming in a limited box that you can't
escape from.
I wonder how this is possible
I'm not familiar with Modula-2 or Ada. I'm familiar with a
variant of PL/1 which was very Pascal-ish.
Pascallish like C# is Cish? IOW in syntax or in language?
hmm... Lua is Pascal...
C# is C'ish, or at least a lot more than Java is, in both syntax and
semantics...
after all, C# has structs and pointers for those who want them...
- C requires all objects to be at different addresses.
Union?
as noted, this is more "in theory" than in practice...
in theory, each address is a different object (and no two objects hold the
same address).
in practice, this is not strictly the case, as there are many special case
"pseudo-objects" which may show up at compile time which may not follow this
rule exactly, among many other cases which would neither be expected at the
C or ASM levels...
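as for the union question, a small sketch of what plain C does here (names are hypothetical): the members of a union all start at the address of the union itself, whereas two separately declared objects compare unequal by address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union used only for this illustration. */
union word {
    uint32_t      u;
    float         f;
    unsigned char bytes[4];
};

/* Returns 1 if every member starts at the address of the union itself. */
int union_members_share_address(void)
{
    union word w;
    assert(sizeof w >= sizeof w.f);   /* a union is at least as big as any member */
    return (void *)&w == (void *)&w.u
        && (void *)&w.u == (void *)&w.f
        && (void *)&w.f == (void *)w.bytes;
}

/* Returns 1 if two separately declared objects have different addresses. */
int distinct_objects_differ(void)
{
    int a = 0, b = 0;
    return &a != &b;
}
```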
it can also be noted that I end up using SSE for all sorts of things, many
of which are not strictly supported by the processor (and a lot of code goes
into simulating a higher level of "orthogonality" than actually exists...).
(consider the ninjitsu magic needed to allow pointer-addressing to work via
a register that is actually just the upper-half of an SSE register, one may
soon notice that more than a few useful opcodes do not exist, and a good
amount of codegen contortion is needed...).
(I may end up deciding against general addressing via low/high XMM regs, as
the overhead is likely to be much worse than via GPRs, even in the face of
the high register pressure on x86...).
- C requires all C objects to map onto a contiguous sequence of C characters.
- C implements an "offset operator" which indexes contiguous sequences of C
characters.
I don't get these, could you explain what you mean here, and what the
relevance for lowlevel is? I think you mean that most constructs are either
static structures or pointers to them.
I think the point is that C bases its fundamental operating model on a
byte-mapped address space, which many other languages do not adhere to (as
such, C is much closer to the Von Neumann ideal than are many other
languages...).
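a minimal sketch of what the byte-mapped model allows (function names are made up here): indexing is just address plus offset, and any object can be walked as a run of unsigned chars.

```c
#include <assert.h>
#include <stddef.h>

/* a[i] is defined as *(a + i): the "offset operator" in action. */
int index_equals_offset(const int *a, size_t i)
{
    assert(a != NULL);
    return &a[i] == a + i && a[i] == *(a + i);
}

/* Any object can be viewed as a contiguous run of unsigned chars,
   so copying an object is just copying sizeof(object) bytes. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(d != NULL && s != NULL);
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}
```

e.g. 'copy_bytes(&y, &x, sizeof x)' for two doubles leaves y equal to x, without the code knowing anything about the type.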
- C doesn't implement arrays.
- C does implement an array declaration. That array declaration is
effectively converted into a pointer that can be used with the offset
operator in order to simulate arrays.
While it maybe saves on the compiler in a 4k environment, I'm more
interested in lowlevel TARGET than HOST. So while this may be true, could you
please explain why having a normal array in addition to a pointer is a bad
thing?
I am not sure the intent of this point exactly...
for C99 at least one needs an actual array type (in addition to pointer
types), it just happens that arrays can be converted to pointers fairly
easily...
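a small sketch of the distinction (hypothetical function names): sizeof sees the whole array as long as the array type is still around, and only sees a pointer once the conversion has happened.

```c
#include <assert.h>
#include <stddef.h>

/* While 'a' is still an array, sizeof yields the size of all elements. */
size_t size_as_array(void)
{
    int a[10] = {0};
    return sizeof a;
}

/* After conversion to a pointer, only the pointer size is visible. */
size_t size_after_decay(void)
{
    int a[10] = {0};
    int *p = a;               /* implicit array-to-pointer conversion */
    assert(p == &a[0]);       /* the pointer refers to the first element */
    return sizeof p;
}
```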
- C doesn't implement strings.
- C does simulate "strings", using an all bit-zero character to terminate a
simulated "array"
(a decision regretted by many security consultants and frustrated
programmers)
it is not that bad IMO...
note, however, that FWIW one can add their own "string" type via library
code...
I typically do so, and typically regard strings as immutable...
I guess eliminating buffer overflow could require bounds-checking and
similar...
but, for whatever reason, many newer languages think it a good idea to waste
lots of memory with unicode strings (vs UTF-8 strings...).
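to put a rough number on the memory point, a sketch comparing the bytes a NUL-terminated UTF-8 string uses against what the same text would need as UTF-16, which is an assumption about what "unicode strings" means here. the function names are made up, and the input is assumed to be valid UTF-8:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by a NUL-terminated UTF-8 string (terminator not counted). */
size_t utf8_bytes_used(const char *utf8)
{
    size_t n = 0;
    assert(utf8 != NULL);
    while (utf8[n] != '\0')
        n++;
    return n;
}

/* Bytes the same text would need in UTF-16 (terminator not counted).
   Assumes the input is valid UTF-8. */
size_t utf16_bytes_needed(const char *utf8)
{
    size_t units = 0;
    assert(utf8 != NULL);
    for (const unsigned char *p = (const unsigned char *)utf8; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                   /* continuation byte, already counted */
        units += (*p >= 0xF0) ? 2 : 1;  /* 4-byte sequences need a surrogate pair */
    }
    return units * 2;                   /* each UTF-16 code unit is 2 bytes */
}
```

for plain ASCII text the UTF-16 form comes out at twice the size; for text that is mostly 3-byte UTF-8 sequences the comparison goes the other way.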
- etc.
In other words, you only need a contiguous sequence of C characters and an
address, to implement the (non-specialized) features of C, including
"strings" and "arrays". Just due to the way strings are implemented in
Pascal and PL/1 etc., I'd say C fits better with the underlying computing
platform. However, C has numerous other low-level features that most
high-level languages lack. These too add to its usefulness as an IL.
You are on the wrong track here. First, there is no C string construct that
I'm aware of that can't be matched in Pascal. There is a pointer to char,
you can increment it, check for zero, subtract them, add them. Stronger
even, Delphi/FPC come with a strings unit with the common null-terminated
char* routines.
IOW it is roughly like C++, you can simply do the pointer trick if you feel
masochistic or (more rarely) see a need.
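for reference, the C side of the "pointer trick" looks roughly like this: increment a pointer to char, check for zero, subtract two pointers. this is only a sketch with a made-up function name, not taken from any particular library:

```c
#include <assert.h>
#include <stddef.h>

/* Length of a NUL-terminated string using only pointer operations. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    assert(s != NULL);
    while (*p != '\0')       /* check for the zero terminator */
        p++;                 /* increment the pointer */
    return (size_t)(p - s);  /* subtract the two pointers */
}
```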
Note even that the Delphi higher level string type is fully compatible (as
in compiletime cast, no conversion whatsoever) with char *, as long as you
don't try to reallocate it.
not much comment...
although, one can still debate if Delphi is really Pascal, or more like
Pascal++...