Re: Is Assembler Language essential in compiler construction?



"Bartc" <bartc@xxxxxxxxxx> writes:
I've tried targeting C and it was completely unsatisfactory.

First, there are a number of extra hoops to jump through in order to
generate C source code (syntax, formatting, creating suitable names
when your language uses namespaces perhaps).

Then, to implement certain features of your language might involve
using casting and other tricks in the generated C when it's datatypes,
and using a lot of gotos and labels when it's syntax.

Then you make the discovery that you can use casting and gotos for
nearly *all* your language constructs, meaning most features of C are
not needed

Yes, C is a little too high-level for that job, but it's still the
best portable assembly language we have. One can work around most of
these issues as you point out, though.

If your source language is close enough to C, you can even map some of
the source features directly to the corresponding C features (in
particular if these source features are more restricted).

Functions are still
needed, but only just.

Actually functions and other stateful control flow are the worst
problem with C as portable assembly language. C offers no usable
lower-level way to do this kind of stuff. If your source language has
stateful control flow that does not map to functions, function calls,
and setjmp()/longjmp(), you will have a hard time implementing the
programming language, and you may lose a lot of performance. Examples
of such features are guaranteed tail calls, exceptions (may be
mappable to setjmp/longjmp), and backtracking.
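
As a minimal sketch of the setjmp()/longjmp() mapping mentioned above,
assuming a source language with a simple try/catch and a single active
handler (the names handler, checked_div and div_or_default are
hypothetical, not from any particular compiler):

```c
#include <setjmp.h>

/* Innermost active exception handler; a real translator would
   keep a stack of these, one per nested "try". */
static jmp_buf handler;

/* Source-level function that may "throw". */
static int checked_div(int a, int b)
{
    if (b == 0)
        longjmp(handler, 1);   /* "throw": unwind to the handler */
    return a / b;
}

/* Source-level: try { return a / b; } catch { return dflt; } */
int div_or_default(int a, int b, int dflt)
{
    if (setjmp(handler) == 0)  /* "try": setjmp returns 0 at first */
        return checked_div(a, b);
    return dflt;               /* "catch": reached via longjmp */
}
```

Here div_or_default(6, 3, -1) takes the normal path, while
div_or_default(1, 0, -1) reaches the handler through longjmp().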

One way to work around that would be to map all control flow to gotos,
and manage the state explicitly, but this would typically mean
translating all the source code into one C function, which will blow
up most C compilers for larger source programs, and will disable any
separate compilation features the source language has. Also,
simulating indirect jumps (as necessary to implement, e.g., returns)
in standard C is slow.
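
A rough sketch of what such single-function output might look like,
assuming a tiny source program with one function called from two
places (run_program, twice and the RET_SITE_* names are made up for
illustration). The return is an indirect jump, which standard C can
only simulate with a switch over the possible return sites:

```c
/* One identifier per call site that a "return" may jump back to. */
enum { RET_SITE_1, RET_SITE_2 };

/* Whole source program compiled into a single C function.
   Source: main(x) = twice(x) + twice(x + 1); twice(a) = a * 2 */
int run_program(int input)
{
    int ret_stack[16];   /* explicit return-address stack */
    int sp = 0;
    int arg, result, acc;

    /* main: first call */
    arg = input;
    ret_stack[sp++] = RET_SITE_1;
    goto twice;
site_1:
    acc = result;
    /* main: second call */
    arg = input + 1;
    ret_stack[sp++] = RET_SITE_2;
    goto twice;
site_2:
    acc += result;
    return acc;

twice:
    result = arg * 2;
    goto do_return;

do_return:
    /* Simulated indirect jump: pop the return site and dispatch. */
    switch (ret_stack[--sp]) {
    case RET_SITE_1: goto site_1;
    case RET_SITE_2: goto site_2;
    }
    return -1;           /* not reached */
}
```

Every return goes through the switch, which is where the slowdown
comes from; the more call sites a program has, the larger that
dispatch becomes.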

Other ways are to compile the program into several C functions, and
then to arrange some calling and returning between them to satisfy
C's restrictions even if that would not be necessary in a lower-level
language. E.g., one way to implement tail-calls is to compile every
source function into a C function, have each tail-calling function
return the next function to be called, and have a call loop that calls
these functions one after the other.
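
The call-loop scheme can be sketched roughly as follows, assuming the
source program is a tail-recursive summation whose variables live in
C globals (cont, sum_loop, sum_done and sum_to are hypothetical
names). A C function cannot return a pointer to its own type, so the
function pointer is wrapped in a struct:

```c
#include <stddef.h>

/* "Next function to call", wrapped in a struct so that a function
   can return a value containing a pointer to its own type. */
typedef struct cont cont;
struct cont { cont (*fn)(void); };

/* Source-level variables, kept in globals for simplicity. */
static long long n, acc;

static cont sum_loop(void);
static cont sum_done(void);

/* Source: sum_loop() = if n == 0 then done else (acc += n; n--; sum_loop()) */
static cont sum_loop(void)
{
    if (n == 0)
        return (cont){ sum_done };
    acc += n;
    n--;
    return (cont){ sum_loop };  /* tail call: return the callee */
}

static cont sum_done(void)
{
    return (cont){ NULL };      /* NULL stops the call loop */
}

/* The call loop: calls each returned function in turn, so the
   C stack does not grow however long the tail-call chain is. */
long long sum_to(long long limit)
{
    cont next = { sum_loop };
    n = limit;
    acc = 0;
    while (next.fn != NULL)
        next = next.fn();
    return acc;
}
```

Each source-level tail call costs a C return plus an indirect call
here, which is the overhead one pays for staying within C's
restrictions.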

So you end up with a target language which is a travesty of C,

It's certainly not something a human would write as C source code.
But then the assembly output of a compiler is also not something that
a human would write as assembly language source code. Is it a
travesty of assembly language then?

and
while it might be portable, won't compile to great code because the
structure the C compiler depends on is missing.

What kind of "great code" do you have in mind?

The code generation quality and optimizations that I expect from a C
compiler don't depend on a structure; a good C compiler will perform
decent code selection, register allocation, and instruction scheduling
without a particular structure in the source code. That's what I
expect from a portable assembly language.

One other optimization I relied on in my generated C code is copy
propagation/register coalescing, and that also does not need a
particular structure.

An optimizing C compiler will typically also recover, e.g., the loop
structure from code written with gotos, and may then perform loop
optimizations (although that's already not very reliable with
hand-written source code).
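
For illustration, this is the kind of goto-only loop a translator
might emit (sum_array and the label names are invented for this
sketch). The control-flow graph is the same as for an ordinary for
loop, which is what lets a compiler recover the loop; whether a given
compiler then optimizes it is not guaranteed:

```c
/* Equivalent to: for (i = 0; i < n; i++) sum += a[i]; */
int sum_array(const int *a, int n)
{
    int i = 0, sum = 0;
loop_head:
    if (i >= n)
        goto loop_exit;
    sum += a[i];
    i++;
    goto loop_head;    /* back edge that forms the loop */
loop_exit:
    return sum;
}
```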

(And there's the headache of converting errors/line numbers in C
source back to your original code.)

The #line directive helps here, and you don't have anything else when
you compile to assembly language.
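
A small sketch of how #line is used in generated code, assuming the
code below was produced from line 42 of a source file called
"prog.src" (both the file name and the function names are
hypothetical). After the directive, the C compiler attributes
diagnostics, __LINE__ and __FILE__ to the original source position:

```c
/* Generated from prog.src, line 42 onwards. */
#line 42 "prog.src"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```

The function on the line right after the directive sees __LINE__ as
42, and both functions see __FILE__ as "prog.src", so compiler
messages and debug information refer to the source program rather
than to the generated C file.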

If the errors you have in mind are compile-time errors, they should
not happen with something called a compiler; i.e., you should not
generate code in your target language that the target language
processor (whether it is a C compiler or an assembler) rejects.

- anton
--
M. Anton Ertl
anton@xxxxxxxxxxxxxxxxxxxxxxxxxx
http://www.complang.tuwien.ac.at/anton/
