Mandatory tail call elimination (was: Needed features in upcoming standards to support VM's & language compilers)



Here's a draft proposal for how C could define mandatory tail call
elimination. I think I've addressed the issues raised so far.

It's big and hairy enough that I expect it's more of an argument for why
this doesn't belong in C, but it was quite fun to write anyway:-)

Note that I've never written more than a toy compiler, and it looks like
I really ought to before talking too much about this. Not to mention
that I should have looked closer at the Scheme feature first.

Do anyone know of a language with mandatory tail call elimination but no
garbage collection?

[[[
Repeating from previous messages:

Informal definition:

A tail call is a function call which is the last operation in a
function. Mandatory tail call elimination means, Quoting the R5
Scheme report, that the implementation must support an unbounded
number of active tail calls.
http://square.umin.ac.jp/~hchang/ksm/r5rs-html/r5rs_22.html

Purpose:
Programs in languages like Scheme which require tail call elimination
might more easily be auto-translated to C if C also supported it.
I don't know if that applies to this rather limited version of the
feature, however. It can't pass pointers into local variables as
arguments, for example.

In any case, mandatory tail call elimination also allows C programs
to be structured differently than today. E.g. instead of writing
a loop one might use recursion, and depend on the compiler to turn
it back into a loop instead of eating up more and more stack memory
for each "iteration".
]]]

Each semantic element in this draft has its own symbol. In a real
proposal, some of the symbols might be merged.

Symbols:
Function attributes _TailCallable, _TailEntry, _TailCaller.
Operator _TailEntry_of(function) ==> function.
Tail-call syntax return _TailCall(function)(args); and variants.

Example:
/* This has 2 entry points: For regular calls and for _TailCall. */
int _TailCallable foo(char *s);
int _TailCallable foo(char *s) { /* ... */ }

/* Pointer to the function entry point when using _TailCall. */
int _TailEntry (*tfoo)(char *) = _TailEntry_of(foo);

int bar(char *s);
int _TailCaller bar(char *s) { /* Function doing a tail call. */
/* ... */
return _TailCall(foo)(s); /* Or return _TailCall(tfoo)(s). */
}

Semantics:

When function F does the tail-call "return _TailCall(G)(args);", F's
scope is left *before* the call to G. Conceptually, it works like this:

1. F() returns a struct with {G, args} as if it had that return type,
2. G(args) is called instead of resuming F's caller,
3. If G also does a tail call, it goes to step 1 (with its own struct).
4. When G or a callee does a regular return, it returns to F's caller
as if F had done so.

Thus F's local variables and setjmp targets are dead when entering G.
In C++, this ensures destructors for F's local variables are invoked,
after <args> are saved using their copy constructors. I hope C++ lets
the compiler omit <args>'s constructor-destructor calls if <args> do not
move in memory. Also try-except-finally or whatever C++ call them are
finished before the tail call.

All this is needed because (1) F must not retain any resources such as
the memory for its stack frame, and (2) F cannot clean up after itself
after the tail call since control will never return to it.

An ordinary function call never has _TailCall semantics. So much magic
should not be invoked without spelling it out where it happens. Also,
just because a function supports _TailCall() doesn't mean it would work
right to use _TailCall() wherever that would compile successfully.

Conceptually, this can be implemented with wrappers and hand-waving as
below. In practice, this machinery can hopefully be omitted, since tail
calls are supposed to be as efficient as loops.

For function foo():
- The compiler puts the function body in a function named "_TAIL_foo".
- "foo" becomes the name of a wrapper function which loops through
steps 1-3 above, initially calling _TAIL_foo().
- "return _TailCall(foo)(args)" returns {_TAIL_foo, args} to the
wrapper, so it'll know what to tail-call. Plus info about calling
conventions if necessary, so it'll know how to do the call.
- A regular return from _TAIL_foo() first leaves _TAIL_foo and then
returns from the enclosing wrapper function. (Step 4 above.)

More hand-waving required: if foo() does a 3-argument tail call to bar()
which then does an 8-argument tail call, where can _TAIL_bar() put the
remaining 5 arguments it wants to return to the wrapper? Preferably not
in malloced memory, we're trying to match loops for efficiency here...
The constraint that the functions have the same prototype (below) helps,
but does not prevent this scenario for variable-argument functions.

Both with or without the conceptual model above, there might be trouble
with managing stack frames and the like, and with machine instructions
that manage stack frames. Maybe also signals and C++ exceptions at the
wrong time could cause trouble.

Even without implementing the wrappers, it can take some work to
arrange the arguments right. Consider f(a, b) doing _TailCall(f)(b, a):
In C++, swapping the arguments (tmp = a; a = b; b = tmp) involves
3 constructor calls and 1 or 3 (I don't remember) destructor calls.

Function attributes:

* _TailEntry: The function may only be called using _TailCall().
The attribute also causes name mangling: external name =
"_TAIL_<function name in the C program>".

* _TailCallable: Function foo with this attribute has two entry points:
_TAIL_foo() which is the _TailEntry function, and
foo() which is the wrapper above.

* _TailCaller: Required for functions that use _TailCall().
Applies only to the function definition, not the declaration.
Works like _TailCallable so the function can handle steps 1-4 above,
except the _TailEntry function is not exposed to the programmer.

Operators:

_TailEntry_of returns a _TailCallable function's _TailEntry counterpart.

_TailCall(func) looks like an operator but is a syntax which turns a
regular function call into a tail call. It cannot be used in other
contexts, e.g. assigning to a function pointer. (Sorry, it was the
best syntax I could think of at the moment.)

Constraints:

* Calls to _TailEntry functions must use the _TailCall keyword.

This is to ensure that the tail-call semantics is spelled out at the
place where it occurs. Otherwise the _TailCall and _TailEntry_of
keywords could have been merged. Well, they can anyway, but such
syntax-vs-operator overloading may complicate things for the compiler.

* _TailCall's operand must be a _TailCallable or _TailEntry function,
while _TailEntry_of's operand must be a _TailCallable function.
TODO: Maybe a _TailCallable operand may only be a named function, not
e.g. function pointer, so mapping to the _TailEntry function can be
done at compile time.

Further constraints for _TailCall:

* The enclosing function must have attribute _TailCaller.

* The result may be used only where the function call actually can
be performed as a tail call - the last operation in a function.
E.g. return x || _Tailcall(f)(y) but not x + _Tailcall(f)(y).
The compiler must be able to verify this. TODO: How smart
should the compiler be required to be for this verification?

* Caller and callee must have the same prototype.

This is because one function will handle the return from the call to
the other function. So they must have the same return type, and the
same or cooperating calling conventions. E.g. put the return values
in the same register, destroy the stack frames in the same way.

At least for external functions, the compiler only has the prototype
to deduce the calling conventions from. Plus implementation-specific
details, but not anything the standard can depend on.

The obvious way would have been to let _TailCaller/_TailCallable in
the prototype affect calling conventions. But _TailCaller may be just
an implementation detail, so its presence in the function declaration
should have no effect other than infecting the function definition.
Then _TailCallable cannot be required to affect calling conventions
either, since it must match _TailCaller.

--
Hallvard
.



Relevant Pages

  • Re: CAS, Lambda Calculus, Continuation and Functional Programming
    ... >>that makes tail call elimination less attractive for compiler writers. ... > I don't know why exception handling should prevent tail-call ... so a local array would be best (in languages ...
    (sci.math.symbolic)
  • Re: Iteration in lisp
    ... Nearly all compilers seem to do tail call elimination ... My remarks were directed not any user making a properly informed ... Common Lisp, I generally mean "any Common Lisp"; when I want to talk about ...
    (comp.lang.lisp)
  • Re: Was not making tail recursion elmination a mistake?
    ... >> I have recently been annoyeed by the fact that tail recursion ... >> elimination and must use loops or some simular structue. ... > have been an optional feature. ...
    (comp.lang.lisp)
  • Re: TAIL (optimised tail call)
    ... call generation is really a compiler optimization. ... Some programmers don't want tail call optimization because it would ... What's needed is a standard way to turn off tail call optimization ... And, finally, if you define FOO EXIT to be a tail call, what will EXIT ...
    (comp.lang.forth)
  • Re: TAIL (optimised tail call)
    ... foo then; ... should be considered tail calls, ... A very simple compiler of the form ... Optimized RECURSE EXIT can be replaced by AGAIN (and BEGIN at the ...
    (comp.lang.forth)

Loading