Re: Question about 6.2.4 of C99



On 17/03/2011 4:23 AM, Tim Rentsch wrote:
Wojtek Lerch<wojtek_l@xxxxxxxx> writes:

On 13/03/2011 12:23 PM, Tim Rentsch wrote:
Wojtek Lerch<wojtek_l@xxxxxxxx> writes:
...
The current C99 semantics forbid
compilers to reuse memory in some cases, unless the compiler goes
through the pain of proving that the as-if rule applies.

Yes, and I don't see anything wrong with that. In almost all
cases the burden of doing optimization (a space optimization
in this case) should be pushed on to the compiler rather than
muddying up the semantics and/or making the programmer work harder.

Muddying the semantics? You really believe the semantics are clearer
when non-VLA variables are declared and "initialized" in the middle of
their lifetime, than it would be if the declaration created and really
initialized the variable, the same way it does for VLAs?

In terms of program transformation and simplicity of
correctness arguments... Yes, yes I do.

Why is program transformation so important? Is it more important than consistency between VLAs and non-VLAs?

How do the existing semantics improve simplicity of correctness arguments? If my semantics were in effect, you could trivially transform a "My C" program into an equivalent C99 program by adding a pair of braces to mark the lifetime of every automatic object, and then you could forget about the difference between my semantics and C99. Would that hurt simplicity of correctness arguments too much?
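To make the transformation concrete, here is a minimal sketch. The function names are made up for illustration and do not come from any real code. The first function is what one would write under "my" semantics, where the lifetime of scaled starts at its declaration; the second adds the pair of braces, so that the block begins exactly where the declaration is and both sets of rules agree on where the lifetime starts.

```c
/* As written: under C99, the lifetime of scaled begins on entry to the
   function body, even though its declaration comes after the loop. */
int sum_as_written(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    int scaled = total * 2;     /* declared in the middle of the block */
    return scaled;
}

/* Transformed: a brace is opened just before the declaration and closed
   at the end of the enclosing block, so the lifetime of scaled starts at
   the brace under C99 and at the declaration under the alternative rule;
   the two points coincide. */
int sum_with_braces(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    {
        int scaled = total * 2;
        return scaled;
    }
}
```

Both functions compute the same result for the same input; the braces change nothing except where the block containing scaled begins.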

Making the programmer work harder? You mean the programmer who has a
habit of jumping back to a spot before the declaration and using a
saved pointer to access the object from there, and finds it hard to
move the declaration up to before the first place in the code that
needs to access it?

No, I mean the programmer who has the misfortune to work on
code previously or also worked on by programmers like the
one you describe.

But if C99 had chosen my preferred semantics, such code would be just as invalid as code that jumped out of a block and tried to access the block's automatics from there.

Do you have any evidence of his existence?

Generically, yes. The key thing is, based on past
experience it seems a safer bet to suppose they do
than to rule out the possibility.

Well yeah; obviously it would be foolish to assume that there are no programmers who write broken code. But if the rule had always been that you're not supposed to jump to a place before a declaration and access the object from there through a pointer, I don't see how that would make anybody work harder compared to the C99 rule that you're not supposed to jump to a place before the opening brace of the block containing the declaration and access the object from there through a pointer.
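For reference, the kind of code we are arguing about looks roughly like the sketch below (the function name is hypothetical). As far as I can tell it is valid C99, because the lifetime of x begins on entry to the function body and not at the declaration; under the semantics I would have preferred, the read through saved would be as invalid as reading a block's automatic object after leaving the block.

```c
#include <stddef.h>

/* Reads x through a saved pointer from a point that is textually before
   the declaration of x but still inside the block that contains it. */
int read_before_declaration(void)
{
    int *saved = NULL;

again:
    if (saved != NULL)
        return *saved;  /* C99: x is within its lifetime here */

    int x = 42;         /* the declaration is reached exactly once */
    saved = &x;
    goto again;         /* jump back to before the declaration */
}
```

Under the C99 rules the call should return 42, since x keeps its last stored value until its declaration is reached again.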

....
Why
do you believe that applying that rule consistently to all automatic
objects would have made the semantics even more complicated than they
are when the rule for non-VLA objects is different? One rule for all
objects seems simpler to me than two different rules for two different
kinds of objects.

The VLA rules make correctness arguments more difficult.

You mean the lifetime rule, not the ban on jumping into the scope bypassing the declaration, right? Why do they make correctness arguments more difficult?

Having
only one rule for both would be easier to state but makes reasoning
about the program more difficult.

Easier to state, remember, and understand, right?

If it does not, reaching the
declaration makes the object's value indeterminate. If there's code
in the block that accesses the object without using its name, moving
the declaration between that code and the first time the name is used
can change the semantics. [snip elaboration]

Yes, you are right about this, and that was my oversight. However
that doesn't affect the basic idea, because we can get around the
problem of this rule by doing, e.g.,

goto SKIP_IT;
int a;
SKIP_IT: ;

which still allows the in-block declaration without getting
the spurious indeterminate value setting. (Note: in
cases where it might be a good idea to safeguard against
that.)

I have to admit that I have no idea why you find it so important to be able to move declarations around without changing the meaning of the program, but if you find the above acceptable, then what would be wrong about also adding a pair of braces to make your correctness arguments insensitive to whether the lifetime starts at the declaration or the beginning of the block?
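For concreteness, the quoted workaround can be spelled out as the sketch below (the function and variable names are mine, chosen for illustration). The goto ensures that the declaration of a is never executed, so the rule that reaching a declaration without an initializer makes the value indeterminate never fires, and a keeps its value across the backward jumps.

```c
#include <stddef.h>

/* Adds 10 to a on each of three passes through the block and returns
   the final value through a saved pointer. */
int accumulate_across_jumps(void)
{
    int *p = NULL;
    int rounds = 0;

loop:
    if (rounds == 3)
        return *p;      /* reads a through the saved pointer */

    goto SKIP_IT;       /* never execute the declaration itself */
    int a;
SKIP_IT: ;

    if (p == NULL) {    /* first pass only: remember the address, set a */
        p = &a;
        a = 0;
    }
    a += 10;
    rounds++;
    goto loop;
}
```

As I read the C99 rules, this returns 30. Without the goto SKIP_IT line, the second pass would reach the declaration again and a += 10 would read an indeterminate value.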

Have you actually encountered such a case in real life, outside of
comp.std.c? Did you think it was good code? If the answer to both is
yes, I would be very curious to see the code.

Personally I tend to prefer putting declarations only at
the beginning of blocks. What matters to me is how easy
it is to reason about code I see that other developers
have written. The C99 rule (for regular variables) makes
such reasoning easier than if the VLA rule applied to
the non-VLA variables.

Can you give me an example?

Well no, I'm not proposing that C99 be changed; I'm trying to
understand why C99 originally made a choice that I find ugly and
counter-intuitive.

Oh okay. I don't find it counter-intuitive; perhaps not the
most immediately obvious, but comparing this choice to the
contortions of VLA semantics does show it clearly (IMO anyway)
to be the better choice.

As I said, I strongly suspect that what you refer to as contortions
are not caused by when the lifetime of VLAs starts, but are meant to
make it impossible to access a VLA whose size has never been computed.

I don't care what causes the contortions;

My point is that they have nothing to do with this discussion. We're arguing about whether C99 would be better if it had decided to start the lifetime of non-VLAs at the declaration instead of the beginning of the block; the "contortions" intended to forbid jumping around the declaration of a VLA have nothing to do with that argument.

having to take
them into account requires additional mental effort so
I would rather not have to in any cases where it isn't
necessary to do so.

....
Seems like a non-issue, since the compiler can (and I think most
will) handle all the easy cases, and those C programmers who care
about space efficiency mostly _will_ realize that adding the
extra pair of braces can help with this. Wouldn't you agree?

No. A compiler will often have to give up, especially if the code
calls functions outside of the current translation unit and the
compiler has no way to prove that they don't save or use pointers to
the objects. When dealing with large arrays, calling functions is not
a rare thing to do.

I suspect you haven't thought this through. Code like the
example you gave earlier (but since snipped somewhere along
the way) is easy to analyze in almost all cases where it
actually occurs, whether or not there are function calls
or pointer usage.

It's easy to notice, if you bother to try, that there's no goto or setjmp that could lead to a point between the beginning of the block and the declaration; but if there is, figuring out whether the address of the variable can possibly be used after the jump may be much trickier, especially if the address is passed to external functions such as read() or strcpy() or user-defined functions that are not known to the compiler.

And again, I don't have any statistics about programmers who care
about space efficiency, but I know I am one of them, and it took me
more than ten years to realize that the extra braces are necessary.
And most programmers I know are less familiar with these kinds of
obscure quirks of the C standard than I am, and will never read this.

I find this statement surprising. A large part of why Algol
introduced blocks was so variables could be used temporarily
and then space for them reclaimed (without the compiler having
to be very clever). I guess I expect anyone who has programmed
in C for a long time would be aware of this idea (and undoubtedly
C got some of this influence from Algol).

I don't think we're talking about the same idea.

I'd expect anybody who has programmed in C89 for a long time to be used to the fact that an automatic object cannot be accessed from any spot before its declaration. Doesn't the same rule apply in C++, even though C++ allows mixing statements with declarations? It's only C99 that introduced the possibility to jump to before the declaration and access the object from there. Since I don't find it obvious, useful, or horribly harmful, and I don't recall it being loudly advertised when C99 was new, I only feel a little embarrassed about not noticing it for so long.

Did Algol allow it too?

Then again, these days compilers tend to do this sort
of space performance optimization (or "optimization")
regularly and fairly well, so putting in nested blocks
is hardly ever necessary for that reason, much like
the 'register' keyword which is hardly ever needed
anymore.

I just ran an experiment with the compiler I'm using (GCC 4.4.2) and the result is disappointing:

$ cat foo.c
#include <stdio.h>

int main( void ) {
    {
        char arr1[ 1000 ];
        printf( "%p...%p\n", arr1, arr1+1000 );
    }
    char arr2[ 1000 ];
    printf( "%p...%p\n", arr2, arr2+1000 );
    return 0;
}
$ cc -ofoo foo.c && ./foo
8046e90...8047278
8047278...8047660
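The same check can be done without reading addresses by eye. The following is only a sketch: the function names are hypothetical, uintptr_t is an optional type in C99 (assumed to be available here), and the answer depends entirely on the compiler and its options.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if [a, a+len) and [b, b+len) share at least one address. */
int ranges_overlap(uintptr_t a, uintptr_t b, size_t len)
{
    return a < b + len && b < a + len;
}

/* Returns 1 if arr2 was placed on storage that arr1 occupied, else 0. */
int stack_was_reused(void)
{
    uintptr_t first;
    {
        char arr1[1000];
        arr1[0] = 'x';
        /* Convert while arr1 is still alive; the integer stays usable
           after the block ends, unlike the pointer itself. */
        first = (uintptr_t)(void *)arr1;
    }
    char arr2[1000];
    arr2[0] = 'y';
    return ranges_overlap(first, (uintptr_t)(void *)arr2, sizeof arr2);
}
```

In the run shown above the two ranges are adjacent rather than overlapping, so stack_was_reused() would be expected to return 0 with that compiler; other compilers or optimization levels may well differ.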