Re: Insufficient guarantees for null pointers?



Harald van Dijk wrote:
Wojtek Lerch wrote:
Harald van Dijk wrote:
....
conversions to/from char *. When combined with unions and offsetof, how
will the compiler know what the bounds are after converting that char *
to an int *, if it could point to either of two arrays which happen to
overlap?

That may be easy or difficult or impossible, depending on the exact
rules; unfortunately, the text of the standard doesn't tell us the exact
rules. If you read it literally, the only thing it guarantees about
converting your int* pointer to char* and back is that the result
compares equal to the original pointer. There are no words there that
promise that the double conversion preserves the array bounds associated
with the pointer; in particular, there's no guarantee that the converted
pointer points to an object rather than to one past the end of an array.
Everybody knows that the intent was to promise more than the literal
reading does (as a matter of fact, a lot of people seem not to notice
how little the literal reading guarantees); but I doubt everybody agrees
on what exactly the promise was meant to be,
especially when applied to really obscure cases involving pointer
conversions and unions or allocated memory.
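
To make the literal guarantee concrete, here is a minimal sketch of the round trip being discussed. The function name round_trip_compares_equal is made up for illustration. The only thing the sketch relies on is the equality promise; it deliberately does not index or dereference through the converted pointer, since whether the bounds survive is exactly the open question.

```c
/* Returns 1 when an int* converted to char* and back compares equal
 * to the original pointer. That equality is the only property the
 * literal text promises for this double conversion. */
int round_trip_compares_equal(void)
{
    int arr[4] = {0, 1, 2, 3};
    int *orig = &arr[2];

    char *as_bytes = (char *)orig;  /* int*  -> char* */
    int *back = (int *)as_bytes;    /* char* -> int*  */

    /* Nothing here assumes that back still carries the bounds of arr. */
    return back == orig;
}
```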


I'm not sure I'm understanding correctly. I will assume offsetof is
supposed to be useful.

I'm sure it was meant to be useful, just like the whole C standard was meant to be clear and unambiguous and to define a useful language. But since it was created by human beings, there's no guarantee that it has completely achieved all those goals. Reconciling the assumption that some mechanism is useful with what the letter of the standard actually promises about it may require some creative interpreting, and may produce different results depending on who's doing the interpreting and what his criteria of usefulness are. Maybe the only legitimate use of offsetof is to produce pointers that you're going to feed to memcpy()?
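
For what it's worth, the memcpy() use would look something like the sketch below. struct Msg and both function names are hypothetical, chosen only for illustration. The offset is used purely to address bytes through a char pointer derived from the whole object, so no char* is ever converted back to int*.

```c
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

/* Hypothetical structure; tag forces some padding before payload
 * on typical implementations. */
struct Msg {
    char tag;
    int payload[2][2];
};

/* Copy the bytes of payload[1][1] out of *m without ever forming
 * an int* from a char*. */
int read_payload_11(const struct Msg *m)
{
    int value;
    const char *base = (const char *)m;

    memcpy(&value, base + offsetof(struct Msg, payload[1][1]), sizeof value);
    return value;
}

/* Small driver: fills a Msg and reads one element back. */
int memcpy_demo(void)
{
    struct Msg m = { 'x', { {1, 2}, {3, 42} } };
    return read_payload_11(&m);
}
```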

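#include <stddef.h>  /* offsetof */

struct S { union U {
    int a[1][3];
    int b[2][2];
} u; } s;
/* a[1][0] names the int just past the end of a, at the same byte
   offset as b[1][1]. */
int *p1 = &s.u.a[1][0];
int *p2 = (int *) ((char *) &s + offsetof(struct S, u.a[1][0]));
int *p3 = &s.u.b[1][1];
int *p4 = (int *) ((char *) &s + offsetof(struct S, u.b[1][1]));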

As far as I can tell, the standard doesn't say anything about what happens when you convert an arbitrary char* pointer to int* (other than that if it's properly aligned, you get a value that can be converted back). Even if you can prove that it points to the first byte of a structure member or array element whose type is int, that doesn't help much -- since the committee's official response to DR260 allows compilers to keep track of where a pointer came from, there's no guarantee that all char* pointers that point to that int's first byte can be safely converted to int*.

Still, I'm sure the authors of the text meant to promise more than a literal interpretation of the words guarantees -- most likely, the intent was for the converted pointer to point to the object whose first byte the char* pointer points to. Assuming, again, that it's properly aligned and that there are at least sizeof(int) bytes where your char* pointer points. (But how do you count the bytes -- do you have to use the char* pointer's bounds, or is it sufficient to just prove that the bytes in question belong to a bigger object that provides enough storage for an int, even if that storage doesn't fit within the bounds of your char* pointer? Frankly, I have no idea. In your particular example, it makes no difference anyway.)

But when the int object is part of a bigger object whose declared type allows seeing the int as an element of several differently located arrays inside a union, or even as the phantom object just past the end of yet another array, the standard doesn't say how to pick the array that determines the range of integers that can be safely added to your pointer, does it? I have to admit that I have absolutely no clue about the intent here. And the trouble is that since on pretty much all practical implementations, the result of overflowing the bounds, whatever they are, is what you'd normally expect, few people seem to care, even if they actually understand that there's a hole in the spec.
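
The overlap itself is easy to demonstrate, even though the bounds question is not. The sketch below reuses struct S from the example in this thread; the two function names are made up. It only computes byte offsets, and assumes the compiler accepts array subscripts in the offsetof member designator, as common implementations do. It shows that the last element of b sits exactly where the phantom element past the end of a would be, and it says nothing about which array's bounds a pointer to that int is supposed to carry.

```c
#include <stddef.h>  /* offsetof, size_t */

struct S { union U {
    int a[1][3];
    int b[2][2];
} u; };

/* Byte offset of the last element of b. */
size_t offset_of_b11(void)
{
    return offsetof(struct S, u.b[1][1]);
}

/* Byte offset just past the end of a, i.e. where a[1][0] would be. */
size_t offset_past_end_of_a(void)
{
    return offsetof(struct S, u.a) + sizeof(int[1][3]);
}
```

Both functions return 3 * sizeof(int), because u is the first member of struct S and every member of a union starts at offset 0.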