Re: Trigraphs forever



On Mon, 18 Aug 2008 01:20:47 +0200, jacob navia <jacob@xxxxxxxxxx> wrote:

Geoff wrote:
GCC disables trigraphs by default.

To get the preprocessor to support them you must specify
GCC_ENABLE_TRIGRAPHS, use the -trigraphs command-line switch, or select
-ansi or one of the ISO -std= dialects. In those cases it emits a
"parse error" message.

Visual C versions 6.0 and later emit "warning C4010: single-line comment
contains line-continuation character" at the offending line followed by
syntax errors on the lines that get parsed incorrectly as a result.

I think I prefer GCC's defaults.

I also like their caveat about trigraphs and the option to warn if the code
contains them:

"Trigraphs are not popular and many compilers implement them incorrectly.
Portable code should not rely on trigraphs being either converted or
ignored. With -Wtrigraphs GCC will warn you when a trigraph may change the
meaning of your program if it were converted. See Wtrigraphs."

http://gcc.gnu.org/onlinedocs/cpp/Initial-processing.html

Sounds like the GCC team got to work on their implementation and created
the necessary flexibility in their conformance modes and didn't waste their
time bitching about whether trigraphs belonged in the standards or not.

Yes, "wasting their time bitching about trigraphs"...

If you think that discussing the current C standard is a waste of time
you are in the wrong group.


Discussing the current C standard is not a waste of time. Discussing the
latest bug reported by a user of lcc-win is not discussing the latest C
standard.

It is clear that your compiler malfunctioned and didn't issue a proper
diagnostic when encountering what it interpreted to be a ??/ trigraph in a
one-line comment. In ANSI C or C89 that one-line comment would have been
illegal anyway.

While the comment is legal C99, it is a preprocessor bug to have parsed the
"// Why this??????/" line as a continuation line, since everything from the
// to the end of that line should have been ignored as a comment by the
preprocessor anyway. (GCC is guilty on that score in -std=c99 mode as
well.) It faults on the closing brace line.

Eating the opening brace is not proper behavior in any case because a
continuation line of:

if (a < 0) \
{
//stuff
}

is legal C99 and is correctly parsed.
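
For reference, here is a small sketch of that continuation form inside a
complete function. The names clamp_to_zero and check_clamp are hypothetical
and the body is invented purely for illustration. Translation phase 2
removes the backslash-newline pair, so the compiler sees the if and the
opening brace on one logical line.

```c
#include <assert.h>

/* Hypothetical example function: returns 0 for negative input. */
static int clamp_to_zero(int a)
{
    if (a < 0) \
    {
        // stuff (a C99 one-line comment with no trailing backslash)
        a = 0;
    }
    return a;
}

/* Small self-check; call it from main() to exercise the function. */
static void check_clamp(void)
{
    assert(clamp_to_zero(-5) == 0);
    assert(clamp_to_zero(7) == 7);
}
```

The brace survives here because the backslash sits in ordinary code, not at
the end of a // comment.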

It seems to me the correct behavior of any compiler accepting one-line
comments should be to ignore all tokens following the // until the newline,
which is, if I am not mistaken, what the standard specifies. This whole
issue arises out of the fact that single-line comments rely on the
occurrence of newline for closure, and just about nothing else in C ever
has.

Even so, the string "if (a < 0) // Why this??????/\n{\n" shouldn't have
parsed to "if (a < 0) \n\n", which is essentially what is happening.

I think it is very important to have a correct defining document
for the language we are using. That's why I participate in the
discussions about the C standard.


The standards document is correct, by definition, since it documents the
intent of the standard, but 5.2.1.1 interacts poorly with 6.4.9: it appears
to require parsing of trigraphs before parsing of comments, when in fact
the compiler should be in the single-line comment state and (IMO) looking
for newline and nothing else.

6.4.9
"2 Except within a character constant, a string literal, or a comment, the
characters // introduce a comment that includes all multibyte characters up
to, but not including, the next new-line character. The contents of such a
comment are examined only to identify multibyte characters and to find the
terminating new-line character."

Given 6.4.9 and 5.2.1.1, the compiler should parse "if (a < 0) // Why
this??????/\n{\n" to "if (a < 0) \n", making the { part of a two-line
comment, which explains why the bug shows up in a compiler that tries to
satisfy both requirements.
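
To make that ordering concrete, here is a minimal sketch of translation
phases 1 to 3 applied to a source string. It is not how GCC or lcc-win is
implemented. The names phase1_trigraphs, phase2_splice,
phase3_line_comments, translate_sketch and SKETCH_BUF are made up for
illustration. Only the one trigraph that maps to a backslash is handled,
and string literals, character constants and block comments are ignored.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_BUF 256  /* arbitrary size chosen for this sketch */

/* Phase 1 (simplified): replace only the '?' '?' '/' trigraph with a
   backslash. The other eight trigraphs are left untouched here. */
static void phase1_trigraphs(const char *in, char *out)
{
    size_t n = 0;
    while (*in != '\0') {
        if (in[0] == '?' && in[1] == '?' && in[2] == '/') {
            out[n++] = '\\';
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
}

/* Phase 2: delete every backslash that is immediately followed by a
   new-line, splicing the two physical lines into one logical line. */
static void phase2_splice(const char *in, char *out)
{
    size_t n = 0;
    while (*in != '\0') {
        if (in[0] == '\\' && in[1] == '\n') {
            in += 2;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
}

/* Phase 3 (simplified): replace each one-line comment with a single
   space, keeping the new-line that terminates it. */
static void phase3_line_comments(const char *in, char *out)
{
    size_t n = 0;
    while (*in != '\0') {
        if (in[0] == '/' && in[1] == '/') {
            while (*in != '\0' && *in != '\n')
                in++;
            out[n++] = ' ';
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
}

/* Run the three phases in order. dst needs room for strlen(src) + 1
   bytes; no phase ever makes the text longer. */
static void translate_sketch(const char *src, char *dst)
{
    char a[SKETCH_BUF], b[SKETCH_BUF];
    assert(strlen(src) < SKETCH_BUF);
    phase1_trigraphs(src, a);
    phase2_splice(a, b);
    phase3_line_comments(b, dst);
}
```

Calling translate_sketch("if (a < 0) // Why this?????" "?/\n{\n", out)
leaves the opening brace swallowed by the comment, and the comment itself
is replaced by a single space. The literal is split in two so the demo is
not itself rewritten by a compiler running with trigraphs enabled.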

GCC correctly emits a warning on the offending line when -Wtrigraphs or
-Wall is specified even though it has the same bug as lcc.

Personally, I think it was a mistake to allow a single-line comment to be
continued with \ to become a multiline (block) comment and 6.4.9 should be
amended to read:

"...The contents of such a comment are examined only to find the
terminating new-line character."

All the silly examples that follow, of single-line morphed into multi-line
comments, disappear. You will also notice that not a single example in
6.4.9 includes a trigraph set.

Single-line comments ought to be single-line comments, and block comments
are for multiline comments; anything else makes the code hard to read.

Maybe the gcc people do not care about C any more (they only
participate in the C++ discussions).


They fixed their compiler to deal with it in a manner to be selected by the
programmer and it emits diagnostics that can identify the programmer error.

By your own admission, this is "an obscure syntax error in my compiler".

C never had // until it adopted it (poorly, apparently) from C++.
Makes me wonder what the C++ standard has to say about trigraphs.
