Re: Soupçon of cedilles and aperçus



In article <460c8e42$0$16554$afc38c87@xxxxxxxxxxxxxxxxxxxx>,
Peter Moylan <peter@xxxxxxxxxxxxxxxxxxxxxx> wrote:
Leslie Danks wrote:
Martin Ambuhl wrote:

Mike Lyle wrote:

Serious question: isn't this ASCII? alt-0199: Ç . . . alt-0231:
ç.
No. All ASCII codes are in the range 0 ... 127 (decimal).

Above that are the so-called "extended ASCII codes", which you can
read about at:

<http://en.wikipedia.org/wiki/Extended_ASCII>

For the benefit of those who didn't read that wiki article, it should be
underlined that there isn't just one extended ASCII set. There are many,
all mutually incompatible. (Or partially incompatible: some characters
turn out by good fortune to have the same encoding in two or more
different codes.) An example of the incompatibility can be seen by
comparing the postings of Martin Ambuhl and blmblm (who both quoted
Mike's question) in this thread: same text, different character codes,
and therefore different end results.
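
To make the incompatibility concrete, here is a minimal Python sketch that decodes the two byte values from Mike's question (199 and 231) under a few 8-bit code pages. The choice of code pages and the helper name decode_byte are illustrative assumptions, not anything taken from the posts themselves.

```python
# Byte values from Mike's question: alt-0199 and alt-0231.
CODES = [199, 231]

# A few 8-bit "extended ASCII" code pages (an arbitrary illustrative selection).
CODE_PAGES = ["cp1252", "latin-1", "cp437", "cp850"]


def decode_byte(value, encoding):
    """Interpret a single byte value as a character in the given code page."""
    return bytes([value]).decode(encoding)


def is_ascii_code(value):
    """ASCII proper covers only the codes 0 through 127."""
    return 0 <= value <= 127


if __name__ == "__main__":
    for enc in CODE_PAGES:
        chars = " ".join(decode_byte(c, enc) for c in CODES)
        print(f"{enc:8s} {chars}")
    # UTF-8 is not a single-byte code page: each character takes two bytes.
    print("utf-8   ", "Ç".encode("utf-8"), "ç".encode("utf-8"))
```

Under cp1252 and latin-1 the two bytes come out as the intended cedilla characters; under cp437 the same bytes are a box-drawing character and a Greek letter, and the upper-case C with cedilla lives at byte 0x80 instead.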

Huh. Contrasting data point:

When I view all three postings (Mike's, Martin's, and mine) in my
newsreader (trn), what I see is:

In Mike's post, neither of the non-7-bit-ASCII characters displays.

In Martin's post, the first one displays as "^G" and the second as
a lower-case c with cedilla.

In my post, things display as for Martin's post.

When I look at all three posts in Google's archives, using Firefox,
all three display both characters as Mike intended (upper-case
and lower-case c's with cedillas).

A bit more below ....

[ snip ]

So that the receiver can know which code the sender is using, an
indication of the code is included in the MIME header lines of the
message. Unfortunately Mike does not have MIME enabled in his software,
which means that he can only send the 128 ASCII characters reliably. He
can have the impression of being able to send some extended characters
as well, but what the reader sees might or might not be garbled.

Aha. Makes perfect (?) sense.

(Geek stuff here, skip if not interested.)

I notice when I look at the headers of all three posts (Mike's,
Martin's, and mine), Martin's is the only one with MIME header lines:

Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Mike's has a line:

X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1807

which may or may not be relevant.

This would seem to account for why you see a difference between
Mike's post and mine. I'm not sure what my newsreader (trn) is
doing with these lines -- apparently not the right thing, since
I don't see the characters displayed correctly in Martin's post
either -- but it does at least seem vaguely aware of the existence
of MIME, judging by what I find in a quick skim of its source code.
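
For the curious, here is a rough Python sketch of what a reader could do with those header lines, using the standard library's email package. The function name decode_body, the ASCII fallback for a missing charset, and the guess that the header-less message was sent as cp1252 are all assumptions made for illustration; I have no idea what trn or any other newsreader actually does internally.

```python
from email import message_from_bytes


def decode_body(raw, fallback="ascii"):
    """Decode a message body using the charset named in its MIME headers.

    If no charset is declared, fall back to `fallback` (an assumed policy)
    and replace undecodable bytes with U+FFFD.
    """
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or fallback
    payload = msg.get_payload(decode=True)  # raw body bytes
    return payload.decode(charset, errors="replace")


# A message with the three MIME header lines quoted above.
with_mime = (
    b"Mime-Version: 1.0\n"
    b"Content-Type: text/plain; charset=UTF-8; format=flowed\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + "alt-0199: Ç".encode("utf-8")

# A message with no charset declared; the cp1252 body is an assumption.
without_mime = (
    b"X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1807\n"
    b"\n"
) + "alt-0199: Ç".encode("cp1252")

if __name__ == "__main__":
    print(decode_body(with_mime))
    print(decode_body(without_mime))
```

With the charset declared, the body decodes to the intended character. Without it, the reader has to guess, and under the ASCII fallback assumed here the byte becomes a replacement character.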

The present state of the art is that almost all modern newsreaders
support MIME - although Outlook Express, for some reason, has it
disabled by default - and something like half of them support some form
of Unicode. We must, however, also take into account those who cling to
older newsreaders such as tin and slrn. These have very limited
character set support, but remain popular because in most other respects
they are superior to the modern newsreaders.

Happy to hear someone else say so. Just out of curiosity, what
are some of the ways in which you think these old newsreaders
(and I'd put trn in the same class as tin and slrn) are better?
What keeps me sticking with trn -- aside from familiarity and
sheer stubbornness, neither trivial -- is that it runs happily
in a text terminal, which means driving from the keyboard and
more-easily-controlled font size, and it calls my text editor of
choice (vim) for composing messages.

--
Decline To State
(But the e-mail address in the header is real.)