Re: accentuation mark



In article <5xt*I8ens@xxxxxxxxxxxxxxxxxxxxxxxxxxx>, Theo
Markettos <theom+news@xxxxxxxxxxxxxxxxxxxxxx> wrote:

However you'll get into trouble if any of the following
happens:

 - Your outbound text is tagged as us-ascii not iso-8859-1
 - Your news/mail stream is only 7 bit clean (unlikely these
   days, though there exists Quoted-Printable to avoid this -
   I don't know if any RISC OS clients send in this way)
 - Your recipient is ignoring the charset and interpreting it
   as that of their own machine (which might be another
   ISO-8859, UTF-8 or another Far Eastern encoding)
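
As a rough illustration of the 7-bit point above, here is a minimal Python sketch of what Quoted-Printable does to an ISO-8859-1 accented character. Python and the sample word are assumptions chosen for illustration; this is not how any RISC OS client is known to work.

```python
import quopri

# Sample text with one accented character (an assumption for illustration).
text = "café"

# ISO-8859-1 stores é as the single byte 0xE9, which needs an 8-bit clean path.
raw = text.encode("iso-8859-1")

# Quoted-Printable rewrites each non-ASCII byte as "=" plus two hex digits,
# so the result contains 7-bit bytes only.
qp = quopri.encodestring(raw)

# The receiving end reverses the transfer encoding, then applies the charset.
restored = quopri.decodestring(qp).decode("iso-8859-1")

print(raw)       # the 8-bit form
print(qp)        # the 7-bit safe form
print(restored)  # the original text again
```

The charset label still matters after decoding: Quoted-Printable only protects the bytes in transit and says nothing about which character each byte stands for.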

There is an annoying bug / feature in Pluto, which causes it
to hang on to a previously declared non-ISO-8859-1 charset.

If you have read an e-mail that declares ISO-8859-2, and I
get quite a few, even from within the UK, then **any**
subsequent e-mail that you write **as a reply** also
routinely declares ISO-8859-2.

This happens to news postings too; see
<4fe1cdfdacsee.sig@xxxxxxxxxxxxxxxxxxxxxxxx> for an example,
where the Polish upper-case dark L (Ł) is substituted for the
GBP sign (£).
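
The substitution follows directly from the two code tables: the same byte value means different things in ISO-8859-1 and ISO-8859-2. A small Python sketch (Python is used here only for illustration, and the variable names are made up):

```python
# The pound sign typed on an ISO-8859-1 system is the single byte 0xA3.
pound_byte = "£".encode("iso-8859-1")

# A reader that honours a correct iso-8859-1 label sees the pound sign.
as_latin1 = pound_byte.decode("iso-8859-1")

# A reader told the text is iso-8859-2 maps the same byte to L with stroke.
as_latin2 = pound_byte.decode("iso-8859-2")

print(pound_byte.hex(), as_latin1, as_latin2)
```

Nothing in the message body changes between the two cases; only the declared charset differs.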

I should have gone to Pluto's queue and edited the charset,
but I forgot.

You can, of course, use this in reverse: if you **want**
eastern European characters in your e-mails, you can use
!XChars to insert them. Initially they will display on
screen as ISO-8859-1 characters, but if you then edit the
raw e-mail in the queue from ISO-8859-1 to ISO-8859-2, they
display correctly.

In the case of Pluto, which originated this thread, the
posting was correctly tagged as iso-8859-1 and I could
read it fine in my UTF-8 news client.

It does not, though, work correctly the other way round.

I get an increasing number of e-mails from Germany and
France that declare UTF-8. They display correctly the first
time, but subsequent openings produce text where every
accented character plus the following character is replaced
with #.

The cure is to export the e-mail, change the charset from
UTF-8 to ISO-8859-1 and reimport into Pluto. A pain.
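
What Pluto does internally is not stated here, so the following is only a guess that happens to fit the cure: if the stored body ends up as ISO-8859-1 bytes while the header still declares UTF-8, a UTF-8 decoder will reject every accented character. A Python sketch of that assumed situation (the sample phrase and names are hypothetical):

```python
# Assumption: the body on disk is ISO-8859-1 bytes, but the header says UTF-8.
stored = "café crème".encode("iso-8859-1")

# Decoding those bytes as UTF-8 fails at each accented byte, because 0xE9 and
# 0xE8 announce a multi-byte sequence that never arrives. Python marks each
# failure with U+FFFD; other software may show "#" and may also swallow the
# byte that follows.
shown = stored.decode("utf-8", errors="replace")

# Relabelling the same bytes as ISO-8859-1 recovers the text unchanged.
fixed = stored.decode("iso-8859-1")

print(shown)
print(fixed)
```

Python's decoder substitutes only the offending byte, so it does not reproduce the loss of the following character described above; the sketch shows the mismatch, not Pluto's exact output.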

I am getting increasingly adept at reading both German and
French with missing characters and # in!

--
Russell
http://www.russell-hafter-holidays.co.uk
Russell Hafter Holidays E-mail to enquiries at our domain
Holiday specialists for Germany, Alsace, Austria, Belgium, Luxembourg, Czech Republic