Thunderbird bugs [was: lots of other topics]
- From: Peter Moylan <peter@xxxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 23 Mar 2006 22:53:41 +1100
[Warning: this posting is long and painfully technical. In fact it
wouldn't even belong on aue if it weren't for the fact that I know that
lots of people here do read the painfully technical stuff.]
Paul Gorodyansky wrote:
Peter Moylan wrote:
[...]
paulgor@xxxxxxxxxxxxxx wrote:
No, Thunderbird does NOT have such errors - please see my message
about non-Western texts
This still sounds strange (I did not see such a complaint in Russian
forums of Thunderbird users, for example, on this one:
http://forum.mozilla.ru/viewforum.php?id=7 ). I still think that it's
some of your settings (or lack of them) and not a software bug (have you
looked at my instructions?). For example, you may have Auto-Detection of
encoding turned ON, while (as I saw on forums) this feature is VERY buggy
in Internet Explorer and sometimes buggy in Mozilla products, too:
but in UTF-8 the Cyrillic characters might or might not turn into question marks, depending on luck. Usually they reappear when you do a followup to such a posting. Often the best way to read a UTF-8
message in Thunderbird is to do a "Reply", read the message, and then not send the reply.
Question marks are a very, very specific thing and have a very, very
specific cause - written down in my previous e-mail - or in my Outlook
Express instructions (same issue in Thunderbird and OE):
http://ourworld.compuserve.com/homepages/PaulGor/oe_e.htm#qm
I did look at your information, but for a couple of reasons it doesn't
apply to me. You're mainly talking there about working around bugs in
some Windows software to use the KOI8-R character set. I hardly ever
use Windows, and I almost never write anything in Russian. (The fact
that I recently took part in a thread that involved some examples in
Russian is a coincidence; in fact I know very little Russian.) My
interest in non-ASCII character sets comes partly from the fact that I
do need to write in French now and then - but for that Latin-1 covers
everything except for a couple of ligatures - partly from a hope that we
will some day be able to use genuine IPA in this newsgroup, but mostly
because I'm the author and maintainer of mail server software.
(As a side issue, I think it's a great pity that KOI8-R became the de
facto standard for Russian e-mail, because in that character encoding -
I think that's the troublesome one - the code for the letter 'ya' is a
Telnet control character. As a result, it is not possible to send
Russian e-mail through a mail server that conforms with the SMTP
standard. I had to add a "cripple Telnet" option to my server in order
to allow it to be used in Russia. I think Unix sendmail also solved
that problem by removing Telnet compatibility. For Windows servers it
was less of a problem because Microsoft doesn't believe in following
standards anyway. There was a better ISO encoding for Cyrillic that
avoided this bug, but for some reason it never became popular in Russia.)
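Since the parenthetical above is unsure which encoding is the
troublesome one, the claim can be checked directly. What follows is
only a minimal sketch: it assumes the Telnet control character in
question is IAC (byte 0xFF, per RFC 854), it relies on Python's own
codec tables, and the helper names are made up for illustration.

```python
# Telnet's "Interpret As Command" (IAC) marker is the byte 0xFF (RFC 854).
TELNET_IAC = 0xFF

# Python codec names for three common Cyrillic encodings.
CODECS = ["koi8_r", "cp1251", "iso8859_5"]


def ya_bytes(codec: str) -> bytes:
    """Encode the lowercase Cyrillic letter 'ya' (U+044F) in the given codec."""
    return "\u044f".encode(codec)


def collides_with_iac(codec: str) -> bool:
    """Return True if 'ya' encodes to the Telnet IAC byte in this codec."""
    return TELNET_IAC in ya_bytes(codec)


if __name__ == "__main__":
    for codec in CODECS:
        # Print the codec, the hex value of 'ya', and whether it hits 0xFF.
        print(codec, ya_bytes(codec).hex(), collides_with_iac(codec))
```

With Python's standard codec tables this reports the 0xFF collision
for cp1251 (Windows-1251) and not for koi8_r or iso8859_5, so it is
worth running the check before blaming a particular encoding.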
Now that I think of it, there's a third reason why your advice doesn't
apply to me. Until recently, I used real e-mail software rather than
using browser-based products for e-mail. The only reason why I switched
to Thunderbird is that the author of my previous e-mail program stopped
maintaining it, and the people he sold it to showed no real interest in
fixing the outstanding bugs. For a while Thunderbird looked pretty
good, on the whole, and by the time I discovered that it had a
fundamental design flaw I was pretty much committed. Changing mail
software is a painful job when you're trying to port about 20 years'
worth of old mail, address books, etc.
The design flaw comes from the fact that people who write web browsers
should never have tried to include an e-mail client as part of the
browser. They should instead allow linking to a genuine e-mail client
of the user's choice. I could insert here a rant about software design
courses that concentrate on coding and ignore design, but I'd better not
get sidetracked. This error was originally made by Netscape, and it
has been repeated by everyone who has copied the Netscape code without
understanding it.
(I'm not sure how Internet Exploder managed to duplicate the bugs in
Netscape, given that it came out before Netscape released its source
code. My guess is that IE was developed by Netscape and sold to
Microsoft in a secret deal. The alternative would be software piracy,
something that an ethical firm like M$ would never engage in.)
Why is this a problem? It's a problem because the web browser designers
never noticed an important difference between HTTP (the mechanism for
transferring web pages) and the e-mail standards. In HTTP you can
specify a language at the sending end and a preferred language at the
receiving end. In hindsight that too is a design error, because it
means that more than 95% of the world's languages end up in the "other
languages" category; but it's too late to change that aspect of the
standard. (It also means that languages like Russian and Chinese appear
as 7 or 8 different "languages", because of competing standards. That's
the main reason why Russian and Chinese users, and a few others, need
special instructions on how to configure their browsers, rather than
being able to just start the browser and use it.) Anyway, the
consequence is that in a web browser you talk about a "language" rather
than about a "character encoding".
In e-mail you have exactly the opposite situation. The way e-mail was
extended beyond 7-bit ASCII was via the MIME standards, plus the
definition of a whole bunch of ISO character sets. (Plus,
unfortunately, a few character sets that are not ISO-compatible; but at
least we know what they are, and we can assume that all the
non-Microsoft ones will remain stable over time.) The concept of
"language" is never mentioned. Instead, the MIME headers tell the
software what character encoding the sender is using, and it can use
that knowledge to give an appropriate display at the receiving end.
This worked out so well that newsreaders also adopted the MIME standard.
There are really only two situations where this goes wrong. One is
when the receiver doesn't have the right fonts installed (e.g. I don't
think I have any Korean fonts installed, but this hardly matters because
I can't read Korean anyway). The other is when the sender has a really
primitive mail agent (e.g. Outlook Exmess) that doesn't insert correct
MIME headers.
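To make the MIME mechanism concrete, here is a small sketch of a
receiver that takes the character encoding from the Content-Type
header and never asks about "language". It uses Python's standard
email package; build_message and decode_body are hypothetical helpers
written for this illustration, and the address is a placeholder.

```python
from email import message_from_bytes


def build_message(text: str, charset: str) -> bytes:
    """Hypothetical helper: build a tiny 8-bit MIME message in `charset`."""
    headers = (
        "From: sender@example.invalid\r\n"
        "Subject: charset test\r\n"
        "MIME-Version: 1.0\r\n"
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: 8bit\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + text.encode(charset) + b"\r\n"


def decode_body(raw: bytes) -> str:
    """Decode the body using only the charset named in the MIME header."""
    msg = message_from_bytes(raw)
    # Fall back to US-ASCII when no charset parameter is present.
    charset = msg.get_content_charset() or "us-ascii"
    payload = msg.get_payload(decode=True)  # raw body bytes
    return payload.decode(charset, errors="replace").rstrip("\r\n")


if __name__ == "__main__":
    word = "\u043f\u0440\u0438\u0432\u0435\u0442"  # a Russian word, as escapes
    for charset in ("koi8-r", "utf-8"):
        print(charset, decode_body(build_message(word, charset)) == word)
```

The same decoding path handles KOI8-R and UTF-8 alike, because the
only thing it consults is the declared character set.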
Summary so far: the Netscape designers could never get their e-mail
right because they thought of e-mail as a special sort of web page, and
they didn't give enough thought to the question of the ways in which
e-mail is different from web pages.
I think it was Netscape, too, who introduced that horrible mess called
HTML mail. Nobody seems to like receiving HTML mail, and nobody except
spammers and ignorant beginners sends it. Netscape's attitude seemed to
be "We've gone to a lot of trouble to develop an HTML rendering engine,
therefore you will bloody well use it whether you like it or not." This
sounds a bit like the stories about "choice" that used to circulate
during the worst of the Soviet era - and, for that matter, even in the
Hobson era. I've recently discovered, to my great dismay, that the
Thunderbird people have the same attitude. When you create an account
in Thunderbird, the sending options are set to "send HTML" - you don't
get to choose - so you have the extra overhead of having to go back and
turn it off again. Anyone just starting to use Thunderbird is put in
the embarrassing position of sending out unwanted HTML without realising
it. Apparently there have been many bug reports submitted about this,
but the Thunderbird response is always "won't fix". They've put all
their effort into HTML, so people who won't use it are punished by never
getting the bugs in the "plain text" section fixed.
One of the reasons why Mozilla became so popular so quickly was that the
Mozilla team promised to split the huge monster into separate web
browser and mail program and newsreader products. People had wanted
this for a long time, for various reasons. The main reason was, I
think, "Why should I have the overhead of loading a huge dinosaur of a
web browser when all I want to do is read my e-mail?" People didn't
want a massively huge program that did web browsing AND news AND mail
AND (for all I know) polishing their shoes. It went against all the
principles of clean program design.
Thunderbird is part of the result. But, strangely enough, it didn't get
significantly smaller, which was supposed to be the whole point. The
reason, quite simply, is that Thunderbird is still at heart a web
browser. It doesn't do web browsing, but it contains most of the web
browsing code, to handle things like the hated HTML mail. And the
people who are developing Thunderbird are still thinking in web browser
terms.
Let me prove it to you. Look at the configuration page for setting
fonts (it's hard to find, but you'll get there if you're persistent) and
tell me how to set the fonts to be used for Unicode. It's not there.
It's not there?? No, because those font options are organised in terms
of language rather than in terms of MIME character set. The concept of
"language" doesn't exist in the mail standards, but it does in the
Thunderbird settings because that's how Netscape and Mozilla did it.
At least Thunderbird gets it right for outgoing mail (except for buggy
line wrapping). I'm composing this message in Latin-1, but if I
inserted a Cyrillic character I would get a message saying "That
character does not exist in this character set. Do you want to switch
to UTF-8?" UTF-8 is probably not the best choice, but at least it's a
correct choice that ensures that all my characters are encoded correctly.
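The outgoing behaviour described above amounts to "keep the preferred
character set if every character fits, otherwise switch". A minimal
sketch of that rule follows; this is not Thunderbird's actual code,
and choose_charset is a name invented for the illustration.

```python
def choose_charset(text: str, preferred: str = "iso-8859-1") -> str:
    """Return `preferred` if it can encode all of `text`, else 'utf-8'."""
    try:
        text.encode(preferred)
    except UnicodeEncodeError:
        # At least one character does not exist in the preferred set.
        return "utf-8"
    return preferred


if __name__ == "__main__":
    print(choose_charset("caf\u00e9"))          # accented Latin-1 text
    print(choose_charset("\u0153uvre"))         # the oe ligature
    print(choose_charset("caf\u00e9 \u044f"))   # Latin-1 plus one Cyrillic letter
```

A real mail program would ask before switching, as the dialogue
quoted above does; the sketch only shows the decision itself.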
For incoming mail there's still a bug. If the MIME header says KOI8-R,
the mail reader says to itself "Aha, that's one of the Russian
encodings. I can handle that." But if the header says UTF-8, the mail
reader can't figure out what language UTF-8 is, so it tries to use my
default of Latin-1, and replaces all the "invalid" characters with
question marks. Those characters are still there, because they reappear
if I quote them in my reply; but the program can't show them because it
only knows what a "language" is, and doesn't think in terms of "MIME
character set".
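The symptom can be reproduced in a few lines. This sketch assumes the
diagnosis above is right, namely that correctly decoded text gets
forced into a Latin-1 display path; it is not taken from the
Thunderbird source, and render_as is a hypothetical name.

```python
def render_as(raw: bytes, declared: str, displayable: str) -> str:
    """Decode `raw` as `declared`, then squeeze it into `displayable`.

    Characters that do not exist in `displayable` come out as '?'.
    """
    text = raw.decode(declared)
    return text.encode(displayable, errors="replace").decode(displayable)


if __name__ == "__main__":
    word = "\u043f\u0440\u0438\u0432\u0435\u0442"  # six Cyrillic letters
    raw = word.encode("utf-8")

    # Forced through Latin-1, every Cyrillic letter becomes a question mark.
    print(render_as(raw, "utf-8", "iso-8859-1"))

    # The original bytes are untouched, so a proper decode still works.
    print(raw.decode("utf-8") == word)
```

The second check matches the observation that the characters reappear
when quoted in a reply: nothing was lost from the message itself, only
from the display.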
Anyway, sorry for the long rant. I feel better now.
--
Peter Moylan http://www.pmoylan.org
Please note the changed e-mail and web addresses. The domain
eepjm.newcastle.edu.au no longer exists.
My e-mail addresses at newcastle.edu.au will probably remain "live"
for a while, but then they will disappear without warning.
The optusnet address still has about 5 months of life left.