Re: RfD: XCHAR wordset (Version 3)
- From: Bernd Paysan <bernd.paysan@xxxxxx>
- Date: Thu, 27 Nov 2008 11:14:59 +0100
Anton Ertl wrote:
>> Hm, what about "distributing an internationalized program as source code"?
>> How do you do that when you don't know what kind of charsets the systems
>> use?
> As long as the program does not contain non-ASCII data, it does not
> need to know the encoding of the data it processes in order to work.
Any internationalized and localized program *will* contain non-ASCII data:
the strings for the translated texts.
Changing encodings is messy. The e-mail case you mentioned allows changing
encodings; there is even an RFC that defines all the available encodings. It
is messy, which is why the W3C dropped support for many different encodings
in XHTML (HTML still supports them).
Let's rephrase it:
* A standard system can provide ways to change the internal and external
encoding, to support legacy applications.
* A legacy application that uses one or several non-Unicode encodings is not
a standard program. This doesn't matter; it never has been one. It can use
the XCHAR words to deal with the non-Unicode encodings if the vendor provides
an extension to change the encoding, and use this in a transition phase
before converting the data to UTF-8.
* How to deal with multiple different encodings is outside the scope of the
current xchar proposal. IMHO this probably should stay outside the scope of
any standard, because different systems might have different legacy
requirements.
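To make the "transition phase before converting the data to UTF-8" concrete, here is a minimal sketch in C. It assumes the legacy encoding is ISO-8859-1 (Latin-1), which is only one possible case, and the function name latin1_to_utf8 is made up for illustration; it is not part of the XCHAR proposal or of any Forth system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert len bytes of ISO-8859-1 text in src to
   UTF-8 in dst.  dst must have room for 2*len bytes, because every
   Latin-1 byte >= 0x80 becomes a two-byte UTF-8 sequence.
   Returns the number of bytes written to dst. */
size_t latin1_to_utf8(const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII is identical in both encodings */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF: 110000xx 10xxxxxx */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Other legacy encodings need a lookup table instead of this bit shuffling, which is one reason the general case is hard to standardize.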
> Not on the Unix systems I use. I set LANG=C, and whenever I work on
> an account that doesn't, something fails pretty soon and reminds me to
> set LANG=C there, too.
Hm, setting LANG=C makes your local terminal UTF-8 unaware. Certainly things
break when you log into a system which assumes UTF-8. I usually only set
LC_NUMERIC=C, and leave LANG=de_DE.UTF-8 (that fixes most of the annoying
problems, like programs printing into PostScript files using German number
conventions). Gforth's io.c sets the LC_NUMERIC locale to C internally for
exactly this reason (otherwise f. would print 123,456 instead of
123.456).
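The LC_NUMERIC trick can be sketched in C roughly as follows. This is not the actual code from Gforth's io.c; the helper names init_locale and format_float are invented for illustration, and only the standard setlocale() and snprintf() calls are assumed.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: take over the user's locale (e.g. de_DE.UTF-8)
   for everything except number formatting. */
void init_locale(void)
{
    setlocale(LC_ALL, "");       /* adopt LANG / LC_* from the environment */
    setlocale(LC_NUMERIC, "C");  /* but keep '.' as the decimal point */
}

/* Hypothetical helper: format x with three decimals into buf.
   Returns what snprintf() returns. */
int format_float(char *buf, size_t size, double x)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%.3f", x);
}
```

After init_locale(), format_float() on 123.456 yields "123.456" regardless of LANG; without the second setlocale() call, a German locale would typically give "123,456".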
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/