locale (was: Accented characters in 'less' and vim)



On Thu, 06 Apr 2006 12:35:15 +0100, Dave Love
<fx@xxxxxxxxxxxxxx> wrote:
: Maybe, but I'm responding about UK usage.

I know; I would think it was pretty analogous though.

: I'd have thought most
: people in .no would want no_NO,

I have always wondered what difference that would really make.
Most people would use both languages interchangeably, and I have
not yet seen a reason to reset the environment all the time.

Besides, it would be pretty hard to do, for instance between
different messages in the MUA.

Is there any reason that I have missed?

Yeah, I see there is some reason when using e.g. OpenOffice, but then
the locale is really a property of the document, and not of
the system, hence the correct locale should not be taken from
the environment. (I have to remember to look up how to change
language for the entire OO document, I never found it and have
not used it that much either.)

Is there a good howto for multilingual use?
The only texts I have seen focus on how to use a different
language instead, not how to use several languages.

: and en_US.iso88591 isn't a valid GNU
: locale.

So it should be .ISO-8859-1, then? I always thought the locale did
not do what it should have done... maybe this is why.
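
Rather than guessing at the spelling, one can ask the C library
whether it accepts a given name. A minimal Python sketch, assuming a
glibc-style setlocale() that rejects unknown names; the helper name
locale_is_available and the candidate names are illustrations only,
and the answers depend on which locales the machine has generated:

```python
import locale


def locale_is_available(name):
    """Return True if setlocale() accepts `name` for LC_CTYPE."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # put things back


# Candidate spellings to try; results differ from system to system.
for candidate in ("en_US.iso88591", "en_US.ISO-8859-1", "en_GB.UTF-8", "C"):
    print(candidate, locale_is_available(candidate))
```

Only LC_CTYPE is probed here, since that is the category that
governs character encoding.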

This has caused a lot of headaches; the particular computer I am
posting from is UTF-8 by default, but my terminal is on a .uk
box with Latin-1 (and the X server is on a Windows box, which
presumably has its very own character encoding). I have also
worked the other way around, on a UTF-8 terminal against a Latin-1
or -15 remote computer. There is always some software which I cannot
get to work, the MUA being particularly troublesome.

Would anyone know how this should be done to be failsafe and
technically correct? For instance to get the MUA to show
characters correctly from a UTF-8 message on a Latin-1 xterm
(considering characters existing in both sets). Which programs
need which locale?
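
The conversion itself, leaving aside how any particular MUA is
configured, can be sketched in a few lines of Python. This is only an
illustration of the recoding step: the function name utf8_to_latin1
is made up, and characters with no Latin-1 equivalent are simply
replaced by "?".

```python
def utf8_to_latin1(raw):
    """Recode UTF-8 bytes as Latin-1 bytes for an ISO-8859-1 terminal.

    Malformed input and characters outside Latin-1 both end up as "?".
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("latin-1", errors="replace")


# Characters present in both sets survive; the euro sign does not.
message = "bl\u00e5b\u00e6rsyltet\u00f8y, \u20ac5".encode("utf-8")
latin1_bytes = utf8_to_latin1(message)
```

The same effect is available outside Python from iconv(1), which
converts between named encodings.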

: > Shouldn't $LANG always include an encoding definition?
:
: No. It should be the name of a locale that exists on the system (for
: which there is no standardization as far as I know). On a GNU system
: you probably have the complete list in /usr/share/i18n/SUPPORTED,

Unfortunately, this is not a probable computer; that directory contains
only charmaps/ and locales/, and uname calls it GNU/Linux.
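
Where the SUPPORTED file is missing, the locales actually installed
can usually still be listed with "locale -a". A small Python sketch
that wraps the command, assuming the locale utility is on the PATH
(the function name installed_locales is made up, and an empty list is
returned if the command is absent):

```python
import shutil
import subprocess


def installed_locales():
    """Return the names printed by `locale -a`, or [] if unavailable."""
    if shutil.which("locale") is None:
        return []
    result = subprocess.run(["locale", "-a"], capture_output=True, check=False)
    # Some locale names are not plain ASCII; replace what cannot be decoded.
    return result.stdout.decode("ascii", errors="replace").split()


names = installed_locales()
print(len(names), "locales installed")
```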

: > (I was told that the _US definition was more complete than the
: > _GB one,
:
: That makes no sense.

Does it make no sense on Linux, or no sense at all in any context?
It is possible that the advice was based on Solaris, or even
a more heterogeneous legacy network.

I'll make sure that I change to _GB as soon as I start changing things.

: (Note that there is a lot more to a locale than
: the character encoding.)

Well, to most programs there isn't. I know, though, that to some
programs there should be a lot more. (That is, of course, among
the programs I actually use.)

: LC_CTYPE is more specific. See locale(1).

Thanks.


--
:-- Hans Georg http://www.ii.uib.no/~georg/

`This Universe never did make sense; I suspect that it was built
on government contract.' (Heinlein)


