Re: Representing futuristic English
- From: kaih=9dBx7yWHw-B@xxxxxxxxxxxxxxxxx (Kai Henningsen)
- Date: 03 Sep 2005 16:44:00 +0200
jamesd@xxxxxxxxxxx (James A . Donald) wrote on 26.08.05 in <pq5vg1h5ko4dclouoja768psdkavkr5hsp@xxxxxxx>:
> Wilson Heydt:
> > Not only is there a lot of material now that isn't in
> > HTML, but is in older, proprietary formats, but the
> > assumption going forward about ASCII is very likely
> > false given some general moves in the direction of
> > unicode.
>
> If I load up an ascii file in a unicode editor, it is
> usually near hundred percent readable, except for a few
> glitches which I can guess from context. (Unicode, when
> encoded as UTF-8 looks very much like ascii, and a
> unicode editor always tries to guess which of the four
> UTF encodings are in use. Since ascii looks much like
> UTF-8, it guesses UTF-8, which is incorrect, but close
> enough.)
Actually, ASCII is exactly a subset of UTF-8, so if this process produces
glitches, you didn't have ASCII to start with. (That was a design
principle of UTF-8 - i.e., this is not an accident.)
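
The subset claim is easy to check. A minimal sketch in Python (the variable names are illustrative only), decoding every 7-bit ASCII byte value both ways and comparing:

```python
# Every 7-bit ASCII byte value, 0x00 through 0x7F.
ascii_bytes = bytes(range(128))

as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# Decoding as UTF-8 gives exactly the same text as decoding as ASCII.
same_text = as_ascii == as_utf8

# Encoding that text as UTF-8 gives back the original bytes,
# one byte per character.
round_trip = as_utf8.encode("utf-8") == ascii_bytes

print(same_text, round_trip)
```

If both comparisons hold, a file that shows glitches when read as UTF-8 must contain bytes above 0x7F, so it was not pure ASCII.
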
Maybe you are thinking of Latin-1, which was a different scheme to extend
ASCII and hence is compatible with UTF-8 only insofar as the text is in the
common subset, i.e., ASCII.
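
A small Python sketch of where the two schemes part ways, using the sample word "café" (chosen here for illustration, it does not come from the thread):

```python
text = "café"

latin1 = text.encode("latin-1")  # é becomes the single byte 0xE9
utf8 = text.encode("utf-8")      # é becomes the two bytes 0xC3 0xA9

# Latin-1 bytes above 0x7F are generally not valid UTF-8.
try:
    latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

# The ASCII part of the text is byte-for-byte identical in both.
ascii_prefix_matches = latin1[:3] == utf8[:3] == b"caf"

print(latin1, utf8, latin1_is_valid_utf8, ascii_prefix_matches)
```

The "glitches" described above are what one would expect from this mismatch: the ASCII letters survive, and only the accented characters go wrong.
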
> If UTF-8 had vanished from use, he would see gibberish.
> He would then bring it up in a binary editor, and would
> at once *guess* the ascii encoding, after a few moments
> of thought, and in about an hour whip up an ascii to
> unicode translator.
Especially as the ASCII encoding would look quite familiar, just using fewer
00 bytes. This, incidentally, is true of Latin-1 as well, as *Unicode* was
designed to keep the numbering (but not the encoding) of everything in
Latin-1.
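
A hedged Python sketch of both points, assuming the "00 bytes" remark refers to a 16-bit encoding such as UTF-16 (the post does not name one):

```python
# Numbering: each Latin-1 byte value equals the Unicode code point
# of the character it stands for.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")
numbering_kept = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))

# Encoding: the same code point is stored differently per encoding.
e_acute_latin1 = "é".encode("latin-1")    # one byte, 0xE9
e_acute_utf8 = "é".encode("utf-8")        # two bytes, 0xC3 0xA9
e_acute_utf16 = "é".encode("utf-16-le")   # 0xE9 followed by 0x00

# ASCII text in UTF-16 is the ASCII bytes with 00 bytes interleaved.
utf16 = "ASCII".encode("utf-16-le")
without_zero_bytes = utf16.replace(b"\x00", b"")

print(numbering_kept, utf16, without_zero_bytes)
```

Dropping the 00 bytes from the UTF-16 form leaves plain ASCII, which is why someone familiar with one would recognise the other in a binary editor.
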
Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
- Russ Allbery (rra@xxxxxxxxxxxx)