Re: Representing futuristic English



--
Wilson Heydt:
> Not only is there a lot of material now that isn't in
> HTML, but is in older, proprietary formats, but the
> assumption going forward about ASCII is very likely
> false given some general moves in the direction of
> unicode.

If I load up an ascii file in a unicode editor, it is
usually nearly one hundred percent readable, except for a
few glitches which I can guess from context. (Unicode,
when encoded as UTF-8, looks very much like ascii, and a
unicode editor always tries to guess which of the four
UTF encodings is in use. Since ascii looks much like
UTF-8, it guesses UTF-8, which is incorrect, but close
enough: seven-bit ascii is byte-for-byte identical in
UTF-8, and only eight-bit characters come out wrong.)
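
To make the "close enough" point concrete, here is a minimal sketch of what a UTF-8 guess does to ascii input. The function name and the sample strings are made up for illustration, and the eight-bit sample assumes a Latin-1 style file, which is only one of several possible "extended ascii" flavours.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, marking undecodable bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Pure seven-bit ascii: every byte is below 0x80, so UTF-8 reads it unchanged.
plain = b"Representing futuristic English"

# Hypothetical eight-bit sample: 0xEF and 0xE9 are Latin-1 accented letters,
# but they are not valid UTF-8 on their own, so they become glitches.
legacy = b"na\xefve caf\xe9"

if __name__ == "__main__":
    print(ascii(decode_as_utf8(plain)))
    print(ascii(decode_as_utf8(legacy)))
```

The seven-bit text survives untouched. Each stray eight-bit byte turns into a single replacement character, which is the sort of glitch a reader can fill in from context.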

Your newsreader will probably save this message in
either ascii or unicode. If in ascii, read it in
unired, a unicode editor. If in unicode, read it in
notepad, an ascii editor. It will be almost wholly
readable either way. Suppose a moderately competent
engineer had used only unicode tools all his life and
had never heard of ascii. If you gave him an ascii file
and he noticed something was wrong, it would take him
about half an hour to whip up an ascii to unicode
translator.
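
A half-hour translator of that kind could look something like the sketch below. It assumes the target is UTF-16 with a byte order mark, and that any byte outside the seven-bit range should become U+FFFD. Both choices are assumptions made for the example, and the function name is made up.

```python
def ascii_to_utf16(raw: bytes) -> bytes:
    """Translate ascii bytes to UTF-16 (with BOM, native byte order).

    Each byte below 0x80 maps to the code point with the same number.
    Anything else is not ascii, so it becomes U+FFFD rather than a guess.
    """
    chars = [chr(b) if b < 0x80 else "\ufffd" for b in raw]
    return "".join(chars).encode("utf-16")


if __name__ == "__main__":
    # Prints the BOM followed by two bytes per character.
    print(ascii_to_utf16(b"Hi"))
```

The whole job is one table lookup per byte, because the first 128 unicode code points were deliberately given the same numbers as ascii.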

If UTF-8 had vanished from use, he would see gibberish.
He would then bring it up in a binary editor, *guess*
the ascii encoding after a few moments of thought, and
in about an hour whip up an ascii to unicode translator.
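
The binary-editor step can be sketched as a hex dump plus a crude test for "one byte per character, nearly all in the printable range". The 0.95 threshold and both function names are arbitrary choices for illustration, not an established detection method.

```python
def hex_dump(raw: bytes, width: int = 16) -> str:
    """Render bytes the way a binary editor would: offset, hex, text."""
    lines = []
    for offset in range(0, len(raw), width):
        chunk = raw[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


def looks_like_ascii(raw: bytes, threshold: float = 0.95) -> bool:
    """Guess whether raw is ascii text: most bytes printable, tab, LF or CR."""
    if not raw:
        return False
    printable = sum(
        1 for b in raw if 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)
    )
    return printable / len(raw) >= threshold


if __name__ == "__main__":
    sample = b"If UTF-8 had vanished from use\n"
    print(hex_dump(sample))
    print(looks_like_ascii(sample))
```

Once the dump shows one small number per letter, with 0x20 wherever a space ought to be, the rest of the table falls out of letter frequencies.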

--digsig
James A. Donald
6YeGpsZR+nOTh/cGwvITnSR3TdzclVpR0+pr3YYQdkG
ie9c/HV4EEJuQFClhyhRuz+lQkcnL66u69nkGAZq
4Uz5X3poKbkblIgOvqDFjq+6BQr2MaYtW6o9Ip8X7


--
http://www.jim.com


