Re: ASCII convention



Peter T. Daniels:

> But people routinely use the Unicode list for information about writing
> systems.

They routinely *abuse* the Unicode list for this purpose.

Even without any errors, Unicode is not suitable as a source of
information about writing systems. Different writing systems share
characters; the shared characters are then the same (only one Unicode
code point), and yet their significance for the writing and reading
process is entirely different. Unicode in no way specifies the latter; it
only defines what the characters are and how they may be represented as
sequences of codes.
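
A minimal sketch of that point in Python, using only the standard
`unicodedata` module. The two sample words (English "cat", Turkish "cam")
and the helper name `describe` are illustrative assumptions, not taken
from the discussion. The letter is read quite differently in the two
orthographies, yet the character data offers only identity and encoding:

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect what the character database and the encodings say about one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "utf8": ch.encode("utf-8").hex(" "),
        "utf16": ch.encode("utf-16-be").hex(" "),
    }

# Hypothetical sample words: the initial letter is pronounced differently
# in the two orthographies, but it is one and the same character.
english_word = "cat"
turkish_word = "cam"

same_character = english_word[0] == turkish_word[0]
info = describe(english_word[0])

print(same_character)
print(info)
```

Nothing in the returned fields says how the letter functions in either
orthography; that information has to come from a description of the
writing system itself.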

Moreover, a technical standard is hardly ever a good source of
information, except on the standard itself. If someone wants a list of
French départements, he should search for exactly that and not for a list
of French license-plate codes or postal (ZIP) codes, even though both
contain the number of the département.

> How are they supposed to know when it preserves legacy errors?

This is a problem with *all* standards. Once the problem is understood, it
is too late for a standard. As long as the problem is not understood well
enough, errors will creep in and can hardly be ironed out.

One solution is to include the version number of the standard in every
instance of its application. For instance, XML lets a document begin with
a declaration of the version of the standard to which the document is
meant to adhere. With character codes, however, this option falls flat
("The next letter is coded according to Unicode 3.0.2: 'A'").
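
The contrast can be sketched in Python. The sample document and the
variable names are illustrative assumptions; `unicodedata.unidata_version`
is the version of the character database shipped with the interpreter,
not a property of any text:

```python
import unicodedata

# Hypothetical sample data: an XML document announces its version in-band,
# a plain string of characters has no place to do so.
xml_document = '<?xml version="1.0" encoding="UTF-8"?><note>A</note>'
plain_text = "A"

# The version label travels with the XML document itself.
declares_version = xml_document.startswith('<?xml version="1.0"')

# The character carries only its code point; no version is attached to it.
code_points = [ord(ch) for ch in plain_text]

# The only version available belongs to the library doing the processing.
library_unicode_version = unicodedata.unidata_version

print(declares_version, code_points, library_unicode_version)
```

Whatever version the reader's software happens to implement is the one
that gets applied, regardless of which version the writer had in mind.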

> It's almost worse than wikipedia, because it comes with a presumption of
> authoritativeness.

It is authoritative as a standard for character codes, but not as a
specification of writing systems.

If Unicode works as designed, you can use it in the description of writing
systems, but it is not meant as a replacement for such a description.

Helmut Richter