Re: RfD: c-addr/len



anton@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl) writes:

Aleksej Saushev <asau@xxxxxxxx> writes:
anton@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl) writes:

Aleksej Saushev <asau@xxxxxxxx> writes:
anton@xxxxxxxxxxxxxxxxxxxxxxxxxx (Anton Ertl) writes:
So, you use CHARS to convert from number of chars to number of aus.
Then I wonder why you are asking us to get rid of CHARS
<87zl8oof1v.fsf@xxxxxxxx>.

I'm asking you to state your goals clearly: you should take one way
or the other, but not both. One way is to support CHARS and let it
have an arbitrary value (introducing communication-level concepts like
"octet"); the other way is to remove it. As of now, you don't seem to
have a strong position on what the practical value of CHAR is.

I guess you mean CHARS, right? I have already given my position on
that <2009Sep21.133336@xxxxxxxxxxxxxxxxxxxxxxxxxx>:

|[1 CHARS = 1 is] common practice, and hopefully someone will work
|out a proposal to standardize that.

Using CHARS in a program has no practical value. Implementing CHARS
in a system has the practical value of supporting those few programs
that actually use CHARS.

Now you tell me that having one program to support text, rather than
several for regular text and for UTF-8, is impractical.

No, I don't tell you that. What I tell you is written above.

It is the same thing: you tell me that there's no practical value.
What do you count as practical value at all, if not this?

JFYI, of several major sites I've just probed, only two use UTF-8,
others use different kinds of unioctet Russian Cyrillic.
Thus you still have to recode text you receive during communication.

And this has what to do with this discussion?

This voids your argument about what is impractical.

I don't understand where you take your phantasies from.

I don't know what phantasies you are referring to.

Your phantasies about practical side of CHARS.

Show me how you derive the need for your beloved UTF-8 from unioctet
encodings and MIME, when common practice is to use MIME.

Common practice where? Certainly not in Forth.

ASCII-compatible 8-bit encodings are compatible with Forth-94 and with
the xchars proposal; the reference implementation of xchars contains
an implementation for 8-bit encodings.

Again. Show me how the need for _non-uniform_ length UTF-8 arises from
_uniform_ unioctet encoding, which was common practice before UTF-8
acceptance (leaving the numbers for acceptance outside the scope of
discussion).

Contrary to what you say, your xchars are _incompatible_ with Forth-94:
they cannot be copied with CMOVE ("copy u consecutive characters"),
they cannot be read with READ-FILE ("read u1 consecutive characters"),
you cannot do anything with them unless you rely on the low-level
representation. How does that make xchars compatible?


Thus what we see for now: you design a standard, and you make another
series of changes that make the new standard largely incompatible with
the previous one. What makes it horrible is that you are rationalizing
your phantasies about what is practical and what is not. You fail to
demonstrate the necessity of reverting a standard feature, and thus
assert without any support that it is impractical. You demonstrated your
perfect knowledge of common practice in other programming languages
pretty well. So, would you (and Berndt too) be kind enough to retract
your character sets proposals or stay away from the standards process
altogether until you (either of you) publish a survey to support your
point? I question your expertise in this domain.

This is to keep you from making delusional points in future: you both
come from Latin-writing countries, you hardly have enough experience
with localization issues, and even more so, Germans are well known for
their attempts to strip their umlauts and convert to bare Latin. And
we're not going to convert to another encoding just because it is
proclaimed a standard; we had this experience with ISO/IEC 8859-5.


Another thing is more procedural.

From my side I see that you're in such haste to get a new standard out
that you forget about the goals of standardizing. What do you aim for?

The previous standard was controversial, and the only reason it gained
acceptance is that there had been no other standard for too long. Now
you seem to be trying to get a modern one out as soon as possible at any
price. If you start a controversy yourself, it doesn't matter, "make
haste." If you revert previously standardized practice, it doesn't
matter, "I don't know anyone using it, no practical value, make haste."

I _am_ purely practical with regard to Forth. I don't have much time to
participate in the standards process; I do use and want to continue
using Forth in practice, but this is becoming tiresome. You started the
FORGET controversy by removing it unilaterally from gforth without
paying any attention to whether anyone uses it; I had to adapt a good
deal of code and my work process to that. Now you want to strip it from
the standard despite voices contra. Sure, of course, "make haste," 2009
is coming to an end.

I want to remind you that you misdesigned MARKER in gforth, and this
made me waste about a week finding where the actual bug lay. Thus you
can't argue on the MARKER vs. FORGET issue, because you don't use either
of them.

Now you're going to do the same with CHARS, another feature I actively
use. What the hell? Can you play somewhere else? I use it consistently
with other languages I use, "text is text, raw data is data, data may
be text but not necessarily so."


I've just realized one more point against your character sets proposals.
You assert that your "xchars" are compatible with regular CHARS or, to
be more precise, octets. Then dump this *** altogether, it doesn't
belong in the standard. Provide it as a public domain library. If we
find many users after 5 years of usage, we may return to standardizing
it. Standardize those features that may lead to incompatibilities. Oh,
you will need octet access for your UTF-8 library. This is the very part
to standardize, because it is useful to more than just UTF-8 lovers and
may lead to incompatibilities, since 1 CHARS may already be more than 1,
and it's been so for 15 years.


--
HE CE3OH...

