Re: RfD - Escaped Strings (long)
- From: stephenXXX@xxxxxxxxxxxx (Stephen Pelc)
- Date: Mon, 21 Aug 2006 22:54:35 GMT
On 21 Aug 2006 15:10:45 -0700, "Alex McDonald"
<alex_mcd@xxxxxxxxxxxxxxx> wrote:
Way back in IBM360 BAL land, "abcd""def" was the answer; a double
doublequote collapsed to a single doublequote, and parsing of the string
continued. Could S" be extended to accept "" without breaking existing
code?
Has it any common practice in Forth?
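For readers unfamiliar with the doubled-quote convention, here is a minimal sketch of the parsing rule described above. It is written in Python rather than Forth purely for illustration; the function name `parse_doubled_quote` is hypothetical and does not come from any Forth system or from the RfD.

```python
def parse_doubled_quote(source, start=0):
    """Parse string text starting at source[start], just after the opening quote.

    A pair of doublequotes collapses to one doublequote and parsing continues;
    a single doublequote ends the string.
    Returns (text, index just past the closing quote).
    """
    out = []
    i = start
    while i < len(source):
        ch = source[i]
        if ch == '"':
            if i + 1 < len(source) and source[i + 1] == '"':
                out.append('"')   # "" collapses to a single "
                i += 2
                continue
            return "".join(out), i + 1   # lone " terminates the string
        out.append(ch)
        i += 1
    raise ValueError("unterminated string")


text, end = parse_doubled_quote('abcd""def" rest')
print(text, end)
```

Under this rule, any existing source in which a closing quote is immediately followed by another quote would be read differently, which is the compatibility question raised above.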
Why do we need two representations, both of variable length?
This proposal selects the hexadecimal representation, requiring
two hex digits. A consequence of this is that xchars must be
represented as a sequence of pchars. Although initially seen as a
problem by some people, it avoids the endian problems involved
in storing an xchar.
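A rough sketch of the fixed two-hex-digit rule, in Python for illustration only. The `\x` spelling of the escape and the name `expand_hex_escapes` are assumptions made for this example. The point is that a wide character is written as the sequence of octets of whatever encoding the author chooses, so no byte order is implied by the escape itself.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def expand_hex_escapes(text):
    """Expand \\xHH escapes (exactly two hex digits) into a bytes object.

    Assumes everything outside the escapes is plain ASCII.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] == "x":
            digits = text[i + 2:i + 4]
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("\\x needs exactly two hex digits")
            out.append(int(digits, 16))   # one escape yields one octet
            i += 4
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


# The euro sign U+20AC written as its three UTF-8 octets.
octets = expand_hex_escapes(r"A\xE2\x82\xACB")
print(octets, octets.decode("utf-8"))
```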
Here I would propose
\unnnn
and
\Unnnnnnnn
for UTF-16 and UTF-32 support. Python, IIRC, supports this construct. It
avoids any ambiguity over endianness.
What terminates it? If you want, say, '00' immediately after
\Uxxxxxx, do you write \Uxxxxxx00? I believe that to be
ambiguous. Variable-length extensions without a terminator
are dangerous!
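To make the concern concrete, here is a small Python sketch comparing two readings of `\U`. Both functions are hypothetical illustrations, not part of any proposal: `expand_fixed` requires exactly eight hex digits, while `expand_greedy` models a variable-length rule that swallows every hex digit it finds, up to eight.

```python
import re


def expand_fixed(text):
    """\\U is followed by exactly eight hex digits."""
    return re.sub(r"\\U([0-9A-Fa-f]{8})",
                  lambda m: chr(int(m.group(1), 16)), text)


def expand_greedy(text, max_digits=8):
    """Hypothetical variable-length rule: \\U takes 1..max_digits hex digits."""
    return re.sub(r"\\U([0-9A-Fa-f]{1,%d})" % max_digits,
                  lambda m: chr(int(m.group(1), 16)), text)


# Intended result: U+00E9 followed by the two literal characters "00".
wanted = "\u00e9" + "00"

print(expand_fixed(r"\U000000e900") == wanted)   # fixed width: digits 9-10 stay literal
print(expand_greedy(r"\Ue900") == wanted)        # greedy: "00" is absorbed into the code point
```

With the fixed-width rule the trailing "00" is unambiguous. With the greedy rule there is no way to write the intended text without some terminator after the digits.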
The use of hex characters is not just to provide wide
character support, but also to allow insertion of control
codes into comms channels, e.g. Telnet IAC handling.
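As an illustration of the Telnet case, the sketch below uses Python byte strings, whose `\xHH` escapes also take exactly two hex digits. The command values (IAC = 255, WILL = 251, option ECHO = 1) are those of the Telnet protocol; the helper name `escape_iac` is made up for this example.

```python
IAC, WILL, ECHO = 0xFF, 0xFB, 0x01

# "IAC WILL ECHO" written directly as three escaped octets.
negotiation = b"\xff\xfb\x01"


def escape_iac(payload):
    """Double any data octet equal to IAC so it is not read as a command."""
    return payload.replace(b"\xff", b"\xff\xff")


print(list(negotiation) == [IAC, WILL, ECHO])
print(escape_iac(b"a\xffb"))
```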
Anton is pushing hard for UTF-8 support. I argue that separated
octets support UTF-8/16/32 without any required changes.
Another advantage of the octet approach is that it enables
16 bit embedded systems to support characters of any size,
including those wider than a cell. With UTF-8 this is required
even on a 32 bit Forth.
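A short Python sketch of the octet view, for illustration only. The helper `octets` is hypothetical. It lists the octets one would write, two hex digits each, for a single code point that does not fit in 16 bits, under three encodings. The big-endian codec names are chosen here so that the byte order is explicit in the output and no byte-order mark is emitted.

```python
def octets(ch, encoding):
    """Return the octets of ch in the given encoding as two-digit hex strings."""
    return [f"{b:02X}" for b in ch.encode(encoding)]


clef = "\U0001D11E"            # MUSICAL SYMBOL G CLEF, above U+FFFF

print(ord(clef) > 0xFFFF)      # the code point does not fit in a 16 bit cell
for enc in ("utf-8", "utf-16-be", "utf-32-be"):
    print(enc, octets(clef, enc))
```

Each entry is a single octet, so the same two-hex-digit escape can carry any of the three encodings, and the writer decides the byte order by the order in which the octets are written.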
Stephen
--
Stephen Pelc, stephenXXX@xxxxxxxxxxxx
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads