Re: Sets and portability (was) Re: Is ISO Pascal compatible with J&W (original) Pascal ?
- From: Marco van de Voort <marcov@xxxxxxxx>
- Date: Fri, 1 Jul 2005 09:39:52 +0000 (UTC)
On 2005-06-30, Jason Burgon <gvision@xxxxxxxxxxxx> wrote:
>> > what Delphi (and FPC, and GPC?) had to do with strings. The principle
>> > is just the same.
>>
>> The difference is that strings become large only if the user
>> explicitly puts long data in them which doesn't normally happen
>> accidentally, whereas, say, a set of all Unicode letters implicitly
>> requires more space (probably in any representation as it's rather
>> irregular) than in a 7/8 bit charset.
I agree with Scott that a set of <16-bit> is probably still doable as a plain
bitmap; a set of <32-bit>, of course, must be sparse.
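To put numbers on that, here is a small sketch of the arithmetic for a plain
linear bitmap (one bit per possible member). It assumes nothing about any
particular compiler's set layout; the function name is made up for illustration.

```python
# Size of a plain linear bitmap (one bit per possible member) for several
# base-type widths. Pure arithmetic; no compiler-specific layout is assumed.

def bitmap_bytes(bits_in_base_type: int) -> int:
    """Bytes needed to hold one bit for every value of the base type."""
    return (1 << bits_in_base_type) // 8


for width in (8, 16, 32):
    print(f"set of {width}-bit: {bitmap_bytes(width)} bytes")
```

So 8 KiB per 16-bit set is tolerable, while 512 MiB per 32-bit set is not.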
> (1) The vast majority of code in any complex program is library code (be it
> your own or someone else's), and that needs to be as flexible as practical.
> So library code would be better if it could handle huge sets.
Agreed. That is the "new code" perspective.
> (2) The computer world is more complex than it's ever been (eg Unicode)
> and will just get more so. So why make life even more difficult for Pascal
> programmers by obsoleting their (eg: character) set library code? Again
> the Delphi WideString type is a good example of providing familiar
> mechanisms for dealing as seamlessly as is possible with the added
> complexity of the 21st century.
Clean code will indeed keep working, since it operates on the basis of char
tricks. A reference-counted system like ansistring ensures reasonable
performance even with existing code that is not that optimal.
The problem with widestring is that it differs between the Windows and Linux
editions of Delphi. On Windows it is a COM BSTR; in Kylix it is more like an
ansistring (but with 16-bit characters). So I wouldn't use it as an example,
unless you mean the Kylix version.
> (3) Like your ~average~ string, a clever huge set implementation (like mine
> ;-) of an ~average~ huge set is likely to be quite sparse or have large
> areas of contiguous members, and wouldn't therefore use up huge amounts of
> memory.
True. And refcounting (copy on write) would ensure that original code that
only reads a set, but somehow passes it by value, would still perform not too
shabbily.
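As a rough illustration of point (3), here is a minimal sketch of a set kept
as sorted, disjoint runs of contiguous members. It is not Jason's
implementation (which I haven't seen), the class and method names are invented,
and it leaves out refcounting and copy-on-write entirely.

```python
import bisect


class RangeSet:
    """Sparse integer set stored as sorted, disjoint, inclusive (lo, hi) runs."""

    def __init__(self, ranges=()):
        self._runs = []
        for lo, hi in sorted(ranges):
            if self._runs and lo <= self._runs[-1][1] + 1:
                # Overlapping or adjacent to the previous run: merge them.
                prev_lo, prev_hi = self._runs[-1]
                self._runs[-1] = (prev_lo, max(prev_hi, hi))
            else:
                self._runs.append((lo, hi))

    def __contains__(self, value):
        # Locate the last run whose start is <= value, then check its end.
        i = bisect.bisect_right(self._runs, (value, float("inf"))) - 1
        return i >= 0 and self._runs[i][0] <= value <= self._runs[i][1]

    def run_count(self):
        """Number of stored runs; memory use grows with this, not with range."""
        return len(self._runs)


# Uppercase letters of ASCII and Latin-1: three runs instead of a 256-bit map.
upper = RangeSet([(0x41, 0x5A), (0xC0, 0xD6), (0xD8, 0xDE)])
print(ord("A") in upper, ord("a") in upper, upper.run_count())
```

Membership is a binary search over the runs, so the cost depends on how
irregular the set is rather than on the width of the base type.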
> (4) Sure, a typical set of Unicode chars (say all uppercase characters) will
> likely use more memory than a set of 7/8bit char (but in my case, not that
> much more).
Unicode needs 32-bit storage per code point (the code space runs up to
U+10FFFF), though only on the order of 100000 code points are assigned. While
the 16-bit chars (the first 40-48k) are the most used, I wouldn't rule out the
code points beyond 16 bits in a new set/string design.
Such problems have been discussed in FPC circles before, and I think it will
eventually come to _three_ stringtypes UTF8/UTF16/UTF32 with autoconversions
between them and directives to alias one to default identifiers (now:
widestring, in the future maybe also string).
{$stringtype short/ansi/utf8/utf16/utf32/comstr}
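To show what the three encodings mean for storage, here is a small sketch
using Python's built-in codecs. It says nothing about how FPC would implement
these string types; the helper name and the sample text are just assumptions
for illustration.

```python
# Encode the same text as UTF-8, UTF-16 and UTF-32 and compare byte counts.
# The sample holds one ASCII char, one Latin-1 char and one char outside the
# 16-bit range (U+1D11E), which needs a surrogate pair in UTF-16.

def encoded_sizes(text: str) -> dict:
    sizes = {}
    for codec in ("utf-8", "utf-16-le", "utf-32-le"):
        data = text.encode(codec)
        # The conversion is lossless in every direction.
        assert data.decode(codec) == text
        sizes[codec] = len(data)
    return sizes


sample = "A\u00e9\U0001d11e"
print(encoded_sizes(sample))
```

All three round-trip the same code points; they differ only in the number of
bytes per character, which is why automatic conversion between them is safe.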
Note that ansi->wide conversion is codepage sensitive. I haven't reached a
conclusion on whether this must be set at run time (from now on, assume all
ansi->wide conversions are cp857 or some windows convention) or at compile
time (a directive; the compiler links in the correct conversion code or table).
The good part of doing this at run time is that you can let your program's
user specify which encoding he uses for all plain text. The bad part is bloat:
a few tens (even hundreds) of KB of conversion tables _IF_ they cannot be
obtained from the OS or shared libs.
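A quick sketch of why the codepage matters, again using Python's built-in
codecs rather than any Pascal RTL routine. The byte value is an arbitrary
choice for illustration.

```python
# The same single "ansi" byte widened under two different codepages.
raw = bytes([0xA7])

as_cp857 = raw.decode("cp857")    # DOS Turkish codepage
as_cp1252 = raw.decode("cp1252")  # Windows Western European codepage

# cp1252 maps 0xA7 to the section sign; cp857 maps it to a different character.
print(repr(as_cp857), repr(as_cp1252))
```

The conversion routine cannot tell from the bytes alone which result is right,
so the codepage has to come from somewhere: the user, the OS, or a directive.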
Microsoft had valid reasons at the time to go for 16-bit, but that doesn't
mean we should repeat that.
> (5) Compilers would still be free to represent small sets in a
> speed-efficient (eg: linear bitmap) way. A really clever one would even
> allow the programmer to decide for speed vs size.
Yes, there are 4 types:
1. registers
2. static sets
3. ref counted, dynamically allocated sets.
4. dynamically allocated sparse sets, possibly ref counted.
The order 1 -> 4 is also roughly how you would change the type as the
number of elements grows.
One could specify the transitions from 2->3 and from 3->4 on the cmdline,
e.g. to mimic behaviour of a legacy pascal compiler. Changing this probably
requires RTL recompilation though.
Conversions are not necessary, since set of x and set of y with x<>y
are not compatible anyway.
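The selection among the four types could be sketched like this. All three
thresholds are hypothetical defaults picked for illustration (they stand in
for the 1->2, 2->3 and 3->4 transitions); no existing compiler option is
implied.

```python
# Pick a set representation from the number of possible members.
# Every limit here is a made-up default, not a real compiler setting.

def pick_representation(member_count: int,
                        register_limit: int = 32,
                        static_limit: int = 256,
                        dense_limit: int = 65536) -> str:
    if member_count <= register_limit:
        return "register"                 # type 1: fits in a machine word
    if member_count <= static_limit:
        return "static bitmap"            # type 2: fixed-size, stored inline
    if member_count <= dense_limit:
        return "refcounted dense bitmap"  # type 3: heap-allocated, shared
    return "refcounted sparse set"        # type 4: only members are stored


for count in (8, 256, 65536, 2**32):
    print(count, "->", pick_representation(count))
```

Since the choice depends only on the base type, it is fixed at compile time
for each set type, which is why no conversions between representations arise.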
> (6) If my code is typical, then the number of [character] set instances is
> at least 2-3 orders of magnitude less than the number of string instances
> and other variables I have. IOW, the (huge) sets I do have might be larger,
> but still only constitute a tiny fraction of the total memory requirement of
> my programs.
True. And you could always recode the worst library routines.