Re: UCS Identifiers and compilers
- From: Chris F Clark <cfc@xxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 11 Dec 2008 23:54:00 -0500
wclodius@xxxxxxxxxxxxxx (William Clodius) writes:
As a hobby I have started work on a language design, and one of the
issues that has come to concern me is the impact on the usefulness and
complexity of implementation of incorporating UCS/Unicode into the
language, particularly in identifiers.
1. Do many of your users make use of letters outside the ASCII/Latin-1
sets?
We have one major Yacc++ customer that has a series of languages that
support Unicode identifiers. Some of their languages have both case
sensitive and case insensitive features in the same language. My
experience relates primarily to supporting them.
3. Visually how well do alternative character sets mesh with a language
with ASCII keywords and left to right, up and down display, typical of
most programming languages? e.g., how well do scripts with ideographs,
context dependent glyphs for the same character, and alternative spatial
ordering work, or character sets with characters with glyphs similar to
those used for ASCII (the l vs. 1 and O vs. 0 problem multiplied)?
The glyphs that look like ASCII are a definite problem, and that is
made worse if the glyphs that look like ASCII characters have
different properties. In particular, a fair amount of effort went
into dealing with the Turkish character that is an i without the dot.
Its capital form is the plain ASCII I, which it shares with the
dotted i outside Turkish locales, and the system toupper/tolower
routines did not deal consistently with it across locales. As a
result, we had to use a consistent approach when calling those
routines and make certain we had not changed our locale between
calls. The difficulty is that some tables were built at the time the
compiler was built (and thus under one locale), which may not be the
same as the locale the user has specified when running the compiler.
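
To make the dotless-i problem concrete, here is a minimal Python
sketch. It is not the approach used in Yacc++ or in the customer's
compilers; it assumes identifiers are compared through Unicode default
case folding, which never consults the process locale, so tables built
at compiler-build time and lookups done at run time agree. The name
fold_identifier is hypothetical.

```python
import unicodedata


def fold_identifier(name: str) -> str:
    """Return a locale-independent key for case-insensitive lookup.

    Assumption: NFKC normalization followed by Unicode default case
    folding is an acceptable notion of "same identifier".
    """
    return unicodedata.normalize("NFKC", name).casefold()


DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I

# Ordinary ASCII identifiers fold together as expected.
same_ascii = fold_identifier("Index") == fold_identifier("INDEX")

# The dotless i uppercases to ASCII I, but I lowercases to the dotted
# i, so an upper/lower round trip does not return the original letter.
round_trip = DOTLESS_I.upper().lower()

# Under default (non-Turkic) folding, dotless i and I stay distinct.
distinct = fold_identifier(DOTLESS_I) != fold_identifier("I")
```

Because the folding here is fixed by the Unicode tables rather than by
the locale, the same key comes out no matter which locale is active.
The price is that Turkish users do not get Turkish casing rules.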
Hope this helps,
-Chris
******************************************************************************
Chris Clark Internet: christopher.f.clark@xxxxxxxxxxxxxxxxxxxxxx
Compiler Resources, Inc. or: compres@xxxxxxxxxxxxx
23 Bailey Rd Web Site: http://world.std.com/~compres
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)
------------------------------------------------------------------------------