Re: First class types and their representation
- From: "cr88192" <cr88192@xxxxxxxxxxx>
- Date: Tue, 27 Jan 2009 18:32:50 +1000
"James Harris" <james.harris.1@xxxxxxxxxxxxxx> wrote in message
news:3945bead-85e3-4361-9488-489824d6fdff@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Has anyone else considered allowing types to be first-class? If so did
> you come up with a representation for them or do you know of one that
> another language uses? It makes sense to use a common representation
> if there is one.
> James
in my framework for my own uses, much of the code represents types as
strings...
now, the notation is a little hairy, but it is possible to map much of a
C-style type system (including structs, function pointers and signatures,
...) to a string-based representation (and, more so, to allow "reasonably"
efficient processing).
the idea is not to represent each type as a type name, but rather each base
type might have an assigned letter, and each modifier might have a different
letter (for example, I use lower-case letters for base types and upper-case
letters and symbols for modifiers and other things).
for example:
"void *" is "Pv";
"int (*)(double, double)" is "P(dd)i";
a compiler extension type might be "Uquat;";
a class ref might be "Lapp/foo/Bar;";
a struct pointer might be "PXfoo_s;";
....
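as a rough illustration of how such strings can be built up, here is a minimal sketch in Python. the helper names (pointer, function, struct, class_ref, extension) are hypothetical and not part of the framework described here; only the letters and punctuation that appear in the examples above are used.

```python
# Minimal sketch: compose type strings from small helper functions.
# Only the letters shown in the post's examples are covered; the helper
# names are hypothetical, chosen for illustration.

BASE = {"void": "v", "int": "i", "double": "d"}  # lower-case base types


def pointer(inner):
    """'P' modifier followed by the pointed-to type."""
    return "P" + inner


def function(args, ret):
    """Argument types in parentheses, followed by the return type."""
    return "(" + "".join(args) + ")" + ret


def struct(name):
    """Named struct: 'X', the name, then a terminating ';'."""
    return "X" + name + ";"


def class_ref(path):
    """Class reference: 'L', the qualified name, then ';'."""
    return "L" + path + ";"


def extension(name):
    """Compiler extension type: 'U', the name, then ';'."""
    return "U" + name + ";"


if __name__ == "__main__":
    d, i = BASE["double"], BASE["int"]
    print(pointer(BASE["void"]))        # void *
    print(pointer(function([d, d], i))) # int (*)(double, double)
    print(pointer(struct("foo_s")))     # pointer to struct foo_s
```

since every helper returns a plain string, composite types fall out of ordinary string concatenation, which seems to be much of the appeal of the approach.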
my particular notation actually draws bits and pieces from many other
notations, such as:
the SysV ABI name-mangling scheme (basic types and core notation);
the JavaVM (many of the syntactic elements of the notation, the basic
notation for flattening names and signatures into C-style names, ...);
....
note that no whitespace (and few separators) is used in the notation since,
for the most part, elements are discretely terminated: nothing following the
type itself actually matters. the one rare exception is arrays with
specified bounds, which are pretty much the only present use of numeric
elements...
"int [40][30]" is "i40,30".
as a result "dd" is known to be 2 doubles, and never a single element,
however "(dd)i" is regarded as a single element terminating after the
'i'. this allows composite types to be formed easily.
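the "discretely terminated" property can be sketched as a small scanner that finds where one element ends. this is a guess at the idea, not the actual code: skip_type and split_types are hypothetical names, only the letters from the examples in this post are handled, array bounds are assumed to follow a base-type letter directly (as in "i40,30"), and there is no validation of malformed input.

```python
import re

# Array bounds as in "i40,30": digits, optionally comma-separated.
_BOUNDS = re.compile(r"\d+(?:,\d+)*")


def skip_type(sig, i=0):
    """Return the index just past the single type element starting at sig[i]."""
    c = sig[i]
    if c == "P":                  # pointer: modifier followed by one element
        return skip_type(sig, i + 1)
    if c == "(":                  # function: '(' arguments ')' return type
        i += 1
        while sig[i] != ")":
            i = skip_type(sig, i)
        return skip_type(sig, i + 1)
    if c in "ULX":                # named types run up to the closing ';'
        return sig.index(";", i) + 1
    if c.islower():               # single-letter base type
        i += 1
        m = _BOUNDS.match(sig, i)  # optional array bounds (assumed placement)
        return m.end() if m else i
    raise ValueError("unhandled character %r at index %d" % (c, i))


def split_types(sig):
    """Split a run of concatenated type elements into a list of elements."""
    out, i = [], 0
    while i < len(sig):
        j = skip_type(sig, i)
        out.append(sig[i:j])
        i = j
    return out


if __name__ == "__main__":
    print(split_types("dd"))
    print(split_types("(dd)i"))
    print(split_types("PvP(dd)iUquat;"))
```

no lookahead or separator is needed: each leading character says how much more to consume, so a list of argument types can simply be written back to back.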
"int (*(foo)(double, double))();" is "(dd)P()i" or "foo(dd)P()i" (this
latter form being for a declaration).
that is, a function taking 2 doubles and returning a pointer to a function
returning an integer (which is, IMO, both more understandable and easier to
parse than the C version...).
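to show how the left-to-right reading works, here is a minimal sketch that turns a type string into an English description. describe is a hypothetical name, only the v/i/d base types, the 'P' modifier and function signatures from this post are handled, and the declaration form with a leading name ("foo(dd)P()i") is deliberately left out, since how a name is told apart from base-type letters is not spelled out here.

```python
# Minimal sketch: read a type string left to right and describe it in English.
BASE_NAMES = {"v": "void", "i": "int", "d": "double"}


def describe(sig, i=0):
    """Return (description, index past the element) for the element at sig[i]."""
    c = sig[i]
    if c == "P":                               # pointer to whatever follows
        inner, j = describe(sig, i + 1)
        return "pointer to " + inner, j
    if c == "(":                               # function signature
        args, j = [], i + 1
        while sig[j] != ")":
            arg, j = describe(sig, j)
            args.append(arg)
        ret, j = describe(sig, j + 1)          # return type follows the ')'
        return "function (" + ", ".join(args) + ") returning " + ret, j
    if c in BASE_NAMES:                        # single-letter base type
        return BASE_NAMES[c], i + 1
    raise ValueError("unhandled character %r at index %d" % (c, i))


if __name__ == "__main__":
    print(describe("(dd)P()i")[0])
```

the description comes out in the same order as the characters of the string, without the inside-out reading that the C declarator needs.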
as-is, there is no inline syntax for representing class or structure types,
but thus far this has not been needed (if I were to add one, I would
probably use "C{...}" and "G{...}" for this purpose). the main issue is that
classes and structures usually need a lot more information than just their
raw physical layout, so it is usually sufficient to define a class or
structure externally, and then reference it with the provided syntax.
part of the reason that the syntax is designed the way it is, is that this
actually allows reasonably efficient processing...
in general, this has proved much more versatile and convenient to work with
than the use of packed integer formats, ... although, for many operations,
the raw speed might not be as good...
I could give a spec if needed, but unless someone is interested I may not...