for my effort: GC (and Dynamic Types), General Spec
- From: "cr88192" <cr88192@xxxxxxxxxxxxxxxxxx>
- Date: Fri, 5 Oct 2007 09:34:14 +1000
ok, this was an attempt at beating out a GC spec, this will be one I am intending to use in my compiler.
I don't know if it would really be much of anything suitible for a standard (includes a dynamic type system, which is something in its own right...). so, likely, this is mostly just an idea for a compiler-specific lib...
more or less, I will be grabbing the existing GC from another of my projects, and giving it a major API restructuring and probably somewhat cleaning up the internals.
the existing form names functions things like 'ObjType_AllocType', 'NetVal_WrapInt', ... and the lib in which it is used is often torn between using the dynamic type system and DOM trees. likewise, the dynamic type system was built later and in a more or less piecewise manner, so it is a very fragmented and ugly mess (the code for the typesystem spread in many parts of the codebase, and is not really used in a consistent manner).
though my compiler itself also includes a GC, I would feel hesitant to make this a front-end usable feature (it is a beast based around tagged references, precise mark/sweep collection, and reference counting, and being used by frontend code is likely to make a broken mess of its overly brittle operation).
(I had partly considered 'unioning' them, for example trying to build a conservative collector on top of a precise one, but right now I will figure this is probably not the best plan).
the idea presented below will be based around a conservative mark/sweep collector.
actually, I may include a type 'dyvar' (or 'dy' or 'dyt'), which would be mostly semantic.
it would be included in the gc header and, in truth, a simply 'void *' pointer (though it could potentially have some meaning to the compiler and/or code-processing tools).
----
GC Basics:
This will be a conservative GC.
Pointers may be freely stored on the stack, in global variables, or in other garbage-collected objects. The pointer may point anywhere within the allocated object.
Caution needs to be excersized when mixing GC'ed data with non-GC'ed data (such as memory gained from malloc, mmap, ...). Likewise when "mutilating" pointers. In these cases, it is no longer required that the GC be able to correctly locate or identify these pointers (resulting in 'accidental' collection).
GC API:
void *gcalloc(size_t sz);
void *gcrealloc(void *p, size_t sz);
void *gctalloc(char *type, size_t sz);
void *gcfree(void *p);
char *gcgettype(void *p);
int gctypep(void *p, char *ty);
int gcsetfinal(char *ty, void (*fcn)(void *p));
Uncertain Features:
void *gctallocu(char *type, size_t sz); //unaligned alloc
void *gcmalloc(size_t sz); //alloc initially-locked object
void gclock(void *p); //prevent collection
void gcunlock(void *p); //allow collection
gcalloc():
Allocates an object on the garbage-collected heap.
The type of such a value will be viewed as being NULL.
gcrealloc():
Reallocate an object to a larger or smaller size.
The type will be be retained from the parent object.
gcfree():
Forcibly free an object. This should only be done when it is known that this object will no longer be accessed.
gctalloc():
Allocates a typed object on the GC heap.
The type is a string whose structure is not specified by the GC. For custom types, effort should be used to ensure that the string does not clash with a builtin type or one provided by another library.
The creation of a new type may consist simply of allocating an object of the particular type.
gcgettype():
Return the type associated with an object.
If the pointer is not part of the GC's heap, or was gained from gcalloc, then the return value is NULL.
gctypep():
Returns a nonzero value if the type of p is the same as that given in ty. This determination will be based on string equality.
gcsetfinal():
Set the finalizer called for a specific type.
This function will be called while freeing objects of this type.
gcmalloc():
If included, would create objects by default resembling those from malloc (assuming manual memory management, ...).
However, unlike malloc(), the memory returned by gcmalloc() may be searched for pointers.
Dynamic Type System:
GC type names beginning with '_' will be reserved for the implementation.
General Conventions:
void *dy<type>(...);
Will create an object of the type.
int dy<type>p(void *p);
Will return a nonzero value if p is the correct type.
<type> dy<type>v(void *p);
Will get the value for a specific type.
char *dygettype(void *p);
int dytypep(void *p, char *ty);
Will get or check the type of an object.
These will exist as there is no strict requirement that some of these builtin types will have the same GC type (or even necessarily be on-heap). In these cases, this function may need to do additional work in order to locate the correct type string.
If only a single type needs to be checked, the provided specialized predicate functions will be viewed as the preferred means.
Note:
For Complex Types (such as arrays, conses, and hashes), it is acceptable to use any kind of pointer (it does not have to be from this type system, or necessarily even the GC). However, that is the intent that this be used mostly for dynamic types and GC'ed data, and other GC restrictions apply.
Primitive Types: int; long; long long; float; double; fcomplex; dcomplex.
Basic Types: string; wstring; symbol; keyword.
Complex Types: array; cons; hash.
....
so:
void *dyint(int v);
void *dylong(long v);
void *dylonglong(long long v);
void *dyfloat(float v);
void *dydouble(double v);
void *dyfcomplex(float _Complex v);
void *dydcomplex(double _Complex v);
char *dystring(char *s);
char *dysymbol(char *s);
char *dykeyword(char *s);
wchar_t *dywstring(wchar_t *s);
....
int dyintp(void *p);
int dylongp(void *p);
int dylonglongp(void *p);
....
int dyintv(void *p);
long dylongv(void *p);
long long dylonglongv(void *p);
....
Integer Type Disjointness
It will not be required that integer types be disjoint.
As a result, a value created with dyint() may still return a nonzero value with dylonglongp().
Likewise, a value created by dylonglong() may be nonzero with dyintp() in the case where dylonglong() was called with a value sufficiently small to fit into an integer.
As such, these predicates will mostly serve to indicate range requirements, rather than a strict type.
Float and Double
These will, however, be regarded as disjoint types.
Real will exist as a psuedo type, which will be true for values decodable as a double.
int dyrealp(void *p);
double dyrealv(void *p);
Arithmetic
Functions will be provided for arithmetic on dynamic types.
void *dyadd(void *a, void *b);
void *dysub(void *a, void *b);
void *dymul(void *a, void *b);
void *dydiv(void *a, void *b);
void *dymod(void *a, void *b);
void *dyand(void *a, void *b);
void *dyor(void *a, void *b);
void *dyxor(void *a, void *b);
void *dyneg(void *a);
void *dynot(void *a);
int dyeqp(void *a, void *b);
int dyneqp(void *a, void *b);
int dygtp(void *a, void *b);
int dyltp(void *a, void *b);
int dygep(void *a, void *b);
int dylep(void *a, void *b);
int dytruep(void *a);
int dyfalsep(void *a);
String:
Will behave much the same as a GC'ed dynamicly typed strdup.
Additionally, string merging may be performed, so the result should be regarded as immutable.
Symbol and Keyword:
These will be 2 different types, but each has essentially the same semantics.
Beyond the different types, they will have another important property:
Any 2 calls to the same function with the same input string produces the same output (allowing '==' to be used for equality checks).
Array:
void *dyarray(int cnt);
Create an array with initially cnt pointers.
void *dyarrayidx(void *p, int idx);
void dyarraysetidx(void *p, int idx, void *q);
Get and Set array index.
void **dyarrayv(void *p);
Get a 'void **' array representing the contents of the array.
Cons:
void *dycons(void *car, void *cdr);
void *dycar(void *p);
void *dycdr(void *p);
void *dycaar(void *p);
void *dycadr(void *p);
void *dycdar(void *p);
void *dycddr(void *p);
(up to 4 levels).
....
void *dylist1(void *a);
void *dylist2(void *a, void *b);
void *dylist3(void *a, void *b, void *c);
void *dylist4(void *a, void *b, void *c, void *d);
void *dylist5(void *a, void *b, void *c, void *d, void *e);
void *dylist6(void *a, void *b, void *c, void *d, void *e, void *f);
(up to 10).
....
Hash:
void *dyhash(int cnt);
Create a hash with initially cnt slots in the table. Calling this with '0' will specify that the implementation use a default value.
void *dyhashget(void *p, char *s);
void *dyhashset(void *p, char *s, void *q);
Get or Set the value associated with a given string index.
Set will return the previous value (or NULL).
The hash may be resized if it is full.
.
- Follow-Ups:
- Re: for my effort: GC (and Dynamic Types), General Spec
- From: Douglas A. Gwyn
- Re: for my effort: GC (and Dynamic Types), General Spec
- Prev by Date: Re: Garbage collection
- Next by Date: Re: Garbage collection
- Previous by thread: new application for understanding millions of C/C++ source lines (www.integrech.cn)
- Next by thread: Re: for my effort: GC (and Dynamic Types), General Spec
- Index(es):
Relevant Pages
|