Re: Fat references



Robert A Duff wrote:
GNAT (the gcc Ada compiler) uses fat pointers by default to represent
pointers to unconstrained arrays. (Unconstrained means different
objects of a given array type can have different lengths.) The fat
pointer consists of the address of the bounds and the address of the
data.

I see.

The purpose of these fat pointers has nothing to do with GC, though.
GNAT does not currently support GC, except for the versions targeting
.NET and the JVM.

Right. With my approach, the pointer to the run-time type can be used to
fetch a GC-specific function that traverses the references in a given type,
or a user-function to pretty print a value of that type.

I don't fully understand the problem you're trying to solve, nor how
fat pointers are the solution.

My original motivation was to evade typeless null pointers that afflict many
VMs including the JVM and CLR. Various operations hit a dead wall with
these nulls because they don't provide any type information. For example,
pretty printing null is a serious problem in F# because lists, sets, option
types and other types all want to use null as an efficient representation
of "empty" but the run-time system just sees a typeless null and can only
print "null". So F# wraps the empty list in a heap allocated object,
massively degrading the performance of most uses of lists.

However, with the benefit of hindsight, there is a simpler but still
efficient solution to this problem on the CLR: wrap the null in a value
type instead of an object. That way there is no heap allocation and
(assuming my performance results from HLVM carry) you don't see any
slowdown at all. Unfortunately, this would work perfectly in theory but, in
practice, the CLR has problems with structs inhibiting TCO. HLVM doesn't
have that problem though (thanks to LLVM!).

Are you saying all pointers are fat (4
words per pointer!?)? Or just pointers to arrays?

Yes, all pointers. Every single reference in the heap now occupies a
quadword. I even blindly load and store entire quadwords when reading and
writing references.

Under what circumstances are you worried about an extra copy?

Extra copy of what?

If the array is allocated on the C side, you have to concoct the meta data
somehow...

That's no problem. If the array came from the C side then I must JIT compile
a stub function that accepts HLVM's fast calling convention and invokes the
external function using C's calling convention anyway, so I can put the
extra code to convert the C array into a fat reference in that stub,
injecting type information into the typed fat reference derived from the
FFI definition of the external function.

Why not just say that arrays allocated on the C side are not
subject to GC?

Simplicity is *very* important for me. I want to create a useful language
implementation with the minimum of effort (where useful means
high-performance parallel numerics to me) by drawing upon research without
doing anything new myself. My entire VM is still well under 2kLOC of OCaml
code. If I started adding special cases like different kinds of arrays then
it would complicate everything from the type system to the GC and shadow
stack.

By the way, if you're interested in GC, you should join the
GC list, which our esteemed moderator set up some years ago.

I'd love to. Where is it?

HLVM has another unusual characteristic in that it defines a high-level
language (e.g. with tuples and tail call elimination) that is not only the
target language but also the language the GC is written in. This has the
advantage that I can easily JIT compile type-specialized GC code to improve
performance. The approach seems to work very well because it makes
metaprogramming trivial (e.g. instantiating generic definitions per type is
trivial and very efficient) and was easy to implement.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?u

.



Relevant Pages

  • Re: Array comparison
    ... > Deep and recursive. ... the pointers and comparing their referents instead of comparing the ... >> arrays, then I think you also need to argue against assigning arrays. ... semantics (notably, reference counting). ...
    (alt.comp.lang.borland-delphi)
  • Re: Fortran and .NET (C#)
    ... Was there "garbage collection" in Fortran 90 or was there reference counting? ... For functions returning arrays that are either explicit-sized or allocatable, the language is designed such that you don't need either of those. ... And if one is a novice, well, one pretty much abandons all hope of a bug-free program if one uses functions that return pointers. ... Recall that, unlike C, Fortran doesn't force you into using pointers just because you have arrays. ...
    (comp.lang.fortran)
  • Re: Fat references
    ... Or just pointers to arrays? ... Yes, all pointers. ... quadword. ... on reference counting says that this is so and, hence, 1-bit reference ...
    (comp.compilers)
  • Re: A doubly linked-list in C
    ... It appears that the standard has been crafted with that distinction ... effect for arrays is unusual: because of the interoperability of arrays ... and pointers in C, what is passed by value is a pointer; ... key pragmatic issue (does the implementation pass a value or a reference?) ...
    (comp.lang.c)
  • Re: Real World Flunks (Re: Why is Object Oriented so successful)
    ... AndyW wrote: ... "modeling the real world". ... CADD screen closely matches the object pointers in a typical real- ... I think the fact one has a reference to it ...
    (comp.object)