Re: How to escape the Ocaml's superfluous parentheses and type declarations?




andrewspencers@xxxxxxxxx wrote:
> # let compose x y z = x(y(z));;
> val compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>
> # compose abs abs 1;;
> - : int = 1
> # compose mylength Mycons (1, Mycons (2, Mycons (3,E)));;
> The constructor Mycons expects 2 argument(s),
> but is here applied to 0 argument(s)
>
> What?! The compiler is wrong on two counts:
> 1. Here Mycons is applied to 1 argument, not to 0.

No - the above is the same as:

# compose (mylength) (Mycons) (1, Mycons (2, Mycons (3,E)));;

so the first Mycons has zero args: function application is parsed
left to right, so compose receives mylength, the bare Mycons, and the
tuple as three separate arguments.
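
A minimal sketch of one way around this: wrap the constructor in an
ordinary function and pass that instead. The definitions of mylist and
mylength are not shown in this thread, so the ones below are assumptions,
and mycons is a hypothetical helper name.

```ocaml
(* Assumed definitions; the originals are not shown in this thread. *)
type 'a mylist = E | Mycons of 'a * 'a mylist

let rec mylength = function
  | E -> 0
  | Mycons (_, rest) -> 1 + mylength rest

let compose x y z = x (y z)

(* Hypothetical wrapper: an ordinary function that applies the constructor. *)
let mycons (head, tail) = Mycons (head, tail)

(* The wrapper can be passed to compose, unlike the bare constructor. *)
let n = compose mylength mycons (1, Mycons (2, Mycons (3, E)))
```

Here n should be the length of the three-element list.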

> 2. Mycons doesn't take two arguments; it takes 1 (a two-element tuple),
> as proven by the following:

Yes, you are right. However, the "2 arguments" refers to the fact that
Mycons requires a tuple with 2 components. Unfortunately the ML type
system is based on tuples instead of just using the simpler notion of a
constructor applied to zero or more arguments. Thus any user-defined
data element is either a nullary constructor or a unary constructor
applied to a tuple (or list or record ...), which in turn may have
multiple components.
I think this is just a result of the way ML was developed, with
constructed types being tacked onto an original language based only on
tuples etc., rather than considering tuples, records, and lists to be
just syntactic sugar for constructed types (as in Haskell). In any
case it means extra parentheses and commas for the programmer,
unnecessary complexity, and a headache for anyone trying to implement
an efficient compiler, which has to undo all this mess to remove the
indirection between the constructor and the components of the tuple...


>
> # Mycons 1 E;;
> Syntax error
> # Mycons (1, E);;
> - : int mylist = Mycons (1, E)
>
> Investigating further:
>
> # abs;;
> - : int -> int = <fun>
> # Mycons;;
> The constructor Mycons expects 2 argument(s),
> but is here applied to 0 argument(s)
>
> So Mycons is _not_ actually a function.
>
> > Because you haven't declared the union of char and int. There can be no
> > such thing because char and int are separate types. What you have
> > declared is a completely new type called charint, whose values are
> > constructed from the built-in types as follows: given a char the
> > constructor function called Char gives you a value of type charint, and
> > given an int the function Int gives you a value of type charint. ie
> > Char and Int are just functions. The only thing that's special about
> > them is that you can use constructor application syntax in pattern
> > matching to get back the original char or int respectively
>
> If that were true, then Int should have type "int -> charint", and if I
> say to the Ocaml repl simply
> "Int", then it should respond with
> "-: int -> charint = <fun>", right?
> Let's try:
>
> # Int;;
> The constructor Int expects 1 argument(s),
> but is here applied to 0 argument(s)
>
> Besides, if Int and Char are functions, then are they defined in terms
> of the type charint (which would require the type charint to have
> already been previously defined), or is the type charint defined in
> terms of the functions Int and Char (which would require Int and Char
> to have already been previously defined, in which case the result type
> of Int and Char can't be the not-yet-defined type charint)? You can't
> say "both", because that would be a circular definition.

It's the same as if you had in C:

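struct charint {
    enum { charint_int, charint_char } kind;  /* tag: which member is live */
    union { char c; int i; } u;               /* payload */
};

/* Constructor function: builds a charint holding an int. */
struct charint Int(int i) {
    struct charint q;
    q.kind = charint_int;
    q.u.i = i;
    return q;
}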
....

ML generates the tags automatically, one for each constructor in the
type declaration, but the constructors themselves are not part of the
type, thus there is no circularity.


>
> So, what _are_ Mycons, Int, and Char, since they're obviously not
> functions?

Sorry for the confusion.
Mycons, Int, and Char are "constructors".
The syntax "Int x" has two separate meanings, depending on where it
appears in the source. Where an expression is expected, it is compiled
into code that takes x and yields a charint. Where a pattern is
expected, the compiler inserts code that looks at the tag of the
charint value and binds x to the contents if the charint is holding an
int, and otherwise fails the match.
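
The two meanings can be sketched as follows. The type declaration is
reconstructed from the discussion above, and describe is a hypothetical
function name used only for illustration.

```ocaml
(* Reconstructed from the discussion; the tags are generated by the compiler. *)
type charint = Char of char | Int of int

(* Expression position: "Int 42" builds a tagged value. *)
let v = Int 42

(* Pattern position: "Int x" checks the tag and binds x to the payload. *)
let describe = function
  | Int x -> "int " ^ string_of_int x
  | Char c -> "char " ^ String.make 1 c
```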

In other words, Mycons, Int, and Char are not "things" in themselves,
but are just syntax which lets the compiler generate tags and insert
creation or matching code for values of their corresponding types.

When used for constructing values, they behave like functions, and are
parsed as such. "Int" *should* really be a first class function with
type int->charint when it appears in this context, but, probably for
historical reasons, the decision was taken not to make it so.
(why????)
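
In the meantime, a one-line wrapper gives a function of exactly that
type. This is only a sketch; int_ is a hypothetical name, and the type
declaration is reconstructed from the discussion.

```ocaml
type charint = Char of char | Int of int

(* Hypothetical wrapper with the type the constructor "should" have. *)
let int_ : int -> charint = fun x -> Int x

(* Being an ordinary function, it can be passed to List.map. *)
let wrapped = List.map int_ [1; 2; 3]
```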

Hope I'm not making things more confusing. Languages evolve, and so do
ideas. I think some things that seem obvious nowadays, like the
horrible folly of making tuples and records primitives instead of
syntactic sugar for a normal constructed type, were not known at the
time ML was created, and it's probably too late to change these things
now. Still, one gets used to it, and compared with C or C++, ML makes a
lot of things much easier, e.g. no memory management issues, less
possibility for run-time bugs, and more succinct source code :-)

Regards, Brian.

