Re: Symbols, metacircular evaluation, external representation.



Rudy <a.r.dabrowski_NOSPAMPLEASE_@xxxxxxxxx> writes:

Hi,

I'm new to Scheme (please, forgive my shortcomings ;-)) and I don't grasp
some issues connected with it.

Firstable, what's the point in symbols? Are they only to make Lisp
homoiconic?

In scheme, yes. In lisp, symbols are not there _only_ to make it
homiconic, I'd say.


I think that without them any expression (e.g. (lambda (x) (* x
2))) wouldn't be a proper Lisp's list structure because nobody would know
what do these lambda, x, and * mean. Am I right?

Well, without symbols, how would you write a symbolic expression?
You could only use strings, and you would have to use parsers all the time.

(parse-sexp "(lambda (x) (* x 2))")
(parse-c "int fun(int x){ return x*2; }")

and deal with strange data types for the parse trees...


I also wonder why aren't they treated as strings instead of a
separate data type? Strings and symbols are quite similar, aren't
they?

Are the strings obtained from evaluating the form: "abcd"
and the form (string-append "ab" "cd")
the same or not?

Depends on the definition of being the same.
Have a look at: http://www.nhplace.com/kent/PS/EQUAL.html

But you can guess that we have two different strings (they don't share
the same object identity), but they contain the same characters, at one time.


C/SCHEME[10]> (let ((compare (lambda (r s)
(display "(string=? ")
(display "\"") (display r) (display "\"")
(display " ")
(display "\"") (display s) (display "\"")
(display ") --> ")
(display (string=? r s))
(newline)))
(r "abcd")
(s (string-append "ab" "cd")))
(compare r s)
(string-set! s 0 #\X)
(compare r s)
#f)
(string=? "abcd" "abcd") --> #T
(string=? "abcd" "Xbcd") --> #F
#F


On the other hand, symbols are immutable, and two interned symbols
that have the same name are actually identical, which allows to test
for equality quickly, and to use them in hash-tables, and other places
in a efficient way.


Of course, scheme symbols are rather washed out, since the only thing
you can do with them is to create a symbol from a string, get the
string naming a symbol, test if an object is a symbol and compare two
symbols for identity. But historically, and in other lisps, symbols
are like a structure with more fields than just the name: they usually
contain a value slot, and a property list (and in lisp-2, a function
slot). They can therefore be used more easily in a wide range of
situation without more ado, while in scheme you'd have to implement
some specific "dictionnary" to associate values to symbols, or other
properties. In the first LISPs, there was no string datatype. Since
the memory of the time were very small (eg. 32 Kwords), all strings
were interned into symbols. (There was no character data type either,
they were represented too as one-character named symbols).


Let's go further. What is so great about homoiconic languages? They are
amenable to metacircular evaluation but what is so super about this thing?
I would like to hear some practical arguments.

Well metacircular evaluation is rather the bulldozer of this aspect.
In practice, most metaprogramming involves only macros. What symbols
allow, is to consider symbolic expressions both as data and program
source code. Since both data and program are represented with the
same kind of symbolic expressions, it's very easy to write functions
to manipulate programs. Macro systems in lisp are trivial to
implement and the most powerful.


So now, instead of writting programs to manipulate data, you can write
program that write programs to manipulate data. Metaprogramming
allows you to increase your productivity, by automatizing your
programming process.

This is something that compiler writters do for mere programmers, and
writting a metacircular interpreter (or a lisp compiler), allows to do
the job of compiler writters, but this is a complex job (even if we
can present "toy" examples in one page). Macros are like hooks in the
compiler that allow mere programmers to modify the compilation process
at a higher level more easily than writting a new interpreter or
compiler.



The last issue I'm grappling with is external representation of objects.
What's the point in it? It is very often used in this way:

'(1 2 3) instead of (list 1 2 3) or
'#(1 2 3 4 5 "foo" foo) instead of (vector 1 2 3 4 5 "foo" 'foo)

I doubt that's all the application of external representations.


There is a problem: most people elide too much. When they say "Let's
consider the list '(1 2 3)", they should say: "Let's consider the list
resulting from the evaluation of the form '(1 2 3)".

I talk of the list (1 2 3). The way this list is obtained is rather
irrelevant. I could get it by inputing it to the program, by writting
it literally in a quote form, or building it in several ways, including
cons or list.

C/SCHEME[11]> (read)
(1 2 3)

(1 2 3)
C/SCHEME[12]> (quote (1 2 3))
(1 2 3)
C/SCHEME[13]> (cons 1 (cons 2 (cons 3 (quote ()))))
(1 2 3)
C/SCHEME[14]> (list 1 2 3)
(1 2 3)
C/SCHEME[15]>

If you talk me about a list '(1 2 3), I will take you're considering
the list of two elements, the first being the symbol quote and the
second the list (1 2 3).


Now the external representation of a list containing as element the
integers one two three is a string of characters: "(1 2 3)", which
when output to your terminal (or written in a file), of course shows
as:

(1 2 3)

But external representations are not very interesting. Of course we
need a way to be able to see symbolic expressions, and to enter them,
but the details of the representation doesn't matter. One property
that is interesting however is that the printed representation be
readable, since that allows us to feed back output as input (card
punch/card read yesterday, copy/paste today):

C/SCHEME[27]> (list '+ '3 '4)
(+ 3 4) ; copy the output
C/SCHEME[28]> (+ 3 4) ; paste it as input
7

Nice! So we can even write programs that store symbolic expressions
into files and read them back again, and if it happens that these
symbolic expressions are themselves programs, you can execute or
compile these files.


I've been also thinking what's the purpouse of quoting (except for the
above-mentioned trivial usage and "obtaining symbols").

The purpose of quote is to prevent the evaluation of its argument.
Quote is needed in slisp because there is no difference between code
and data: both are symbolic expressions. So we need an operator to
stop evaluating code, when we want to consider a symbolic expression
as data.

For example, consider the symbolic expressions (+ 1 2 3) and (* 3 4)
and assume we want to compute the sum of the length of these two
lists.

(+ (length (+ 1 2 3)) (length (* 3 4)))

since both code and data is represented as symbolic expressions,
without quote we could not do it, since we couldn't know where to stop
evaluation. The intepreter would evaluate the form (+ 1 2 3), and try
to evaluate (elngth 6) which is meaningless. With quote, we can
signal that some expression is not code, but data:

(+ (length (quote (+ 1 2 3))) (length (quote (* 3 4))))

Evaluating the form (quote (+ 1 2 3)) results in the list (+ 1 2 3)
itself. It can then be passed to length, which can compute its
length: 4. Same for the other subform, and eventually we can call the
function resulting from the evaluation of + with the arguments 4 and 3
obtaining the result: 7.


Examples like
symbolic differentiation from SICP don't appeal to me, beacuse such tasks
could be done by strings as well. Could you enlighten me?

Yes. You could also implement all our software with Turing Machine,
setting bits on a tape, and configuring state transitions. But it
would be impractical, as it would be impractical to try to implement
symbolic differentiation or a compiler working only with strings.
(sed works only with strings, and sed is Turing complete. We can have
fun implementing an addition with sed, but it would foolish to try to
implement manually a compiler in sed).

What happens, is that any software implementing symbolic
differentiation, or interpreters or compilers, or any other kind of
symbolic processing, will implement higher level abstractions than
bare strings, including symbols, and symbol tables (where symbols are
interned).

The difference, is that symbols and symbol tables are in general
hidden in the depths of the compilers, and if you want to go
metacircular, you have to implement the whole compiler from scratch
(and using tools like lex/yacc and other compiler generators to ease
the task since it's not trivial).

In the case of lisp, the compiler is not closed and distinct from the
running program. The compiler is wide open and available at all times
(including run-time), and symbols are therefore a public API allowing
programs to communicate with the compiler, be it thru eval, or thru
the macros ("compiler hooks"), or just for the program to do its own
interpretation or compilation..



This is what allows you, in the same running program, to take some
data (eg a formula), compute the symbolic derivate, and from the
result, build some symbolic expression representing a procedure, then
compile this data/symbolic expression/program source into a native
procedure that the program can call:

C/SCHEME[23]> (list 'lambda '(x) (derivate '(+ (* x x) (* 3 x) 5) 'x))
(lambda (x) (+ (* 2 x) 3))
C/SCHEME[24]> (define f (eval (list 'lambda '(x) (derivate '(+ (* x x) (* 3 x) 5) 'x))
(scheme-report-environment 5)))
f defined.
C/SCHEME[25]> (f 0)
3
C/SCHEME[26]> (f 1)
5
C/SCHEME[27]>

(Note how environments are used to "give meaning" to the symbols such
as + and * when you pass the symbolic expression to eval; that's where
the association between the primitive functions and the symbol is stored).


Try to do that in C with strings! (Note it's perfectly feasible, Apple
did it in its Xcore IDE, which allows you to modify running programs
compiled from Objective-C sources, but it is much more complex and
slower than what happens in a lisp image).



It seems to me that all these problems are somehow related but I cannot put
them together. S.O.S!

--
__Pascal Bourguignon__ http://www.informatimago.com/

NOTE: The most fundamental particles in this product are held
together by a "gluing" force about which little is currently known
and whose adhesive power can therefore not be permanently
guaranteed.
.



Relevant Pages

  • Re: foreach enhancement
    ... > typed lists and strongly typed lists; ... >> This is where the compiler comes in. ... > automatically break down and enumerate string objects in the List. ... does what I do want *without* causing this behaviour in strings. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: foreach enhancement
    ... typed lists and strongly typed lists; ... > This is where the compiler comes in. ... automatically break down and enumerate string objects in the List. ... strings to them. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: Tired of 100s of stupid Getter/Setter methods
    ... > lists, and dicts as built-in types, and their contents are not ... > except the language is designed intelligently so doing that does not ... let the compiler find out more about types. ... contains Strings -- that generics are worth it for that alone. ...
    (comp.lang.java.programmer)
  • Re: Is C99 the final C? (some suggestions)
    ... > Microsoft compiler, or do you draw the line at C89? ... in the same sense that the implementors of the Lua language considered C to be ... Now why would they bother if portable mutex implementations ... C doesn't have strings. ...
    (comp.lang.c)
  • Re: Verbose functional languages?
    ... A lazy language might, in the end, not be what you need. ... if the compiler unboxes them for you by way of optimization. ... and strictness can be infered to a great extent. ... But strings were always a very slow and memory ...
    (comp.lang.functional)

Quantcast