Re: Problems with Gambit



On Mar 25, 12:09 pm, "comp.lang.scheme" <phi50...@xxxxxxxx> wrote:

However, what impressed me was the performance of Larceny in a certain
class of problems that I am studying right now (I got a scholarship to
do the work of choosing a good Scheme or Common Lisp): Adaptative
filtering. Adaptative filtering is used to recognize signal patterns.
For instance, consider a guy who lost his arm. People pick an EMG
signal from his shoulder, and use it to discover how he wants to move
his prosthetic limb. When I was living in Logan, Utah, I saw a TV
program where a guy who could do almost everything with his prosthetic
arm. I was a student in Mount Logan Middle School, and I wrote a work
about the arm for a science assignment. My father also has a strong
interest in this kind of problem, and he helped me a lot. Things work
thus: Given a known signal, one must choose a bunch of functions, and
combine them in order to match the signal. The program creates random
functions, and tries to combine them. It chooses the functions that
require few coefficient to match the signal. When it faces an unknown
signal, it finds its coefficients. If the coefficients match the ones
in its data base, it has recognized the signal, and must move the arm
accordingly. The math is well beyond my knowledge of the subject, but
I know that there are two approaches for dealing with the problem. One
of them uses a thing called genetic programming, and the other uses
adaptative filtering. My father, and his friend Dr. Lombardo, who work
with this branch of knowledge, say that the two approaches produce the
same result, but adaptative filtering is faster.

I picked a few programs for recognizing EMG signals, and tried them
using many Schemes. The first thing that I noticed is that Scheme does
not seem to have a standard. I needed to modify all programs to make
them run on different Schemes. However, the surprise was that the
Scheme that I thought that was very slow, proved to be the fastest for
this kind of problem: Larceny. It is interesting that adaptative
filtering uses large arrays to represent the EMG signal, and Larceny
did not show good performance in array benchmarks. Do you have an idea
why this is so? I guess that I didn't program Bigloo and Gambit
correctly, but I was lucky with Larceny. BTW, SBCL also has very good
performance when dealing with LMS.

I used Gambit for genetic programming to discover algorithms for image
processing. (This was with Jens Palsberg, James R. Lee, and Sudhakar
Mamillapalli back in the 90s.) Here we evaluate each program at each
pixel location, so for a 512x512 image each program was evaluated
262,144 times. There were 2,000 programs per generation, so we
evaluated 524,288,000 programs x data each generation.

This is often enough that it proved very worthwhile (by more than an
order of magnitude) to compile the programs and to do preprocessing to
speed them up. We ended up doing a lot of partial evaluation to
eliminate computations, then we embedded the code for each program in
a loop that was totally lambda-lifted, converted to continuation-
passing style, and with all calls in tail position, so each program
was compiled to a state-machine with all variables in registers by
Gambit. Execution overhead was minimal (11 cycles per pixel, compared
to 29 cycles per pixel with a loop calling a single C function passed
to it as a pointer).

For example, the best program in generation zero for an edge detection
problem (#t means an edge) was

(< (divide-by-two (+ (* -1 (value (left point)))
(value point)))
(- 0 4))

The functions are

value: Evaluates an average value of the image near some point.
left: moves to the point to the left
divide-by-two: obvious

so this function catches vertical edges that jump at least 8 pixel
values from one pixel to the next (either as you go from right to left
or from left to right, I can't think clearly enough to see which is
right).

This function was partial-evaluated and cps'ed (but you can't see that
effect here), and embedded into a lambda-lifted loop as

(lambda ()
(let ((local-*arg* *arg*))
(let ol ((r 0) (t 0) (f 0) (local-*arg* local-*arg*))
(if (f< r 512)
(let ((tr-row (vector-ref *truth* r)))
(let l ((r r)
(t t)
(f f)
(local-*arg* local-*arg*)
(c 0)
(tr-row tr-row))
(define (general-result r t f local-*arg* c tr-row
result)
(if (vector-ref tr-row c)
(if (eq? result #t)
(l r t f local-*arg* (f+ c 1) tr-row)
(l r (f+ t 1) f local-*arg* (f+ c 1) tr-row))
(if (eq? result #f)
(l r t f local-*arg* (f+ c 1) tr-row)
(l r t (f+ f 1) local-*arg* (f+ c 1) tr-
row))))
(define (error-result r t f local-*arg* c tr-row)
(if (vector-ref tr-row c)
(l r (f+ t 1) f local-*arg* (f+ c 1) tr-row)
(l r t (f+ f 1) local-*arg* (f+ c 1) tr-row)))
(if (f< c 512)
(if (f< c 1)
(error-result r t f local-*arg* c tr-row)
(let ((v1 (lv 0 -1 0)) (v0 (lv 0 0 0)))
(general-result
r
t
f
local-*arg*
c
tr-row
(f< (fq (f+ (f* -1 v1) v0) 2) -4))))
(ol (f+ r 1) t f local-*arg*))))
(cons t f)))))


All these functions were compiled, yet compile time was a small part
of the total time. In Gambit we passed the option (using current
syntax)

-cc-options "-U___SINGLE_HOST"

so that each small thunk was compiled to its own separate module
instead of having Gambit compile all 2000 programs to one single
module in each generation. (At that time I did rewrite the code in
Gambit's compiler to calculate transitive closures to speed up the
entire compiler by a factor of 3 or 4 on these problems.) This speeds
up the Scheme->C transformation by the Gambit compiler and the C-
object code transformation by gcc.

So, I would recommend this approach of using program transformation
and compilation, even through C code, if each sample program is to be
evaluated on a lot of data (like hundreds of thousands or millions of
pixels in images).

There is a paper that discusses some of this at

http://www.math.purdue.edu/~lucier/692/

Brad
.



Relevant Pages

  • Re: How do you do this?
    ... Python can execute statements ... language while Scheme is designed to be compiled. ... lexical bindings are always resolvable at compile time. ...
    (comp.lang.scheme)
  • An experience with Xilinx 8.1.02i
    ... It is the first time to compile with 8.1.02i. ... Those signals couldn't be trimmed for a correct design. ... "Preserve Hierachy on Sub Module", ... After no is selected with selection "Preserve Hierachy on Sub Module", ...
    (comp.arch.fpga)
  • Re: Saving breakpoints
    ... I once considered to add a breakpoint ... Scheme 1: ... Const DebugNewCustomer As Boolean = True ... ' A compile time constant which is also set during testing ...
    (microsoft.public.vb.general.discussion)
  • Re: How do you do this?
    ... language while Scheme is designed to be compiled. ... lexical bindings are always resolvable at compile time. ... Common Lisp is designed just the same in this respect, ...
    (comp.lang.scheme)
  • Re: code critique?
    ... Now consider that most decent Scheme implementations involve compilers of some kind, whether they compile to some kind of bytecode, to C, to native code, or even just to a macro-free "core Scheme". ... If you use eval in such a Scheme implementation, depending on the implementation, it'll either interpret your code at runtime, or it'll compile the input code when eval is called, and then run it. ... seems to be differently viewed when using javascript vs. scheme. ...
    (comp.lang.scheme)