Re: What did that thread indicate?



humiguel@xxxxxxx wrote:
> Post to: comp.ai.philosophy
> Subj: What did that thread indicate?
>
> curt@xxxxxxxx (Curt Welch) wrote:
>
> >humiguel@xxxxxxx wrote:
> >> curt@xxxxxxxx (Curt Welch) wrote:
> >
> >> >My ideas are exactly the same as what you wrote above. How silly can
> >> >that be?
> >>
> >> The silliness is in your "strategies for generating secondary
> >> behaviour". Judging only from our recent exchanges in another
> >> thread, I don't think that you have anything of substance to
> >> contribute in this matter. I called your bluff at least twice, as
> >> others have done in the past and you've chosen to ignore those
> >> challenges.
> >
> >I've not ignored any of your "challenges". I've answered every one as
> >far as I'm aware.
>
> Then your idea of meeting a challenge is different from mine. Let's
> try again, for the last time. Some time ago, you claimed that your
> system is capable of learning how to control a robotic arm so that
> it would be able to type a random message on a typewriter. You even
> claimed that it would be able to learn the controller language.

The code I have working does very little and nothing exciting (at least for
other people - it excites me :). I've never said otherwise.

I'd have to see the message you are referring to in order to remember what
I was really claiming in that post, but what I was not claiming is that I
have code that does that today. I was either claiming that 1) strong
general AI code must be able to do a task of that type (I talk a lot about
what I think general AI code should be able to do), or 2) that I _thought_
my approach with my type of net would _lead_ to code that could do that.

From my memory, I think the context in which I was talking about the arm was to
point out that the "copy problem" would not in fact have the machine
learning to copy data from input to output (like your SAY X problem for a
chat bot) but would in fact require the program to reproduce some effect in
the environment by producing different data on the output than what it was
sensing on the input.

> As a piece of code to simulate a robotic arm is a simple thing, I
> challenged you to hook that to your system and see how long it took
> to learn the above task.

I'm sorry. What do you want my net to do? You want to know how long my
type of net will take to learn to type a "random" message? So what, if it
makes a robotic arm hit one key on the keyboard, that counts as a "random
message" and the task is done? Surely you want it to do more than that.

> So now I ask again. How long does it take?

So I ask again, do what? Make the arm hit one key on a keyboard?

> If your answer doesn't
> include some numbers, don't even bother to post it

42.

Wouldn't want you to get upset.

> because it would
> just be a waste of time, yours and everybody else's. You do realize
> that in this situation anything short of a numeric answer is just
> water vapour and that everybody can produce their own?

Ok, so let's say the robotic arm is nothing but a lever which can go up and
down. If it goes down far enough it will hit a key. My network sends
pulse signals to two inputs on the arm controller. One input makes the arm
go up a small fixed amount for each pulse it receives. The other input
makes the arm go down by a small fixed amount. The arm starts off in a
position 10 units above the key. If the net sends it 10 more down pulses
than up pulses, the lever will move down far enough to hit the key and at
that point, your "type a random message" task will be done.

The net starts off generating random noise and after a short time (let's say
10 seconds) the lever ends up hitting the key.

So, there's your answer. 10 seconds.
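
A minimal sketch of that lever setup, for anyone who wants to play with it.
This is not my net, just a coin-flip noise source standing in for it. The
function name, the lack of any upper limit on the lever, and the pulse cap are
all assumptions made for illustration.

```python
import random

def pulses_until_keypress(start_height=10, seed=None, max_pulses=1_000_000):
    """Count random up/down pulses until the lever reaches the key.

    Returns the number of pulses sent, or None if max_pulses runs out first.
    Assumes one unit of travel per pulse and no upper stop on the lever.
    """
    rng = random.Random(seed)
    height = start_height
    for n in range(1, max_pulses + 1):
        # Pure noise: each pulse is equally likely to be "up" or "down".
        height += rng.choice((-1, 1))
        if height <= 0:
            return n  # the lever has reached the key
    return None

if __name__ == "__main__":
    print(pulses_until_keypress(seed=42))
```

Dividing the pulse count by whatever pulse rate you assume gives a time. The
count swings wildly from run to run, which is the point: with no guidance from
the environment, "how long" is mostly luck.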

And the point of this was what again?

If you make a challenge that's a bit more interesting, my current net might
in fact already be able to do it. So let me know what you were really
thinking about with your challenge and I might even write the code to show
you what it can do.

But, let me give you a lesson on reinforcement learning problems so you
know enough to ask an intelligent question next time.

A machine that learns by reinforcement always has some fixed set of
behaviors that guide its actions. Reinforcement creates new behaviors
only by shaping old behaviors.

How long the machine will take to learn some new behavior is a function of
how much guidance the machine gets from the environment to transform the
current behaviors into the new desired behavior. The less guidance, the
longer it will take before the machine happens, on its own, to create some
unique new behavior by chance.

For example, in the theme of "random messages": if you want to train the
machine to type Shakespeare, you could use a trainer which would reinforce
one letter at a time. So, every letter it got right would be reinforced.
With a problem like that, a learning machine would quickly pick up one
letter at a time and be typing a book in no time at all.

But, on the other hand, you could use a trainer which gave it no
reinforcement until it managed to type the entire thing correctly.

Now, the first example would in effect be led by the hand to the correct
final behavior, shaping one small part of the entire behavior set at a time,
and it would learn the entire behavior in a reasonable amount of time.

The second example would take billions and billions of years before it
would get it right the first time, and would most likely never learn the
task, because its short-term learning memory would be too short to correlate
the reward it got with the keystrokes that happened at the beginning of the
book.

So, with reinforcement learning, it's trivial to define problems so hard
that no machine (or human) could ever hope to learn them, and trivial to
define problems so easy that they don't even look like problems and would
be learned in seconds by even a very dumb learning machine.
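
The gap between the two trainers can be put in rough numbers. The sketch
below is not my net. It is a deliberately dumb learner that guesses keys at
random, and the 27-symbol alphabet and both function names are assumptions
made for illustration. It also ignores the short-term memory problem, so the
second trainer looks better here than it would in practice.

```python
import random
import string

ALPHABET = string.ascii_lowercase + " "  # 27 symbols, an assumed keyboard

def keystrokes_with_per_letter_reward(target, seed=None):
    """Keystrokes needed when the trainer reinforces each correct letter."""
    if any(ch not in ALPHABET for ch in target):
        raise ValueError("target contains a symbol outside ALPHABET")
    rng = random.Random(seed)
    keystrokes = 0
    for wanted in target:
        # Guess at random until the trainer rewards this position,
        # then keep that behavior and move on to the next letter.
        while True:
            keystrokes += 1
            if rng.choice(ALPHABET) == wanted:
                break
    return keystrokes

def expected_attempts_with_final_reward_only(target):
    """Expected whole-message attempts when only a perfect copy is rewarded."""
    # Each blind attempt succeeds with probability (1/27) ** len(target).
    return len(ALPHABET) ** len(target)

if __name__ == "__main__":
    line = "to be or not to be"
    print(keystrokes_with_per_letter_reward(line, seed=0))
    print(expected_attempts_with_final_reward_only(line))
```

For an 18-character line the first trainer needs on the order of a few hundred
keystrokes (27 per letter on average), while the second needs 27**18 full
attempts on average, which is on the order of 10**25. That is for one line,
not a whole book.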

So, what is it exactly you want my net to learn to do with the robotic arm?

> Antonio Esteves

--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/