Re: Energy constraints was Re: Androids & Strong AI in the Future



"JGCASEY" <jgkjcasey@xxxxxxxxxxxx> wrote:
Curt Welch wrote:
Well, I believe the brain creates all these concepts
of "situated" from nothing more than undefined streams
of data. I believe that general intelligence can be
created. I believe that it is possible to create a
generic intelligence algorithm where you simply plug
sensory data into the generic inputs (no need to
connect vision to the vision input and smell to the
smell input), and all the concepts of being situated
in a 3D world etc will correctly emerge from the data.
I don't get the impression that many other people here
agree with this view.

To the extent that brains "evolved" then clearly most
people would believe that evolving intelligent systems
can develop ways to define "undefined streams of data"
and attempts have been made to build such machines.

And to the extent that "learning" can be seen as "learning
to define undefined streams of data" attempts are also
being made to build those kinds of machines.

But much of your talk is vague and lacks specifics including
what you mean by "undefined data" and what it would mean
to define it. Otherwise we have no way of measuring the
success or failure of "defining undefined streams of data".

John, this really isn't that hard.

I'm talking about generic reinforcement learning algorithms that have 3
types of data. There is input data (sensory data), reinforcement data
(pain and pleasure inputs), and there are outputs.

How you choose to represent the data is unspecified for the generic
specification of the reinforcement learning problem, but must be defined
for any specific implementation. I tend to only consider discrete
signaling systems. I currently like to work with asynchronous temporal
pulse signals, but synchronous binary signals are another option, as
are various analog signal formats.

When we talk about learning we have to specify what it
is that is being learned. Learning implies X learns Y.
We know X is our learning machine but what is Y? This has
to be made explicit.

I've spent 5 years here talking about one very specific type of learning -
reinforcement learning. How much more specific can I get for a problem
definition?

The problem is for the agent (your X) to learn which outputs (the third
signal type) produce the maximum total reward (as defined by the second
signal type, the reinforcement inputs).

You can test any potential agent against any potential environment simply
by running it for a fixed amount of time and totalling the reward the
machine received. You can compare different agents against different
environments to see which combinations perform the best. It's extremely
easy to quantify.
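
A minimal sketch of that test procedure, assuming a simple step-based
interface. The class names, the act/step methods, and the payoff values
are all hypothetical, chosen only for illustration. It runs an agent
against an environment for a fixed number of steps and totals the reward:

```python
import random


class TwoArmedBandit:
    """Hypothetical environment: action 1 pays 1.0, action 0 pays 0.2."""

    def reset(self):
        return 0  # a single, uninformative sensory value

    def step(self, action):
        reward = 1.0 if action == 1 else 0.2
        return 0, reward  # (next sensory input, reinforcement)


class RandomAgent:
    """Baseline that ignores both the sensory input and the reward."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, observation, reward):
        return self.rng.randrange(2)


class GreedyAgent:
    """Tries each action once, then repeats whichever paid best."""

    def __init__(self):
        self.estimates = {}      # action -> last reward seen for it
        self.last_action = None

    def act(self, observation, reward):
        if self.last_action is not None:
            self.estimates[self.last_action] = reward
        untried = [a for a in (0, 1) if a not in self.estimates]
        if untried:
            action = untried[0]
        else:
            action = max(self.estimates, key=self.estimates.get)
        self.last_action = action
        return action


def total_reward(agent, env, steps):
    """Run the agent for a fixed number of steps and total the reward."""
    observation, reward, total = env.reset(), 0.0, 0.0
    for _ in range(steps):
        action = agent.act(observation, reward)
        observation, reward = env.step(action)
        total += reward
    return total


if __name__ == "__main__":
    env = TwoArmedBandit()
    for agent in (RandomAgent(), GreedyAgent()):
        print(type(agent).__name__, total_reward(agent, env, steps=100))
```

The single number returned by total_reward is the whole score, so any
agent can be ranked against any other on any environment that exposes
the same three signals.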

I believe that there are strong generic reinforcement learning solutions
(algorithms) still to be found.

And I believe that the foundation of human intelligence is nothing more
than a strong generic reinforcement learning machine (implemented by our
neocortex).

The sensory data simply allows the agent to bias its choice of output
behaviors based on the context defined by the sensory inputs.

I believe the AI problem must be solved at this generic level - we must
find strong reinforcement learning algorithms that are data independent.
That is, the algorithm makes no assumptions about the meaning of the
sensory inputs or the outputs. The only "meaning" it understands is the
meaning of the reinforcement inputs - aka rewards.

These types of algorithms would allow you to connect any type of sensory
signal to any input, and any output to any type of external effector.
This is what makes the approach both generic and data independent.
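
A toy illustration of the data-independence property, under stated
assumptions: the environment, the table-lookup agent, and the wiring
parameter are all hypothetical, and a lookup table is nowhere near a
strong learning algorithm. The sketch shows only that an agent which
treats its inputs as opaque values earns the same total reward no
matter which sensor is plugged into which input:

```python
import random


class ContextEnv:
    """Hypothetical environment with two sensors: one cues which action
    pays off, the other is noise. `wiring` decides which agent input
    each sensor is plugged into."""

    def __init__(self, wiring, seed=0):
        self.wiring = wiring
        self.rng = random.Random(seed)
        self.cue = 0

    def observe(self):
        self.cue = self.rng.randrange(2)
        sensors = (self.cue, self.rng.randrange(2))
        return tuple(sensors[i] for i in self.wiring)

    def reward_for(self, action):
        return 1.0 if action == self.cue else 0.0


class OpaqueTableAgent:
    """Uses each observation as an opaque key; it never inspects which
    position holds which sensor."""

    def __init__(self, n_actions=2):
        self.n_actions = n_actions
        self.value = {}  # (observation, action) -> last reward seen

    def act(self, observation):
        # The optimistic default of 1.0 makes the agent try an untested
        # action before settling on one already known to pay nothing.
        scores = [self.value.get((observation, a), 1.0)
                  for a in range(self.n_actions)]
        return scores.index(max(scores))

    def learn(self, observation, action, reward):
        self.value[(observation, action)] = reward


def run(wiring, steps=200):
    """Total reward earned with the sensors plugged in as `wiring`."""
    env, agent, total = ContextEnv(wiring), OpaqueTableAgent(), 0.0
    for _ in range(steps):
        obs = env.observe()
        action = agent.act(obs)
        reward = env.reward_for(action)
        agent.learn(obs, action, reward)
        total += reward
    return total


if __name__ == "__main__":
    print(run(wiring=(0, 1)))  # cue sensor on input 0
    print(run(wiring=(1, 0)))  # cue sensor on input 1
```

Swapping the wiring only renames the keys in the agent's table, so both
runs make the same sequence of choices and collect the same reward.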

In the case of Hawkins's models, Y is
the "most likely sequence of a particular length of inputs".
The "reinforcement" in this case would be "confirmation"
of the resulting encapsulation of the high probability
sequences at each level of the hierarchy which is encoded
in a Bayesian network structure.

I would point out that the "most likely sequence" is
not learned as a result of reinforcement but as a result
of the way the program processes the input. Future
input will confirm or deny the probability values and
thus can be seen as adjusting them up or down but never
at any time is some external agency actively selecting
some random behavior as good or bad.

Any adjustment which increases the odds of the machine producing the same
behavior in the future is reinforcement, by the very definition of what
reinforcement is. And if a repeat of an "X" input after a guess of "X"
tends to make the agent more likely to guess "X" again in the future,
then the input had a reinforcing effect on the agent.
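
A small sketch of that point, assuming a hypothetical weighted guesser
(this is not Hawkins's model, just the simplest predictor that shows the
effect). No external judge labels anything good or bad; the input alone
confirms a guess, and the odds of repeating that guess go up:

```python
import random


class WeightedGuesser:
    """Hypothetical predictor that guesses each symbol with probability
    proportional to its weight."""

    def __init__(self, symbols, seed=0):
        self.weight = {s: 1.0 for s in symbols}
        self.rng = random.Random(seed)

    def probability(self, symbol):
        return self.weight[symbol] / sum(self.weight.values())

    def guess(self):
        symbols = list(self.weight)
        weights = [self.weight[s] for s in symbols]
        return self.rng.choices(symbols, weights=weights)[0]

    def observe(self, guessed, actual):
        # Confirmation: the next input matched the guess, so the guess
        # becomes more likely in the future. A mismatch changes nothing.
        if guessed == actual:
            self.weight[guessed] += 1.0


if __name__ == "__main__":
    g = WeightedGuesser("XY")
    before = g.probability("X")
    g.observe(guessed="X", actual="X")  # an "X" input after a guess of "X"
    after = g.probability("X")
    print(before, after)
```

Relabeled, the confirming input plays the role of the reward signal and
the guess plays the role of the output behavior.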

Most learning machines (maybe all) can be relabeled as reinforcement
learning systems. You just draw your circles differently and label the
parts differently. This is because the reinforcement learning framework
is probably the single most general way to specify the learning problem.

The behavior is
not random nor is it selected by the environment. The
environment simply supplies the input sequences for
the system to process according to its makeup.
You could say the environment selects the mechanisms
that do the processing by allowing them to survive or
not. In the case of Hawkins's machines they will
survive if they do what is promised of them.

--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/