Re: Gradual Learning, not Reinforcement Learning
- From: "Jim Bromer" <jbromer@xxxxxxx>
- Date: 16 Jul 2006 08:15:32 -0700
J.A. Legris wrote:
OK, let's get started. What is gradual learning, and under what
circumstances does it arise?
--
Joe Legris
The classical example of logical reasoning is,
All men are mortal.
Socrates is a man.
Therefore, we know -by form- that Socrates is mortal.
This concept of form was also used in the development of algebra where
we know facts like,
2a + 2a = 4a
if a is any real number. So, for example, we know -by form- that if
a=3 then 2*3+2*3=4*3.
One of the GOFAI models used categories and logic in order to create
logical conclusions for new information based on previously stored
information. In a few cases this model produced good results even for
some novel examples. But it also produced a lot of incorrect results.
I wondered why this GOFAI model did not work better more
often. One of the reasons I discovered is that we learn gradually, so
that by the time we are capable of realizing that the philosopher is
mortal just because he is a man and all men are mortal, we also know a
huge amount of other information that is relevant to this problem. The
child learns about mortality in dozens of ways if not hundreds or even
thousands of ways before he is capable of realizing that since all men
are mortal, then Socrates must also be mortal.
I realized that this kind of logical reasoning can be likened to
instant learning. If you learn that Ed is a man, then you also
instantly know that Ed must be mortal as well. This is indeed a valid
process, and I feel that it is an important aspect of intelligence.
But before we get to the stage where we can derive an insight through
previously learned information and have some capability to judge the
value of that derived insight, we have to learn a great many related
pieces of knowledge. So my argument here is that while instant
derivations are an important part of Artificial Intelligence, we also
need to be able to use more gradual learning methods to produce the
prerequisite background information so that derived insights can be
used more effectively.
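
To make the "instant" side of this concrete, here is a minimal Python sketch of derivation by form. The fact and rule representation (pairs like ("man", "Ed") and a dictionary of rules) is my own assumption for illustration, not a description of any particular GOFAI system.

```python
# Hypothetical representation: a fact is a (predicate, subject) pair,
# and a rule maps one predicate to the predicate it implies.
RULES = {"man": "mortal"}  # "All men are mortal."

def derive(facts, rules=RULES):
    """Return the given facts plus everything that follows from the rules."""
    known = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be added
        changed = False
        for predicate, subject in list(known):
            implied = rules.get(predicate)
            if implied and (implied, subject) not in known:
                known.add((implied, subject))
                changed = True
    return known

# Learning that Ed is a man immediately yields that Ed is mortal.
facts = derive({("man", "Socrates"), ("man", "Ed")})
```

The derivation costs nothing once the rule is in place, which is the point: all of the hard work is in acquiring the rule and the background needed to judge its conclusions.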
Gradual learning is an important part of this process. We first learn
about things in piecemeal fashion before we can put more complicated
ideas together. I would say that reinforcement learning is a form of
gradual learning, but there are a great many other methods of gradual
learning available to the computer programmer.
It's hard for most people to understand me (or for that matter even
to believe me) when I try to describe how adaptive AI learning might
take place without predefined variable-data references. So it is much
easier for me to use some kind of explicit data-variable model to try
to talk about my ideas.
Imagine a complicated production process that had all kinds of sensors
and alarms. You might imagine a refinery or something like that.
However, since I don't know too much about material processes, I
wouldn't try to simulate something like that but I would instead
create a computer model that used algorithms to produce streams of data
to represent the data produced by an array of sensors. Under a number
of different situations, alarms would go off when certain combinations
of sensor threshold values were hit. This computer generated model
would be put through thousands of different runs using different
initial input parameters so that it would produce a wide range of data
streams through the virtual sensors. It would then be the job of the
AI module to try to predict which alarms would be triggered and when
they would be triggered before the event occurred. The algorithms that
produced the alarms could be varied and complicated. For example, if
sensor line 3 and sensor line 4 go beyond some threshold values for at
least 5 units of time, then alarm 23 would be triggered unless line 6
dipped below some threshold value at least two times in the 10 units of
time before. There might be hundreds of such alarm scenarios.
Individual sensor lines might be involved in a number of different
alarm scenarios. An alarm might, for another example, be triggered if
the average value of all the sensor inputs was within some specified
range. The specified triggers for some alarms might change from run to
run, or even during a run. Some of these scenarios would be simple,
and some might be very complex. Some scenarios might even be triggered
by non-sensed events. The range of possibilities, even within this
very constrained data-event model, is tremendous if not truly infinite.
The AI module might be exposed to a number of runs that produced very
similar sensor values, or it might be exposed to very few runs that
produced similar data streams.
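
The alarm 23 example above can be sketched in Python roughly as follows. Everything specific here is an assumption made for illustration: the names make_run and alarm_23, the threshold values, the use of uniform random numbers as sensor data, and the reading of "dipped" as a downward crossing of the low threshold.

```python
import random

def make_run(n_sensors=8, n_steps=200, seed=0):
    """Generate one run: a list of time steps, each a list of sensor values."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_sensors)] for _ in range(n_steps)]

def alarm_23(run, t, high=0.5, low=0.2, hold=5, lookback=10):
    """True if alarm 23 fires at time step t (thresholds are assumed values)."""
    if t + 1 < hold:
        return False
    start = t - hold + 1
    # Lines 3 and 4 must both stay above the threshold for `hold` steps.
    window = run[start : t + 1]
    if not all(row[3] > high and row[4] > high for row in window):
        return False
    # Count separate dips of line 6 in the `lookback` steps before the window.
    dips = 0
    below = False
    for row in run[max(0, start - lookback) : start]:
        now_below = row[6] < low
        if now_below and not below:
            dips += 1
        below = now_below
    return dips < 2  # two or more dips inhibit the alarm

run = make_run(seed=1)
fired = [t for t in range(len(run)) if alarm_23(run, t)]
```

A real generator would use structured processes rather than independent random numbers, and there would be hundreds of rules like this one, some sharing sensor lines.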
Superficially this might look a little like a reinforcement scenario
since the alarms could be seen as negative reinforcements, but it
clearly is not a proper model for behaviorist conditioning. The only
innate 'behavior' that the AI module is programmed to produce is to
try to develop conjectures to predict the data events that could
trigger the various alarms.
I argue that since simplistic assessments of the runs would not work
for every kind of alarm scenario, the program should start out with
gradual learning in order to reduce the false positives where it
predicted an alarm event that did not subsequently occur.
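
One very simple way to picture gradual learning in this setting is a conjecture whose confidence moves a little with each checked prediction, and which is only trusted once that confidence is high. This is just a sketch under my own assumptions (the Conjecture class, the update rate, and the trust threshold are all hypothetical), and it is only one of many possible gradual methods.

```python
class Conjecture:
    """A predictive conjecture whose confidence is learned gradually."""

    def __init__(self, name, rate=0.1):
        self.name = name
        self.rate = rate        # assumed step size for each update
        self.confidence = 0.0   # starts out untrusted

    def update(self, predicted, alarm_occurred):
        """Nudge confidence toward 1 on a hit and toward 0 on a false positive."""
        if not predicted:
            return  # this sketch only scores the predictions it actually made
        target = 1.0 if alarm_occurred else 0.0
        self.confidence += self.rate * (target - self.confidence)

    def trusted(self, threshold=0.8):
        """Only act on the conjecture once it has earned enough confidence."""
        return self.confidence >= threshold

c = Conjecture("lines 3 and 4 high -> alarm 23")
for _ in range(5):
    c.update(predicted=True, alarm_occurred=True)
early = c.trusted()   # still too few confirmations
for _ in range(25):
    c.update(predicted=True, alarm_occurred=True)
later = c.trusted()   # confidence has built up gradually
```

Because a conjecture has to be confirmed many times before it is acted on, a lucky early match does not turn into a stream of false alarms.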
This model might have hundreds or thousands of sensors. It might have
hundreds of alarms. It might have a variety of combinations of data
events that could cause or inhibit an alarm. Non-sensed data events
might interact with the sensory data events to trigger or inhibit an
alarm. Furthermore, the AI module might be able to mitigate or operate
the data events that drive the sensors so that it could run interactive
experiments to test its conjectures.
I have described a complex model where an imagined AI module would have
to make conjectures about the data events that triggered an alarm.
Offhand I cannot think of any one learning method that would be best for
this problem. So lacking that wisdom I would suggest that the program
might run hundreds or even thousands of different learning methods in
an effort to discover predictive conjectures that would have a high
correlation with actual alarms. This is a complex model problem which
does not lend itself to a single simplistic AI paradigm. I contend
that the use of hundreds or maybe even thousands of learning mechanisms
is going to be a necessary component of innovative AI paradigms in the
near future. And it seems reasonable to assume that initial learning is
typically going to be a gradual process in such complex scenarios.
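
As a rough Python sketch of running many methods and keeping the ones that correlate with actual alarms: the three toy "methods" below, the sample data, and the function names are all made up for illustration, and for simplicity each prediction is scored against the alarm at the same time step rather than ahead of it.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def rank_methods(methods, sensor_rows, alarms):
    """Score every method against the actual alarms, best first."""
    scores = {}
    for name, predict in methods.items():
        predictions = [1 if predict(row) else 0 for row in sensor_rows]
        scores[name] = correlation(predictions, alarms)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical data: four time steps of two sensor lines, and the alarm record.
sensor_rows = [[0.1, 0.9], [0.8, 0.2], [0.9, 0.7], [0.2, 0.1]]
alarms = [0, 1, 1, 0]
methods = {
    "line0_high": lambda row: row[0] > 0.5,
    "line1_high": lambda row: row[1] > 0.5,
    "always": lambda row: True,
}
ranked = rank_methods(methods, sensor_rows, alarms)
```

With hundreds or thousands of entries in the methods table, the ranking step stays the same; what changes is how the candidate methods are generated and refined.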
I will try to finish this in the next few days so that I can describe
some of the different methods to produce conjectures that might be made
in this setting and to try to show how some of these methods could be
seen as making instant conjectures while others could be seen as
examples of gradual learning.
Jim Bromer