Re: What did that thread indicate?
- From: curt@xxxxxxxx (Curt Welch)
- Date: 26 Sep 2005 01:07:04 GMT
Traveler <traveler@xxxxxxxxxx> wrote:
> On 24 Sep 2005 16:15:01 GMT, curt@xxxxxxxx (Curt Welch) wrote:
>
> >Traveler <traveler@xxxxxxxxxx> wrote:
> >> On 23 Sep 2005 23:52:01 GMT, curt@xxxxxxxx (Curt Welch) wrote:
> >>
> >> >Traveler <traveler@xxxxxxxxxx> wrote:
> >> >> On 22 Sep 2005 16:41:03 -0700, humiguel@xxxxxxx wrote:
>
> [cut]
>
> >> Yes, a pattern is just a concept. However, I disagree with current
> >> approach to pattern recognition that uses a strictly feed-forward,
> >> pyramid-type hierarchical network. As I've mentioned in the past, the
> >> biological and psychological evidence refutes this approach. Concept
> >> formation is a top-down process, IMO.
> >
> >It's still very unclear to me how much power my feed forward network
> >has.
>
> By feed-forward, I had the hierarchy in mind, not the connections.
> Hawkins' approach is a progressive pyramid that gets more abstract as
> you go up the levels in the tree and ultimately gives you a
> grandmother-type cell at the top. I think it's nonsense. For one, one
> never knows how many levels a grandmother-type cell will require.
I look at the hierarchy as simply a mapping from sensory signals to motor
signals. You can pick any signal in the middle of the mapping, and look
backwards at the sensory signals which define it, in which case you see a
"grandmother-like" signal - that is, the signal has some "meaning" defined
by all the sensory signals that it's a function of (its receptive field, to
use a term from the study of the vision system).
These are all just hidden layers in the complex mapping from sensory to
motor functions. The signals which are closer to the sensory side of the
mapping are more easily seen as sensory data and less "abstract" because
they have a small, and simple mapping from the sensory data. The further
into the network you get, the larger the receptive field becomes, and the
more complex the mapping from the sensory data to this middle-term signal.
But inversely, you can look at the effector field of any mid-term signal by
looking at which motor outputs can be affected by the signal. Near the
sensory side, the effector foot-print of the signal is very large and
complex. It's a very abstract motor signal at that point. But the closer
you get to the effector (output) side of the network, the simpler the
mapping to the motor signals becomes, and the less abstract it becomes.
So as you look at all these hidden layer signals, they simply transform
from concrete sensory signals when they enter into concrete motor signals
when they exit. It's not a single pyramid; instead, there are always two
pyramids formed for each mid-level signal - one working backwards to the
sensory signals that define it, and another working forwards to all the
motor signals it can affect.
To see it as only one pyramid would be missing half the picture. It's two
overlapping pyramids extending in opposite directions with no signals or
nodes or cells "at the top". It's a data transformation network.
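The two-pyramid picture can be made concrete with a toy graph. The sketch below is only an illustration: the node names, the link list, and the helpers `receptive_field` and `effector_field` are invented for this example and are not taken from any actual network.

```python
from collections import defaultdict

def reachable(start, edges):
    """Return every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Hypothetical toy network: sensors s1..s4, hidden h1..h3, motors m1..m3.
links = [("s1", "h1"), ("s2", "h1"), ("s3", "h2"), ("s4", "h2"),
         ("h1", "h3"), ("h2", "h3"), ("h1", "m1"), ("h3", "m2"), ("h3", "m3")]

forward, backward = defaultdict(list), defaultdict(list)
for src, dst in links:
    forward[src].append(dst)
    backward[dst].append(src)

def receptive_field(node):
    """Sensory signals that the node is a function of (backward pyramid)."""
    return sorted(n for n in reachable(node, backward) if n.startswith("s"))

def effector_field(node):
    """Motor signals that the node can affect (forward pyramid)."""
    return sorted(n for n in reachable(node, forward) if n.startswith("m"))

for hidden in ("h1", "h3"):
    print(hidden, receptive_field(hidden), effector_field(hidden))
```

In this toy graph, h1 sits near the sensory side and has a small receptive field but reaches every motor output, while h3 sits deeper and has a larger receptive field and a smaller effector field.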
> Second, response time is critical. The system cannot wait for signals
> to traverse so many levels. The connectivity diameter of the brain is
> six neurons or less.
The brain uses very slow acting neurons. To keep reaction time low, it
must limit the path length. It makes up for that by using a very large
number of cross connects (10,000 at times). I'm building signal processing
nodes that run 3 to 6 orders of magnitude faster than the brain. So I use
nodes with a fan-in and fan-out of only 2 and make up for the lack of
complexity by using far deeper networks with more switching nodes. But I
can afford the deeper networks and still maintain a fast reaction time
because the nodes run so much faster. So where the brain is limited to a
six-neuron path, I can use path lengths of 6,000 if need be and react just
as fast (or use the speed to avoid having to build parallel hardware).
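The arithmetic behind that trade can be sketched as follows. The 1 ms neuron step time is an assumed round figure chosen only to make the numbers easy to follow, and `path_latency_us` is a hypothetical helper; only the 6-step path, the 6,000-step path, and the 3-orders-of-magnitude speedup come from the text above.

```python
def path_latency_us(path_length, step_time_us):
    """Total delay along a path, in microseconds."""
    return path_length * step_time_us

NEURON_STEP_US = 1000                   # assumed: about 1 ms per neuron step
NODE_STEP_US = NEURON_STEP_US // 1000   # a node 3 orders of magnitude faster

brain_latency = path_latency_us(6, NEURON_STEP_US)
deep_latency = path_latency_us(6000, NODE_STEP_US)

# Both paths come out to the same 6000 microseconds end to end.
print(brain_latency, deep_latency)
```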
If each node has inputs from 10,000 other signals, then each step through
the network has the potential to add great complexity to the sensory foot
print. With only two inputs per level, it requires more like log2(10,000)
levels to add the same functional complexity to the sensory foot-print of
the signal.
So, for whatever type of complex mid-term "grandmother" signal that needs
to be created for mapping sensory signals to effector signals, the 10,000
input brain can do more complex signal processing in each step, whereas
the network with fewer inputs per level has to create more middle-terms.
It's just like creating the sum A+B+C+D: if you have 4-input summation
gates, you can do it with one gate, but if you only have two-input gates,
you have to use 3 gates and do (A+B) + (C+D).
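The gate-count and depth trade-off can be checked with a small calculation. This is a sketch under the assumption that the inputs are combined in a balanced tree; `tree_stats` is a hypothetical helper, not part of any network described here.

```python
def tree_stats(n_inputs, fan_in):
    """Gates and levels needed to combine n_inputs using fan_in-input gates.

    Inputs are grouped level by level; a lone leftover signal is passed up
    to the next level without using a gate.
    """
    gates, levels, width = 0, 0, n_inputs
    while width > 1:
        full, leftover = divmod(width, fan_in)
        gates += full + (1 if leftover > 1 else 0)
        width = full + (1 if leftover > 0 else 0)
        levels += 1
    return gates, levels

print(tree_stats(4, 4))       # one 4-input gate, one level
print(tree_stats(4, 2))       # three 2-input gates, two levels: (A+B) + (C+D)
print(tree_stats(10000, 2))   # 14 levels, since log2(10,000) is about 13.3
```

With a fan-in of 2 the depth grows as the ceiling of log2 of the input count, which is where the "more like log2(10,000) levels" figure comes from.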
> I agree that there is a temporal hierarchy in the sensory cortex but
> it's limited. You don't get grandmother-type recognition in the visual
> cortex. All you get are simple fixed time scale recognition such as
> edges, lines, etc... It's really a signal separation/classification
> process: if A arrives after B, it goes down this path but if it
> arrives after C, it goes down this other path.
Well, that's exactly what my nodes do. They sort signals depending on what
time they show up. If they show up before the event, they go one way, and
after the event, they go the other. The entire network is based on that
simple primitive.
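A minimal sketch of that sorting primitive, for illustration only. The post does not say what "the event" is or how a node learns it, so here it is simply a given timestamp, and `sort_pulses` is a hypothetical name.

```python
def sort_pulses(pulse_times, event_time):
    """Split pulses into those arriving before the event and those after.

    Assumption: a pulse arriving exactly at the event time counts as "after".
    """
    before = [t for t in pulse_times if t < event_time]
    after = [t for t in pulse_times if t >= event_time]
    return before, after

# Pulses at times 1, 4, 7 and 9, sorted around an event at time 5.
early_path, late_path = sort_pulses([1, 4, 7, 9], event_time=5)
print(early_path, late_path)
```

Chaining such nodes, each sorting around a different event, would split a signal into progressively finer temporal classes.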
I believe that's the only primitive you need for mapping sensory data all
the way to effector signals. It's not something that only happens in the
"sensory cortex".
> Signals from the sensory cortex feed directly into sequence memory
> (varying time scale correlations)
I'm not sure what your "sequence memory" system does, but my network in
effect molds itself to the temporal characteristics of the data. So it in
effect has "sequence memory" inherent in its operation. But instead of
trying to build hardware that will "remember" a data sequence, my network
only has one purpose - to produce the right output for the current state of
the entire environment. Since the network has "memory" of what has
recently happened in the past, and produces all outputs based on what has
happened recently, it can produce any sequence needed, based on any past
sequence that has happened, even without a specific module to "store a
sequence".
> which, in turn, sends its signals
> to the motor layer. There is very little time to waste. Concept
> formation consists of organizing sequence memory into coherent groups
> of related sequences. Thus the concept formation and attention
> mechanism sits on top of memory. It can activate and deactivate groups
> of sequences using a timing principle that I am still trying to figure
> out. In my scheme, concept cells are grandmother-type cells. They do
> not generate behavior. They control it.
Your system has always struck me as overly complex. I don't see anything
you are doing in your 6 or so modules that I'm not doing in one network.
But I know that you also think I'm missing 5 or so modules. :)
> [BTW, sequence memory stores intervals which are used for prediction.]
>
> >It's clear feedback is needed for multiple reasons (Dan loves the idea
> >of lots and lots of feedback for image recognition), but it's unclear
> >how many different ways it might be implemented.
>
> Dan is right on this issue. There is a need for massive feedback in
> sensory processing. Once you realize what sensory processing is for,
> then everything falls into place, IMO. It's all about signal
> separation. Multiple signals in an input fiber are separated and sent
> down different paths according to their temporal correlations with
> other signals in other fibers. The correlation is a simple 10 ms
> contiguity in the human brain and the factor that I use is 10 to 1.
Yeah, well, I agree it's all about signal separation, but it's also all
about signal combination - which is why I do both in every node of my
network. Whereas you seem to separate and classify in one place, store in
another, and combine somewhere else, I do it all inside each node. Each
node in effect takes two input signals, combines them, analyzes them, and
classifies them into two different output signals. It uses temporal memory
to do the classification/splitting task.
> > My training system (because it's a
> >reinforcement learning system), is a strong feedback system. So even
> >though the data is feedforward, all the training happens in just the
> >opposite direction (outputs back to inputs) as a feedback loop. Also,
> >because my network is temporal, it naturally has feedback effects
> >without feedback paths that non-temporal nets can only get with the help
> >of feedback paths.
>
> Well, yes. Training signals go in the opposite direction of sensory
> signals.
>
> [cut]
>
> >But at the same time, I think other types of actual data feedback paths
> >are required for pattern generation and I've not yet experimented with
> >that. It might turn out that the same type of feedback is needed just to
> >do pattern recognition correctly.
>
> There is a need to use feedback in sensory processing for finding
> fixed time scale correlations. This is important for signal
> separation/classification, which is the only purpose of sensory
> processing, IMO.
Which sounds just like what each of my nodes is doing.
Even though our approaches are somewhat different, I think our nets are just
kinda inside-out versions of the same thing.
> >And the act of "thinking to ourselves - aka private thoughts" is clearly
> >a large internal feedback system of some type at work.
> >
> >So even though I strongly believe all this will turn out to be the same
> >problem in most ways (pattern recognition, concept formation, behavior),
> >I'm not as sure if my type of network has what it takes to solve the
> >problem or not - I only believe it's a big step in the right direction.
>
> Well, at least it's not GOFAI.
>
> [cut]
>
> >My network learns how to get the timing right for all behavior using
> >reinforcement learning.
>
> Reinforcement learning is strictly based on pain and pleasure stimuli.
> There is a shitload of motor learning taking place that does not use pain
> and pleasure as corrective signals. Something else is used. It's
> called motor conflict detection. You can call it reinforcement
> learning, if you want, but that is not the conventional meaning of the
> term.
All my conflict resolution is built into the design of my network by the
fact I never split a signal. I started with a foundation where conflict
was impossible so that prevented me from having to build active "conflict
resolution" technology into the design. I suspect the brain is doing what
you do - solving it by fixing the problem after the fact with a conflict
resolution system.
> [cut]
>
> >But, what happens if this system is being punished because the behavior
> >showed up late? How does such a system learn that it must do the
> >behavior sooner in order to be rewarded? A system which always tends to
> >delay the behavior when it is punished is going to have a very hard time
> >learning it needs to do just the opposite. So the "delay when punished"
> >option has serious problems.
> >
> >So the question is, how do you "learn" the proper setting of a continuous
> >value, such as a timing event, with reinforcement learning?
>
> IMO, your system lacks something essential: the ability to anticipate
> the future, not only of pain and pleasure stimuli, but also of normal
> events.
What does your system actually do with your "predictions" and why is it
important in your design?
I believe the only thing that is important is that the low-level system can
predict what it needs to do now to prevent future problems. It doesn't in
fact need to predict the future. And that's what my type of network uses
its value/reward prediction system for. We, for example, don't need to know
that we are going to be burned to a crisp if we open the door; we only need
to know that leaving the door shut is a far better thing to do right now
than opening it.
A machine guided by a value system like this can look externally as if it
were predicting the future (look, it "knows" it will get burned so it
didn't open the door). But in fact, all the hardware knew was how
dangerous it was to open the door and not why it was dangerous. I think
this is the only "future prediction" system, and the correct one, that
needs to be built into the low-level hardware.
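The door example can be sketched as a lookup of learned values. This is a generic running-average value table written for illustration, not the value/reward prediction system of the network discussed here; the class `ValueTable`, the state and action labels, the learning rate, and the reward numbers are all assumptions.

```python
class ValueTable:
    """Stores how good each action has been in each state, and nothing else."""

    def __init__(self, learning_rate=0.5):
        self.values = {}
        self.learning_rate = learning_rate

    def update(self, state, action, reward):
        """Move the stored value a fraction of the way toward the reward."""
        key = (state, action)
        old = self.values.get(key, 0.0)
        self.values[key] = old + self.learning_rate * (reward - old)

    def best_action(self, state, actions):
        """Pick the action with the highest stored value; unseen ones count as 0."""
        return max(actions, key=lambda a: self.values.get((state, a), 0.0))

table = ValueTable()
table.update("hot door", "open", reward=-10.0)       # got burned once
table.update("hot door", "leave shut", reward=0.0)   # nothing bad happened

choice = table.best_action("hot door", ["open", "leave shut"])
print(choice)
```

The table holds only a number per state and action. Nothing in it records that fire was the reason, yet from the outside the choice looks like a prediction of getting burned.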
Now, we all know that we can have conscious thoughts about the future. For
example, we might be about ready to open a door and think "I'm going to get
burned if I open that" - perhaps without even realizing why that thought
just popped into our heads. And of course we have no problem planning our
future which shows we know a lot about what might happen in the future and
how we might change it (if I don't water the grass, it's going to die in a
week - if I don't get this job done, I'm not going to get paid).
However, even though all that "future prediction" which we do consciously
is very important to our ability to act intelligently, I don't think any of
that is a result of some "future prediction" hardware in us. It's all just
hardware which is able to remember past events. It's a simple memory
recall device under our active control. We can in effect ask it questions
like, "what will happen if I open this door", and it will answer, "fire and
pain!". We don't ask it questions with words however, we ask it by acting
out a behavior in our head, and our brain, in turn, predicts what it
expects to see happen in response to our action. So it's a temporal
prediction network.
So, to get a system which has our conscious-level powers to predict the
future, what we really need is a system that is trained from experience, to
predict what will happen next, which is also under our active motor control
- so we can "learn" to use it as required.
I believe the same type of network which is used to always create the "right
behavior" at the "right time" is the exact same system that can be used to
predict "what will happen", given a partial recreation of the state of the
environment. This is because the behavior system must have the power to
pick the behavior which best fits the current environment. That means that
partial data doesn't stop the network from knowing what to do, so that even
though we only see half a cat, we will still act as if the whole cat was
there.
And I believe that all this stuff people talk about as "sensory processing"
or "pattern recognition" is just more behavior generation. We don't for
example have a grandmother cell just because we have seen a lot of
grandmothers. We have it because it was proven to be an important
middle-term in the problem of mapping sensory data to effector data. And
as such, the middle-term or hidden layer signals are no different from the
effector signals; they are produced by the exact same types of hardware as
that which ends up producing the final motor signals. They are just one
step of 6 (or one step of 100) in the process.
So, when you feed this same network partial sensory data, it's still going
to pick the best "behavior" it can for the given data, including the
"best" middle terms.
And, for this type of network to control complex behavior, you must also
feed the outputs back to the network, to allow the past outputs to be part
of the function of the next outputs, along with new sensory data. So, if
this type of network then has the ability (at some level of the network) to
generate outputs that get blocked from actually being sent to the arms
and legs, but not blocked from being sent back to the inputs to affect
what happens next, then it can "act out" behaviors without moving the arms
and legs. And just sending back what the network "thought" you had done in
the environment will cause the same network to produce its "what's next"
answers based on what it thinks is the best thing to do next - which
includes the "what's next" answers of all the middle-terms, i.e.
"grandmother cells".
So, if you act out opening the door, the network might respond to that with
the prediction of a "grandmother" showing up because that's what the
network has selected as the most likely "best" thing to do at this time
with the middle-term signal on the way to creating the next motor/effector
outputs.
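The "act it out without moving" idea can be sketched with an output that is always fed back as input and only optionally passed on to the motors. Everything here is hypothetical and chosen for illustration: the `Agent` class, the dictionary standing in for a trained network, and the behavior labels.

```python
class Agent:
    """Toy stand-in for a network whose last output is part of its next input."""

    def __init__(self, policy):
        self.policy = policy        # maps (sensory, last_output) -> next output
        self.last_output = None
        self.sent_to_motors = []

    def step(self, sensory, motor_gate_open=True):
        output = self.policy.get((sensory, self.last_output), "idle")
        if motor_gate_open:
            self.sent_to_motors.append(output)   # actually moves arms and legs
        self.last_output = output                # fed back regardless of the gate
        return output

# Assumed, hand-written "learned" mapping for the door example.
policy = {
    ("door", None): "reach for knob",
    ("door", "reach for knob"): "open door",
    ("door", "open door"): "expect grandmother",
}

agent = Agent(policy)
imagined = [agent.step("door", motor_gate_open=False) for _ in range(3)]
print(imagined)
print(agent.sent_to_motors)
```

With the gate closed, the same mapping that would drive the real behavior runs through the whole sequence and arrives at its "what's next" answer, while nothing reaches the motors.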
So, all this adds up to me believing that a network whose fundamental
purpose is to create the next best behavior is all that is needed to
create what we like to call our "memory function" which is what we use in
order to act-out potential future events and in doing so, predict the
results of different actions without having to do them.
And at the low level, the correct way to pick behaviors is not to actually
predict what will happen in the future and do some sort of silly,
computationally intensive "search" of options, but instead to build a value
prediction system which evaluates the worth of different behaviors, so the
low-level hardware is always selecting the best behavior now based on its
knowledge of past experience with this behavior.
So, as I asked above, what type of prediction of the future does your
hardware make and what does it do with its prediction?
> >> The final
> >> solution will be simple and easy to implement, I'm sure, but finding
> >> it is like searching for the proverbial needle in the haystack.
> >
> >Yeah, it is. for sure.
> >
> [cut]
>
> >The solution was found with the design of my current network.
>
> [cut]
>
> Kurt, I read your post and I don't believe you have the solution.
Yeah, that's because we all believe we are on the right, and only path, so
anyone on a different path is just missing something. :)
> There are a few things missing in your scheme. But you know something
> essential that a generation of GOFAI crackpots (Minsky et al)
> completely ignored for fifty years: it's all about timing.
--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/