Re: What did that thread indicate?



"feedbackdroids" <feedbackdroids@xxxxxxxxx> wrote:
> Curt Welch wrote:

> > General AI requires a data extraction system that learns to adapt to
> > the nature of the data it is given - no matter what content is in the
> > data or how it is encoded. Until someone figures out how to write one
> > algorithm, that can do both image recognition, and sound recognition,
> > without hard-coding the solution for each problem, then we will never
> > have the basis of real AI.
>
> This may be so, but it's more opinion than something that is known to
> be true. So far, the only real existence proof we have is the brain,
> and it seems to have taken a different course [the one I've been
> talking about] during evolution.

There is little evidence that the brain has 30 hard-wired vision modules.
If that were so, then someone could identify them with a microscope, just
like you can see the difference between the different parts of the brain
with your eyes. But they can't. In order to find all these areas, you have
to monitor the signals created in the cortex. When they do surgery, they
have to ask the patient to help them identify areas of the brain because
you can't find them by looking at them.

If all these areas were in fact hard-wired, they would be as different as
the neocortex is different from the cerebellum. The neocortex has one
name for the entire structure, and then a "map" of where different types of
signals can "normally" be found in this structure. This is not evidence
that evolution designed all these different modules like a human would
design the different chips in a computer. It's evidence that it is one type
of module, like memory in a computer is one type of module. If you were to
monitor the different parts of memory in a running computer, you could also
create a physical map to show where the OS was located, where the disk data
buffers were located, etc.

The physical evidence we have about the neocortex tells us that it's not
100 different hard-wired modules designed by evolution, but that in fact
it's some sort of generic signal processing system that simply carries
different data at different locations.

I don't know how many times I have to repeat this before you grasp the
obvious nature of it. The neocortex hardware is all the same, and it's
substantially different from the rest of the brain.

> > We must have this foundation first. Once that is
> > mastered, then we can optimize the implementation to the needs of the
> > machine/data so we have a specialized version of the algorithm
> > optimized for the vision data, and another for the sound data etc.
> >
> > It makes no difference how specialized the brain has become at each of
> > these signal formats because it's trivial to see that behind the
> > specialization, is a generic learning system that has the power to
> > recognize and respond to data in ANY format, from any sensory system.
> >
>
> I think we've been over this many times in the past. This is your usual
> response, and mine is that it's probably "much" easier to solve the
> problems at hand when there exist preprocessor modules that can
> radically reduce the size of the search or dimensional space of the
> problems.

Maybe the "generic" function (or one of them) I'm talking about is a
process of "radically reducing the size of the search or dimensional
space". Whatever it is, I know it's happening all over the neocortex and
that is what we have to figure out.

I don't need to hard-wire 30 vision modules to create intelligence, I have
to figure out the general algorithm at work that allows us to react to data
in any modality as needed, based on experience.

> Especially for vision, the dimensional space is enormous.

The dimensional space of 100 inputs is exponentially larger than the
dimensional space of 10 inputs. This is all you need to know. It makes
no difference if it's vision data or auditory data. For every new signal
you add, the dimensional space of the inputs grows exponentially.
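
To put rough numbers on that, here is a small Python sketch. It assumes
each input is a simple two-valued signal, which is an assumption made
only for illustration; the function name is made up.

```python
def state_count(num_inputs: int, levels: int = 2) -> int:
    """Number of distinct input patterns for num_inputs signals,
    assuming each signal can take `levels` different values."""
    return levels ** num_inputs

small = state_count(10)    # 10 binary inputs
large = state_count(100)   # 100 binary inputs

print(small)                     # 1024
print(large // small == 2**90)   # True: 90 extra inputs multiply the space by 2**90
```

Each added binary input doubles the count, whatever modality the signal
comes from.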

> Many people see this as the primary reason that GOF-connectionist
> systems have had such limited success - and I trained many 100s of them
> back in the early 90s. They don't generalize nearly as well as the
> old-time connectionists used to contend. The nets are really much more
> brittle than most people realize, and the reason for this is because
> they are really not much more than simply correlators for the data sets
> they've been trained on. Change the input data, and they quickly crash.
> This sort of thing is really very apparent for vision, eg, where you
> rotate the head/etc, and also for audition, where you change the
> data-rate, phase, or other temporal characteristics. I tried them all
> myself, so I know first-hand.

Did you ever once work with a temporal net instead of a standard
non-temporal BP-trained NN?

Do you have any grasp how completely different my type of network is from a
standard NN?

A standard NN computes a non-temporal function of the input data at ONE
and ONLY ONE, point in time.

So, if you have these three input values changing over time:

Time->

A 0 1 0 1 0 1 0
B 0 1 1 0 1 1 0
C 1 0 0 1 0 0 1

A standard neural network computes a function along the vertical columns.

So, output 1 might be 10*A + 12*B - 2*C

That makes the output blind to things that might have happened in the
previous set of inputs, and blind to things that might have happened 30
cycles in the past. Without extra hardware, it's impossible for a NN to
even compute the frequency of the inputs. The cycle length of A is 2
cycles and the cycle length of C is 3 cycles. But can you train a NN to
produce those numbers as outputs? Not at all without extra hardware.
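
A rough Python sketch of the column-at-a-time view, using the table and
the example weights above (the function name is hypothetical, and this
is only an illustration of a memoryless function, not of any particular
trained network):

```python
# The three example signals from the table above, one sample per clock cycle.
A = [0, 1, 0, 1, 0, 1, 0]
B = [0, 1, 1, 0, 1, 1, 0]
C = [1, 0, 0, 1, 0, 0, 1]

def column_output(a: int, b: int, c: int) -> int:
    """Memoryless function of a single column, using the example weights."""
    return 10 * a + 12 * b - 2 * c

outputs = [column_output(a, b, c) for a, b, c in zip(A, B, C)]
print(outputs)  # [-2, 22, 12, 8, 12, 22, -2]

# Presenting the columns in a different order just reorders the outputs:
# no single output depends on anything that came before it.
order = [3, 0, 6, 1, 5, 2, 4]
shuffled = [column_output(A[t], B[t], C[t]) for t in order]
assert shuffled == [outputs[t] for t in order]
```

Nothing in those outputs says that A repeats every 2 cycles or C every 3.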

You have to add some form of memory, and delay system into the network
before it can even begin to compute something simple like the average cycle
length of the input. You can for example add a one cycle delay unit that
takes the output computed at one clock cycle, delays it for a cycle, and
feeds it back as input to the next cycle. This allows data from the past
to be used for future calculations, but the feedback effects get very
complex, and if your delay time is constant (like 1 cycle), then the system
has no easy way to create a response which is a function of data that
showed up 10 cycles in the past.
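
A minimal sketch of that delay-and-feed-back idea in Python. The counting
rule and the function name are assumptions chosen for illustration, not
taken from any specific network design:

```python
def cycles_since_pulse(signal):
    """Feed a one-cycle-delayed copy of the output back in as an extra input.

    The fed-back value is the only memory the unit has; here it is used
    to count the clock cycles since the last 1 was seen.
    """
    delayed = 0      # contents of the (hypothetical) one-cycle delay unit
    trace = []
    for x in signal:
        out = 0 if x == 1 else delayed + 1
        trace.append(out)
        delayed = out    # becomes available as input on the next cycle
    return trace

print(cycles_since_pulse([1, 0, 0, 1, 0, 0, 1]))  # [0, 1, 2, 0, 1, 2, 0]
```

For signal C the count reaches 2 just before each reset, so the cycle
length (3) only becomes computable once the fed-back state is there.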

My networks however are purely temporal. That means they calculate
horizontal functions across the data in the table above instead of vertical
functions. They measure, and respond to, the distance between pulses.
They classify data based not on its spatial characteristics, but on its
temporal characteristics.
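
As an illustration of what a "horizontal" measurement looks like on the
table above, this Python sketch reads each row as a pulse train and
reports the spacing between pulses. It only shows the measurement; it is
not the network itself, and the function name is made up:

```python
def pulse_intervals(signal):
    """Gaps, in clock cycles, between successive pulses (1s) along one row."""
    times = [t for t, v in enumerate(signal) if v == 1]
    return [later - earlier for earlier, later in zip(times, times[1:])]

A = [0, 1, 0, 1, 0, 1, 0]
B = [0, 1, 1, 0, 1, 1, 0]
C = [1, 0, 0, 1, 0, 0, 1]

print(pulse_intervals(A))  # [2, 2]
print(pulse_intervals(B))  # [1, 2, 1]
print(pulse_intervals(C))  # [3, 3]
```

The cycle lengths of A and C fall straight out of the spacing.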

When you switch to a pure async pulse signal format, then the data no longer
has any vertical columns to compute a value from because async data never
shows up all at the same time like that. So, in this signal format, the
only thing you can react to is the temporal content because all the data is
in that domain.

For example if you frequency encode the brightness of light, then the
brightness of the light signal is not in some spatial value, but instead,
in the distance in time by which pulses are separated. And this is what my
nodes measure, and learn to react to.
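
A small Python sketch of frequency encoding, assuming a linear mapping
from brightness to pulse rate. The mapping, the `max_rate` value and the
function names are all assumptions made for illustration:

```python
def encode_brightness(brightness, n_pulses, max_rate=100.0):
    """Pulse times in seconds; brighter light gives more closely spaced pulses.

    Assumed encoding: pulse rate = brightness * max_rate, brightness in (0, 1].
    """
    if not 0.0 < brightness <= 1.0:
        raise ValueError("brightness must be in (0, 1]")
    interval = 1.0 / (brightness * max_rate)   # seconds between pulses
    return [i * interval for i in range(n_pulses)]

def decode_brightness(pulse_times, max_rate=100.0):
    """Recover brightness from the mean spacing between pulses."""
    gaps = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return 1.0 / (mean_gap * max_rate)

dim = encode_brightness(0.25, n_pulses=5)
bright = encode_brightness(1.0, n_pulses=5)
print(dim[1] - dim[0] > bright[1] - bright[0])   # dimmer light, wider gaps
print(round(decode_brightness(dim), 6))
```

The value being carried lives entirely in the gaps between pulses; no
single pulse has a magnitude to read off.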

I've not developed these nets enough to know just what all you can do with
these, but many things that seem hard with NNs become trivial with this
type of temporal network.

> Where people did have better luck was, for instance, where they used an
> FFT preprocessor ahead of the NN to extract phase-independent
> information [frequency spectra], and then let the NN learn to correlate
> patterns on the abstracted data.

No ***. It's because the NN was working in the wrong fucking domain in
the first place.

My network doesn't need an FFT preprocessor to translate temporal data to
spatial data because my network is already working in the temporal domain.
My nodes are already frequency detectors. Pulse spacing is frequency and
my nodes measure, and react to, pulse spacing, aka frequency.

What a standard NN can't do at all without the help of something like an
FFT preprocessor, my network does on its own.

> I think the answer to the problem is
> to find the "correct" form of preprocessing, more so than simply using
> learning modules.

No, the answer lies in using the correct type of network so you don't need
a preprocessor.

Think about this. Because a NN is working in the wrong domain, it needs an
FFT preprocessor to work on temporal problems. But then, the output of the
first layer of nodes is also temporal in nature, so you need a second FFT
preprocessor before you can feed it to the second level nodes. In other
words, the output of an NN is not compatible with its input. So you have
to add an adaptor at every level of the network in order to solve a
temporal problem (like walking).

But, since my nodes don't have this problem, I don't need adaptors between
each layer of my network.

The correct answer is learning to look at the problem correctly. Seeing it
as a spatial function problem is just not correct. It took me 25 years to see
that because everything I had learned about engineering taught me to map all
temporal problems into spatial problems before solving them.

I've also explained in this group many times why we all made the mistake of
thinking about problems in the spatial domain. It's because we make heavy
use of written language to record information. And when you write data on
a piece of paper, you lose all the temporal content, and only the spatial
content remains. How fast you made the marks on the paper is lost. The
order in which you made the marks on the paper is lost (preserved only
slightly by conventions such as writing left to right, top to bottom).

Because of this, if we want to explain what a problem is, we find we need
to be able to write it down. And once you write it down, you have done an
FFT translation on it. You have translated all the temporal data, into a
spatial encoding system. Then, we learn to understand, and solve the
problem, in a pure spatial domain.

For example, we might time an action, and record the time, and the event,
on the paper. Then we take all that data, as one set, and "compute an
answer". Our use of written language to communicate has made us specialize
in the art of solving written problems (spatial problems).

The weakness of classic NNs is that they're an attempt to solve a temporal
problem, in the wrong domain. It's an attempt to create a spatial solution
for a temporal problem.

As another example, many times when people try to understand the nature of
pulse signals, they do the same thing. They map the data back to the
spatial domain. For example, they try to calculate the average pulse
frequency and use that as a spatial measure of the signal because with
that, we can use all our spatial calculation tools.

Or, when trying to measure the "information content", they try to find ways
to map a pulse signal back to a spatial binary signal, and then "count the
bits" to tell us how much information is in the pulse signal. That's BS.
Bits are spatial concepts that exist in the time-free spatial domain that
we like to solve all our problems in. But the real world is a temporal
place, and real world temporal problems just aren't easy to solve in
the spatial domain.

But, if you use temporal tools, in the temporal domain, then temporal
problems become trivial.

> When you just present raw data to the net, it has to
> learn to both reduce the redundancy or search space, and also learn the
> patterns. When preprocessing is added, the problem for the learning
> unit is much less difficult.

> As I recall, exactly 2 years ago on August 1, you were thinking you
> would get your inet-onet to start zeroing in on the answer, but I'm
> not sure what you're doing differently now. I'm not faulting you,
> because these problems are really difficult. I think that your thinking
> is slowly evolving as you find things that don't work.

Sure, my approach and understanding have been slowly evolving for about 30
years now. It hasn't really changed at all in the past year or two because
that's when I found the answer I had been looking for.

The inet/onet approach was the step before the last one which might have
been about 2 years ago (I'd have to check to see for sure). But it's when
I posted the "AI solved" message and first talked about this single pulse
sorting mesh network instead of the two part inet/onet.

The general approach of both was the same rough idea. The inet was the
"signal separator", which acted like a large feature extractor to find and
extract all the important features in the sensory data. The onet, was the
behavior creation network which would take all the information from the
inet, and combine it as required to create whatever behaviors were needed.
The onet was trained, and configured, by reinforcement learning. The inet
did not use reinforcement learning. It responded only to the
characteristics of the data fed to it. It was basically a pre-processor
for the onet to restructure the data and remove redundancy.

I was experimenting with all that in the spatial domain. I got to this
design after many years of evolution before I started to talk about it
here.

The first big evolution to my design came when I had a debate with Louis
about the importance of temporal processing. I knew temporal issues were
important, but thought it would be solved outside the net by adding memory
so I spent a week or so telling Louis he was wrong. But then I started to
see things that opened my eyes to the possibility that working in the
temporal domain from the start made some things easy. And I started to
explore just how you would build a temporal net.

That evolved through about 5 major different designs (none that worked
right) until I realized that not only do you need to think of it as a
temporal problem, but you need to work with async pulse signals, and solve
the problem in that domain, instead of trying to map it back to the spatial
domain.

This got me to a temporal inet/onet. The inet was at this point a pulse
sorting decision tree, and the onet was a reinforcement-trained pulse
generation system. The pulse sorting inet was doing some amazing and fun
things (like acting as an FFT preprocessor). It was doing what it needed
to do, which I could never figure out how to do in the spatial domain. But
the onet was a complex pulse generator and the mapping between the inet and
onet pulse generators required a huge 2D matrix which was showing signs of
working, but had exponential scaling problems.

The breakthrough to the current design happened when I started to look at
the problem as a signal routing problem. That is, how to connect sensory
input A to effector output X. This led me to the idea of using pulse
routing for the entire net (instead of just the inet), and allowed me to
realize that combining signals back together, after they had been
separated, was the right and obvious thing to do.

So now, instead of the inet being a huge pulse separation network, and the
onet being a huge signal combining network, every node of the network is
both an inet and an onet. Every node separates a signal into two
independent signals, and the network then combines together two signals, to
feed to the next level.

The signal separation is both adaptive to the nature of the signal and
tuned by reinforcement learning.

That was about 2 years ago I think now. I've not made much progress since
then, because there's no more "redesign" needed. This works. It's the
first net in 30 years of 100's of attempts that works. It does exactly
what it needs to do in order to be a reinforcement-trained general solution
to the problem of abstracting data and creating behavior.

But it's not done. Now there's a lot of careful study and development that
needs to be done to understand the full nature of this type of network, to
see just where it might take us. But the reason I've not made much
"progress" in my "slow evoution" in the past couple of years, is becaue the
design I've been searching for for the past 30 years is here. I like
figuring out the mystery, (solving the design puzzle) and letting others do
the grunt work. I figured out the mystery, but no one else sees any value
in it, and I've really not had the time I need to do the serious grunt work
that I think needs to be done. (like sit down and work through the morse
code problem John came up - and 100 other problems like that I already have
on my todo list).

A couple of years ago I said I solved AI and I still believe I have. This
type of network is the answer to why none of those other NNs ever seemed
to work well. You have to build a temporal NN in order to solve a temporal
problem with it. AI is a temporal problem, not a spatial problem, and my
network is exactly how you can do that. Solving a hard problem is
always a matter of finding the right way to look at it. I did that. I now
know how to look at AI and how to solve it. I just haven't done the work
to show others (or myself) if what I believe is true or not.

However, it looks like I might have finally found some options to change my
life around which if it works out, could lead to me having the spare time
to actually work on AI and maybe even go back to school to give this
approach the attention I think it needs.

> > What Louis and I have in common is the belief that generic independent
> > signal processing is the key to AI.
> >
> > What you seem to believe is that if we don't duplicate the hard-wired
> > signal processing created by evolution of the vision system, our
> > machines will never be intelligent.
>
> I've never said this, just like I've never said there are homunculi in
> the brain to do the looking. What I've consistently said is what I've
> just re-iterated above. The brain seems to have solved the problem by
> evolving preprocessor modules which reduce the search space problem for
> learning modules.

Well, I think that's not what they are.

> The many levels of processing in the cortex "compute"
> successively more abstract information from the data - ie, reduce
> redundancy and search space and dimensionality.

Yeah, I like that description. And that's exactly what each level of my
network is doing as well.

> Get a few levels in and
> instead of just responding to pixels blinking on and off, cells respond
> to abstractions like faces. On and on.

Yeah, it's clear a network like that is needed. That's why I built one.
Now I just have to prove to myself how much power the network actually has.

> Also, as I've said many times,
> there seems to be less plasticity at lower levels and more at higher
> levels.

Yeah, that might be needed. It's trivial in my type of network to either
make the lower levels learn slower, or to slow down their learning over
time faster than the higher levels to "lock in" the behavior of the lower
levels before the higher levels. That's the type of stuff I need to spend
a lot of time working on with my network to understand the value of doing
those things. The obvious reason it might be needed is to stabilize the
network. You in effect need a stronger and more stable foundation, with
increased flexibility at the higher levels. So instead of a tall
bamboo-pole-like system that flaps in the wind at all levels, you have a
tree, with a strong trunk and weaker limbs as you climb higher.
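
The two options just mentioned could look something like this in Python.
This is a sketch under stated assumptions: the function names, the
doubling factor and the half-life schedule are all hypothetical, and
whether either option is the right one is exactly the open question.

```python
def level_learning_rates(num_levels, base_rate=0.1, growth=2.0):
    """Option 1: lower levels simply learn slower.

    Level 0 (closest to the sensors) gets base_rate; each higher level
    learns `growth` times faster than the one below it.
    """
    return [base_rate * growth ** level for level in range(num_levels)]

def decayed_rate(rate, step, level, half_life=1000.0):
    """Option 2: every level slows down over time, lower levels faster.

    Assumed schedule: the half-life of the learning rate scales with
    (level + 1), so level 0 "locks in" first.
    """
    return rate * 0.5 ** (step / (half_life * (level + 1)))

print(level_learning_rates(4))
print(decayed_rate(0.1, step=5000, level=0) < decayed_rate(0.1, step=5000, level=3))
```

Either way the bottom of the hierarchy settles first, which is the
trunk-before-limbs effect described above.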

> So, it seems to me you have 2 general principles here about how things
> work. Successive levels of abstraction being computed, and more
> plasticity [esp learning, and possibly re-organization] coming in at
> later stages. In the brain, IT and MT

IT? MT?

> might always be computing the
> same thing from animal to animal, but this is just how nature solved
> the problem - probably since brains are specified by genes. However,
> what's more important to me are more so the 2 organizing principles
> mentioned in the last paragraph, than the specifics.

Well then great, as long as you stick to that, we are talking the same
thing. We are looking to understand the fundamental organizing principles
needed to create general signal processing, not the specifics of one
modality like vision.

> I think you're not seeing that FB is the "third" organizing principle,
> in addition to the 2 mentioned above.

I realized the fundamental need for FB about 20 years ago. It's nothing
new to me. I was building learning networks with feedback in 1985.

A smaller breakthrough I had a few years back was that you could use
feedback to cut the problem in half. It seemed to me that you needed your
levels of abstraction to understand the sensory data, but that you also
needed the reverse of that happening to create behavior - where you had
levels of behavior built on top of each other to create complex
behavior. You needed a hierarchy of behaviors on the output side (just
like we use hierarchies of subroutines in software to create complex machine
behavior). This was actually the start of my inet/onet ideas. A hierarchy
of abstraction to decode sensory data on the input side, and an inverse
hierarchy to create behavior on the output side. But what I realized one
day is that if you could solve the input side of the problem, you could
simply add feedback to solve the output side of the problem. That is,
whatever outputs the system created, if you feed them back as more
sensory data to be analyzed by a second hierarchy of "understanding", that
would allow the system to learn to produce a hierarchy of behavior. So
instead of building two types of modules, all I had to do was build one,
and then use feedback to create the second.

This is what I believe is happening in the motor cortex. The motor cortex
is not some inverse of the sensory cortex, it's more of the same thing.
The only difference is that the sensory cortex is fed from external
sensors, and the motor cortex is fed from internal sensors, which sense
the brain's effector outputs.

A network in this feedback configuration then works as a static pattern
generator (it is always producing the next output as a function of the last
outputs). It produces constant cyclic patterns by default (useful in
applications like walking and breathing and chewing), but is also just a
static sequence generator for producing any complex sequence needed (like
reciting our ABCs).
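
A toy Python sketch of that feedback configuration. A lookup table stands
in for whatever mapping the network would have learned, and the names and
phase labels are made up for illustration:

```python
def run_feedback(step, initial, n):
    """Produce n outputs, feeding each output back in as the next input."""
    outputs = [initial]
    for _ in range(n - 1):
        outputs.append(step(outputs[-1]))
    return outputs

# Cyclic pattern, e.g. a four-phase gait (phase names are hypothetical).
gait = {"lift": "swing", "swing": "plant", "plant": "push", "push": "lift"}
print(run_feedback(gait.get, "lift", 6))
# ['lift', 'swing', 'plant', 'push', 'lift', 'swing']

# One-shot sequence: the last element maps to itself, so the output settles.
abc = {"A": "B", "B": "C", "C": "C"}
print("".join(run_feedback(abc.get, "A", 5)))  # ABCCC
```

The same loop gives a repeating cycle or a fixed sequence depending only
on the mapping from the last output to the next one.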

But since the feedback net is only half of the full net, it means that all
output behaviors can be triggered by either recent sensory conditions or
recent past behaviors. This allows the system to trigger sequences of
behavior at will as needed.

A lot of books seem to explain the structure of the brain like this:

Sensory data -> Sensory cortex (cross connects) Motor Cortex -> outputs

But my idea says it actually works like this:

Sensory data -> Sensory cortex \
                                 outputs -+--->
            +-> Motor cortex   /          |
            |                             |
            +--------<----feedback---<----+

So, the sensory cortex and motor cortex are the exact same type of hardware
performing the exact same function in my design (and in how I think the
brain is actually built as well).


So, this major motor cortex feedback loop has been a standard part of my
design for about 5 years now. But I've not started working on that because
before that will work, I have to get the general cortex technology working.
But that's what I've now done with my network - that is, I found a design
that explains everything to me - which includes feedback loops for the
purpose of creating pattern/sequence generators.

Whether it can work as one large pre-wired global loop (as I'm mostly
thinking about it as drawn above), or whether the system needs lots of
minor loops which might have to be dynamically added, I don't yet know.
I'm not there yet. Maybe your idea of needing lots of small internal loops
is right. But, the need for feedback for pattern and sequence generation
has been key in my design for years now and the knowledge that feedback in
general was required somewhere has been obvious to me since I first started
working on the problem.

> FB loops do many things,
> importantly to make global comparisons between various types of
> information instantly possible, and also to help synchronize activity
> in all areas.
>
> 1. hierarchical levels of data abstraction.

Yeah, I've been trying to figure out the right way to build that for 25
years.

> 2. more plasticity at higher levels.

That's not key in my eye. It's only a minor implementation issue, like
making tree limbs at the top of the tree smaller than the ones at the
bottom to keep the trunk from breaking.

> 3. feedback loops allow global temporal comparisons

"Temporal comparison" is done _everywhere_ in my net. It's the fundamental
process of my network now. But it's done with feedback inside each node
(by memory of the last pulse).

> and real-time
> global synchronization across regions.

Well, FB to me is KEY because of its power to turn a temporal network
into a sequence generator. If you use global feedback to feed the final
outputs back to a different set of sensory inputs, then you have all the
global cross synchronization I think you need.

The key thing you left out is that something must be directing the actions
of the system. Something must be evaluating behavior as good or bad
because otherwise, the system has no purpose, and no clue about what it
should be doing. That's where reinforcement learning comes in. It's the
answer to how you give hardware open-ended purpose so it can find its own
creative answers instead of closed-ended purpose which is just hard-coded
behavior.

This is something I realized 20 years ago and which you don't even mention
as part of the problem in your list here. It's by far the most important
of all the key issues to AI.

> > If someone can study the brain and answer the question as to what the
> > generic data processing is that the vision system and auditory system
> > have in common, then that would be very useful to us.
>
> Some general organizing principles above. The first 2 are especially
> relevant to your question here.

Yeah, but your "general organizing principles" of "hierarchical levels of
data abstraction" was something obvious to me 25 years ago. I didn't need
to study the brain any more than what I knew 25 years ago to sense that it
was needed. And nothing I've learned about how the brain actually
works for vision processing (for example) has given me much added insight
into how to implement it.

And that's the trick. Figuring out how to turn these abstract ideas into a
real implementation. And that's the search I've been on for about 30 years
now. I think I now have all the answers. I know what I need to build.

It's like trying to figure out how to build a flying machine when you have
no clue how to do it. Should we put wings on our arms and flap? How does
flapping our arms actually allow a bird to fly? Before you get off the
ground, you have to figure out the nature of how an airfoil can create lift,
and figure out that a man doesn't have enough power to do this on his own
with the material you have to work with, so you have to add an engine, with
enough power to lift not only itself, but the rest of the plane, and the
person. And you have to figure out things nature never figured out, like
how a spinning airfoil powered by an engine can create forward thrust. But,
once you have all those pieces put together, then all you have to do is
work out the engineering details of the exact size, and weight, and
configuration of the airfoils.

What I've been looking for over the past 30 years is the basic
understanding of what I have to build in order to create AI. As of a year
or two ago, I finished that task. I know exactly what has to be built to
get AI off the ground. I've got the working models of the technology. Now
it's just a lot of engineering grunt work to finish creating all the
implementation details.

The work to be done, even if I'm right about the approach, is still
significant. If it was only a month of work to get human level AI, I would
be doing it instead of writing Usenet posts. But it could easily be a few
years of full time work researching and experimenting with this approach
to work out all the details (like to understand the importance of your
plasticity at different levels issues, and to understand if one global
feedback system can work to create a good motor cortex or if you need many
different feedback loops - or if you must dynamically create it somehow).
And, how do you optimize the design of the network for different types of
sensory problems, like vision vs. sound?

So, as long as you are talking about studying the brain to figure out the
general principles behind AI, that's fine and that's the same approach I
believe in. However, I think studying the brain to find the general
principles before you know what the general principles are is a very hard
hill to climb, and one that for the most part, I don't think anyone who is
studying the brain has a good grasp of. It's like trying to study the
inside of the computer to learn the general principles of a stored
program computer before you understand anything about computers. You would
get so lost in the complexities of the signals and chips in a computer that
you would never see the general principle of a stored program computer lost
in all that complexity. And I think the same thing is happening with the
people studying the brain. They have no clue what they are looking for,
and instead, find ways to impose their beliefs on the hardware they find,
instead of developing the correct general principles first.

Until someone figures out the correct general principles, I think most of
the people studying the complexity of the brain will continue to be lost and
continue to mis-label the purpose of the stuff they find, and I know for a
fact a large number of people working on AI are lost because they are busy
building what they think they "know" the brain is doing, instead of working
hard to figure out what the correct general principles are first before
they build. (For example, the discussion about behavior being goal
directed we are having in another thread is an example of a lost soul that
thinks they have the correct general principles.)

I think I've now finished the search for the correct general principles
because I've got working prototypes for my general principles. It's like
having the wing-foil model that creates lift in the wind tunnel, and
understanding that's the piece of data that was missing to build a plane,
but not yet having the design of the full plane. Now I just need to finish
the engineering research needed to turn the general principle into useful
AI technology.

--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/