Re: Curtnetrons Don't Do Parity
- From: curt@xxxxxxxx (Curt Welch)
- Date: 05 Jul 2006 00:08:10 GMT
Michael Olea <oleaj@xxxxxxxxxxxxx> wrote:
Curt Welch wrote:
Michael Olea <oleaj@xxxxxxxxxxxxx> wrote:
Imagine you have a check reader. Down at the low level you are
grouping pixels together into possible "things" like characters, lines
(like the signature line) and so on. You have some pixels near the
edge of the image, maybe a character. You are making a grouping
decision, which pixels to include, which to exclude. At the low level
you have only some local cues, like proximity. Say a character is
close to an ornamental border you often find around the edges of the
check. Black pixels from the border (which is fuzzy) may sometimes be
closer to character pixels than to each other. At the pixel level that
there is even such a thing as a border in the image is not at all
obvious. On the other hand, from a low-res bird's-eye view of the image
the border jumps out like a beacon. You can trace its boundaries, and
propagate those boundaries down to the high-res detail. The evidence
that this is a border comes from the whole image - from pixels at
opposite ends of the image. The decision down at the character
grouping level depends on hypotheses at whole-image scale. An
identical arrangement of pixels at the local level has two different
correct groupings depending on context, is there or is there not an
ornamental border around this check, for example.
Yes, so it seems obvious to me that the low level shouldn't be wasting
its time trying to figure out if a pixel is part of a border or part
of a possible character because it can't. It should instead be making
grouping decision based on what it can figure out - is it part of a
short edge for example?
Nowhere did I say anything about the low level trying to figure out if a
pixel is part of a border or part of a possible character.
The point is that the strictly local cues available from a small patch of
image are in general insufficient to result in grouping pixels together
such that they belong to a common "object" of interest in the scene,
where the objects of interest depend on the task, i.e. the behavior in
response to objects. The local groupings will sometimes group together
parts of different scene objects (e.g. part of a character and part of a
border) and will sometimes split apart scene objects (e.g. chop a
character in two). This is not an opinion, it is an unavoidable empirical
fact. The reason is simple - take, say, a 16x16 patch of image. An
identical configuration of the pixels in that patch does not have a
unique "correct" grouping into "parts" because the "correct" grouping
depends not only on the arrangement of pixels in that patch, but on the
context in which that patch is embedded; it depends, ultimately, on the
whole image.
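
A toy sketch of the point above, not anyone's actual check reader: two tiny images contain a pixel-for-pixel identical patch, yet the patch's strokes belong to one object in the first image and to two objects in the second. Plain 4-connectivity stands in for "belongs to the same object", which is an assumption made only for illustration; the names `label_components`, `crop`, and `groups_in_patch` are hypothetical.

```python
from collections import deque

def label_components(img):
    """4-connected labelling of '#' pixels; returns {(row, col): label}."""
    labels = {}
    next_label = 0
    for r, row in enumerate(img):
        for c, ch in enumerate(row):
            if ch != "#" or (r, c) in labels:
                continue
            labels[(r, c)] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < len(img) and 0 <= nx < len(img[ny])
                    if inside and img[ny][nx] == "#" and (ny, nx) not in labels:
                        labels[(ny, nx)] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels

def crop(img, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]

def groups_in_patch(img, top, left, size):
    """Count how many whole-image objects have pixels inside the patch."""
    labels = label_components(img)
    return len({lab for (r, c), lab in labels.items()
                if top <= r < top + size and left <= c < left + size})

# Two strokes joined OUTSIDE the patch (row 3): one object.
JOINED = ["#.#..",
          "#.#..",
          "#.#..",
          "###..",
          "....."]

# Same strokes, nothing joining them: two objects.
SEPARATE = ["#.#..",
            "#.#..",
            "#.#..",
            ".....",
            "....."]

if __name__ == "__main__":
    print(crop(JOINED, 0, 0, 3) == crop(SEPARATE, 0, 0, 3))
    print(groups_in_patch(JOINED, 0, 0, 3), groups_in_patch(SEPARATE, 0, 0, 3))
```

The 3x3 patch at the top-left corner is the same in both images, so no function of the patch alone can return the right grouping for both.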
When you say "It should instead be making grouping decision based on what
it can figure out - is it part of a short edge for example", note that
this is an implicit partition of a mini scene into parts. It has grouped
pixels into 3 categories: 1) those that belong to the "edge", 2) those
that lie on one side of the edge, 3) those that lie on the other side of
the edge. This "edge" and this grouping have no necessary relation to the
edges of objects in the scene. This "edge" may well cut across object
boundaries. The pixels grouped together in the "edge" may include, for
example, part of the boundary between a dark spot of fur and lighter fur
on an animal's coat, and a boundary between shadow and light in the
distant background occluded by the animal. Look at the picture I
suggested you look at. Or the "edge" may fall entirely within a dark
spot of fur.
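
The three-way partition described above can be sketched with a deliberately crude detector. This is an illustrative assumption, not a model of any real vision system: it finds the column with the strongest summed horizontal contrast and splits the patch into "edge", "left", and "right". The function name `partition_by_vertical_edge` and the intensity values are made up for the example.

```python
def partition_by_vertical_edge(patch):
    """Split a patch into 'edge', 'left' and 'right' pixel groups.

    The 'edge' is the column just left of the strongest summed
    horizontal intensity step, measured over all rows of the patch.
    """
    width = len(patch[0])
    # Summed absolute contrast between column c and column c + 1.
    contrast = [sum(abs(row[c + 1] - row[c]) for row in patch)
                for c in range(width - 1)]
    edge_col = max(range(width - 1), key=contrast.__getitem__)

    groups = {"edge": [], "left": [], "right": []}
    for r in range(len(patch)):
        for c in range(width):
            if c == edge_col:
                groups["edge"].append((r, c))
            elif c < edge_col:
                groups["left"].append((r, c))
            else:
                groups["right"].append((r, c))
    return edge_col, groups

# Rows 0-1: say, a dark-fur / light-fur boundary.
# Rows 2-3: say, a shadow / light boundary in the background.
# Both steps happen to line up at the same column.
PATCH = [
    [20, 20, 80, 80],
    [20, 20, 80, 80],
    [40, 40, 200, 200],
    [40, 40, 200, 200],
]

if __name__ == "__main__":
    col, groups = partition_by_vertical_edge(PATCH)
    print(col, groups["edge"])
```

The detector reports one "edge" running through all four rows, even though by construction the top half and bottom half come from different scene objects.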
There is no question these effects occur, the question is what to do
about them. These issues have been studied extensively in a) computer
vision, b) psychology, and c) neuroscience. Stephen Palmer's introductory
graduate text "Vision Science: Photons to Phenomenology" (810 pages)
draws on the literature and research from all 3 areas, though b and c
much more than a. Much of the basic data - the facts to be explained -
comes from the Gestalt school of psychology (e.g. Koffka, Kanizsa, J. J.
Gibson). Look up "Kanizsa triangle". Look up "illusory contours".
Consider one little bit of data from neuroscience - the response of
"orientation tuned" neurons in V1. A first-cut simple characterization of
these neurons is that they are "oriented edge detectors", responding to
an edge at the preferred orientation in their "receptive field". Three
points: 1) the response to the "receptive field" is not a function of the
contents of the receptive field alone, but depends also on the scene
outside the receptive field (the "context" in which the receptive field
occurs).
That makes no sense whatsoever to me. The receptive field is DEFINED as
the part of the field the signal is a function of. If the signal changes
as a result of information outside of the receptive field, then you defined
the receptive field of that neuron incorrectly. It just means the true
receptive field is larger and the mapping function more complex.
This dependence affects both the strength of the response (e.g.
firing rate), and the "preferred orientation" of "edges" to which it
responds most vigorously. 2) The response changes during the course of a
single fixation (on the order of 200 to 600 ms). 3) The response depends
on the behavioral task in which the animal is engaged.
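
One way to see how both uses of "receptive field" in this exchange can be consistent is a toy model in which the surround cannot drive the cell by itself but does scale the response to the center. The divisive form below is a common modelling assumption chosen for illustration; neither poster proposed it, and the names `classical_drive`, `response`, and `sigma` are hypothetical.

```python
def classical_drive(center):
    """Drive from the classical receptive field: mean contrast of its samples."""
    return sum(center) / len(center)

def response(center, surround, sigma=1.0):
    """Center drive divided by (sigma + mean surround activity).

    With an empty or silent surround this reduces to the center drive
    scaled by 1 / sigma.
    """
    surround_activity = sum(surround) / len(surround) if surround else 0.0
    return classical_drive(center) / (sigma + surround_activity)

if __name__ == "__main__":
    print(response([1.0, 1.0], [0.0, 0.0]))  # center stimulus, blank surround
    print(response([1.0, 1.0], [1.0, 1.0]))  # same center, active surround
    print(response([0.0, 0.0], [1.0, 1.0]))  # surround only
```

In this sketch a surround-only stimulus produces no response, which is why mapping the cell with isolated stimuli would put the surround outside the field, while the same center stimulus gives different responses in different contexts.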
3 seems to indicate feedback from above. But how much is that required to
do strong pattern recognition and how much of that might be a more complex
part of a selective activation feature which has more connection to
controlling what we are "looking at" vs improving pattern matching?
Obviously, there's a lot I don't know about the research so I'm only
thinking out loud here.
So, for example, during the course of one fixation of a scene with an
"illusory contour" cutting through the classical receptive field of such
a neuron - that is, an "edge" that has zero illumination contrast between
one side of the "edge" and the other, such that this "edge" is at the
preferred orientation of the cell - initially there is no response (the
cell fires action potentials at the spontaneous background rate). Some 50
ms or so later one of two things happens: if the behavioral task requires
fine discrimination, say putting your finger on the "edge", then the cell
responds as it does when there is an illumination contrast at the
preferred orientation, but if the behavioral task does not require such
fine discrimination (say discriminating an illusory triangle from an
illusory rectangle) then the cell does not respond to the "edge", but
continues to fire at the spontaneous background rate.
We cna raed wrdos lkee tihs. Whih vrey litlee parctice, we can read
tehm at auobt the same seped we read wdros slept correctly.
Yes, we can - by taking into account the global context in which they
occur. As to how much practice it takes, and the speed at which we can
read them - where's your data?
Data? Who needs stinking data when I can just BS?
My point is simply that we can read them fairly quickly and easily
considering how far off from being correct they are. We can read them much
faster than we could for example solve a crossword puzzle and then read
it. The point is that there seems to be no extended process happening that
first searches for a potential correct spelling before we can understand
what the word is - our recognition system produces a good guess simply
because all the clues available are enough to point it to the right answer
without any type of search process. The correct words just pop into our
mind as we scan the letters without having to mentally move the letters
around first.
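
As a rough sketch of "recognition without search", and not a claim about how the brain does it: if only the interior letters of a word are shuffled, a single dictionary lookup on the key (first letter, sorted interior letters, last letter) recovers the word with no trial rearrangement. The vocabulary and the names `scramble_key`, `build_index`, and `read_scrambled` are assumptions made for the example.

```python
def scramble_key(word):
    """First letter + sorted interior letters + last letter, lowercased."""
    w = word.lower()
    if len(w) < 3:
        return w
    return w[0] + "".join(sorted(w[1:-1])) + w[-1]

def build_index(vocabulary):
    """Map each scramble key to the vocabulary words that share it."""
    index = {}
    for word in vocabulary:
        index.setdefault(scramble_key(word), []).append(word)
    return index

def read_scrambled(text, index):
    """Replace each token with the first known word sharing its key.

    Tokens with no match are passed through unchanged.
    """
    return [index.get(scramble_key(tok), [tok])[0] for tok in text.split()]

VOCAB = ["we", "can", "read", "words", "like", "this"]

if __name__ == "__main__":
    index = build_index(VOCAB)
    print(" ".join(read_scrambled("we can raed wrdos like tihs", index)))
```

This only handles interior shuffles. Several of the misspellings quoted above move or drop an end letter ("cna", "Whih"), so they would need some form of approximate matching that uses the surrounding words as well.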
-- Michael
--
Curt Welch http://CurtWelch.Com/
curt@xxxxxxxx http://NewsReader.Com/