Re: nearest neighbour of a point
- From: "running.polygon@xxxxxxxxx" <running.polygon@xxxxxxxxx>
- Date: 29 Apr 2006 05:24:13 -0700
Hans-Bernhard Broeker wrote:
In comp.graphics.algorithms running.polygon@xxxxxxxxx wrote:
I have 2 sets of points X & Y. The distribution of points in both the
sets is such that points at a certain distance are concentrated and on
either side of this distance, they start becoming sparce, something
like a normal distribution.
This description is a bit hard to parse. You seem to be saying that
within set X, distances to the nearest neighbor cluster around a
certain value, rather separated from zero. And the same for set Y.
Or is this clustering true for *all* distances between points in one
set? I.e. are all points of set X expected to fall roughly on a
spherical shell of radius R around any point in set X one picks as a
reference?
Sorry, I should have made myself a bit more clear.
All the points lie within a hypercube of side 1 and all points are
having non-negative co-ordinates in each of the dimensions. So, if we
consider 3D, then they would all lie in a cube of side 1 in the 1st
octant(if that's what it's called).
Also, consider there are say 20 dimensions, then we know for sure that
2 points can be separated by a max. (dist)^2 of 20.
So, all I'm saying is that the number of points in distribution X & Y
is such that the number of points at a dist^2 <0 & >20 from the origin
is zero, followed by points having dist^2 <1 & > 19, and so on....
and the number of points having dist^ between 10 & 11 would probably be
the most.
Of course, the mean of the distribution need not be exactly in the
center(at 10 in case of 20 dimensions) but it is a normal distribution.
The former case would indicate that your pointset is some kind of
regular grid embedded in that high-D space, and it may be a grid of
considerably fewer actual dimensions than the embedding space. You
could find out by counting the number of almost-equally-far neighbors
per point. E.g. a regular "grid" of points along a curve in 3D space
would yield 2 close neighbors per point, a gridded surface would have
4, and a volumetric grid would have 6.
E.g. if you find an average of 2*d close neighbors per vertex, your
pointset might be a grid on a d-dimensional manifold, rather than a
truly N-dimensional data set. If d is a good deal smaller than N,
that could ease the search algorithms a lot.
The points are in higher dimentions > 3. They may be up to 50
dimentions.
In that case, and assuming the argument above doesn't apply, odds are
that there's no algorithm that could improve upon the naive "compute
all distances and remember the shortes" approach. 2^50 is guaranteed
to be much larger than the number of points you have, so no kind of
subdivision method can really help you, if the dataset's actual
dimensionality is 50.
Yes, that is right. 50 dimensions would imply that even if I have a few
ten thousand points then 2^50 would easily overshadow that number.
The fact to remember is that in
high-dimensional spaces, everything is pretty much next to everything
else, so the only way to find the closest thing is to check
everything.
You can do the maths yourself: compare the volume of a 50-dimensional
hyper-spherical shell whose radius and thickness are the center and
width of your distance distribution, to that of a 50-dimensional
hypersphere of said radius. You'll find that the shell's volume is
not really much different from that of the full sphere. I.e. the
restriction to a thin spherical shell isn't going to help you much in
searching or organizing points.
It's a bit(very) hard(for me) to visualize anything in more than 3D.
However, instead of considering the thickness of the hyperspherical
shell to be the width of the distribution, we let the thickness be say
the standard deviation of the distribution, would that mean than the
volumes of the shell & the sphere are largely different?
If I try to imagine a similar concept in 3D I land at the following
reasoning:
Let y be a point in set Y, and let dy be the dist^2 of y from {0,0,0}.
Say that I have a list of all points in set X such that they are sorted
in ascending order on their dist^2 from {0,0,0}.
Now, I can say that the point in set X which is closest to point y is
one which lies in the band such that the band contains all points at a
distance of [dy-DELTA] to [dy+DELTA] such that DELTA is chosen in such
a way that it encompases at least the point which is closest to y.
However, I'm not sure if the same concept can be extended to higher
dimensions.
--
Hans-Bernhard Broeker (broeker@xxxxxxxxxxxxxxxxxxxxx)
Even if all the snow were burnt, ashes would remain.
.
- Prev by Date: Wanted: XY Keystone Correction Algorithm
- Next by Date: Re: color modification
- Previous by thread: Wanted: XY Keystone Correction Algorithm
- Next by thread: Mesh Correction Algorithm - Academic Question
- Index(es):
Relevant Pages
|