Re: index



On Jul 30, 9:56 pm, "Brian Selzer" <br...@xxxxxxxxxxxxxxxxxxx> wrote:
"David BL" <davi...@xxxxxxxxxxxx> wrote in message

news:fda99e61-ebc0-4bf5-b6c3-348e54d3f66e@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx





On Jul 30, 11:13 am, "Brian Selzer" <br...@xxxxxxxxxxxxxxxxxxx> wrote:
"David BL" <davi...@xxxxxxxxxxxx> wrote in message

news:281568e5-bca8-4107-bab9-f8cb304f7b41@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
On Jul 29, 9:08 pm, "Brian Selzer" <br...@xxxxxxxxxxxxxxxxxxx> wrote:

[snip]

Yes it would. The additional read and an exclusive lock /would/ be
required
whenever there is an update.

You appear to be equating an index that covers
the heading with a clustered index. Covering the heading--that is,
containing all of the columns in the heading--is not what makes an
index
a
clustered index.

Yes, I would define a clustered index as what you call a covering
index. Evidently you want to distinguish a "primary" copy of the data
- based on your argument concerning write locks for updates. However,
it seems odd to me that the definition of a clustered index would be
concerned with where locks are located. In fact a lock manager will
invariably dynamically allocate locks (only) when they are needed,
indexed in transient data structures (such as hash tables) in system
memory. Where is the need to distinguish a primary copy of the data?

It's not so much about where locks are located: it's about whether index
keys can be considered to be part of the content of the database. To be
clear: the content of the database is characterized by the set of
queries
that can be answered given the data in the database. With the exception
of
clustered indexes, which are not just indexes but also tables, every
index
can be eliminated without reducing or restructuring the content of the
database. A clustered index, on the other hand, can only be eliminated
by
being transformed into a heap--otherwise there would be a reduction in
the
content of the database. Even so, every query that can be answered with
indexes can be answered without indexes (at least theoretically. Some
queries may become impractical to the point of being unusable without
indexes.). So doesn't it follow that there already is a primary copy of
the
data? Except for clustered index keys--each of which may be the only
copy
of a particular datum in the database--can index keys really be
considered
part of the content of the database? Wouldn't that therefore relegate
them
to secondary status?

It may be useful to look at an analogy in C++. Consider a bijection
between two integer attributes named x and y coded as follows:

class Bijection
{
public:
void insert(int x,int y) {xtoy[x]=y; ytox[y]=x;}
int getx(int y) {return ytox[y];}
int gety(int x) {return xtoy[x];}
private:
map<int,int> xtoy;
map<int,int> ytox;
};

As a systems programmer I have used this technique in various forms on
a number of occasions. I see a class instance as analogous to a
single “table”, and the two maps as “clustered indexes”. Each is
redundant with respect to the other. Both are needed for performance.

I think I see what you're driving at, even though your example is broken:

Bijection b; int i, j, k, l;
b.insert(4, 3); b.insert(4, 5); b.insert(4, 7);
i = b.getx(3); // returns 4
j = b.getx(5); // returns 4
k = b.getx(7); // returns 4
l = b.gety(4); // returns 7

Yes of course I didn't bother worrying about repeats. However whether
one needs to test for that explicitly depends on the application. For
example in the past I have needed a transient bijection between GUID
and memory pointer and I could *assume* uniqueness of both in the
design.

Your example is illuminating--aside from the fact that it doesn't operate as
its name suggests. I don't think it's thread safe: x == b.getx[b.gety[x]]
could sometimes return false. That highlights one of the pitfalls of
duplicating content.

Most classes in most OO programs are not threadsafe because they do
not need to be threadsafe. Imagine the performance hit if every
variable of every data type was protected with a mutex!

Don't scoff at that Bijection class. It is like a Lamborghini when
used in an appropriate context. Why use a general purpose off road
vehicle when you know you're only driving on the highway?

But the bottom line is that what is a clustered index in the implementation
implied by the original poster's question differs from what you consider to
be a clustered index.

Perhaps.

.



Relevant Pages

  • UniqueIdentifier, offline data: clustering key choice? ... (indexing+performance question)
    ... Say if you have two customer databases, ... When putting them onto the synchronised database you would ... >clustered index, but again: ... >The identifier column, of course, would be thrown away ...
    (microsoft.public.sqlserver.programming)
  • Re: InsertOnSubmit for linq inserts data at first row in table.
    ... The rows in a table in a relational database DO NOT HAVE ANY ... such thing as "the first row in the table", ... free space in the middle of the table. ... The preceding is for a table that does not have a clustered index. ...
    (microsoft.public.dotnet.languages.csharp)
  • UniqueIdentifier, offline data: clustering key choice? ... (indexing+performance question)
    ... Central SQL Server database with multiple tables holding millions ... UniqueIdentifier as a surrogate key to ensure that I have a simple ... and a dozen non-clustered indexes on each table). ... clustered index, but again: A general mechanism would be preferred. ...
    (microsoft.public.sqlserver.programming)
  • Re: DBCC DBREINDEX and FillFactor
    ... file (log file) instead of somewhere around 10% increase. ... >> you database objects. ... > Actually a Shrink operation will fragment the objects not defrag. ... > A non clustered index is simply implemented behind the scenes as a table ...
    (microsoft.public.sqlserver.server)
  • Re: DBCC DBREINDEX and FillFactor
    ... Andrew J. Kelly SQL MVP ... >>> you database objects. ... >> Actually a Shrink operation will fragment the objects not defrag. ... >> A non clustered index is simply implemented behind the scenes as a table ...
    (microsoft.public.sqlserver.server)