Re: using clustered index to optimize inserts ...
- From: "Pravin" <pcbhagat2000@xxxxxxxxx>
- Date: 15 Jun 2006 20:56:34 -0700
Hi,
I will try to explain locking in terms of Sybase docs... which I feel
is very clear and self explanatory.
Allpages Locking: Allpages locking locks both data pages and index
pages. When a query updates a value in a row in an allpages-locked
table, the data page is locked with an exclusive lock. Any index pages
affected by the update are also locked with exclusive locks. These
locks are transactional, meaning that they are held until the end of
the transaction.
Datapages Locking: In datapages locking, entire data pages are still
locked, but index pages are NOT locked. When a row needs to be changed
on a data page, that page is locked, and the lock is held until the end
of the transaction. The updates to the index pages are performed using
latches, which are non transactional. Not entire table is locked with
an exclusive lock.
Latches are held only as long as required to perform the physical
changes to the page and are then released immediately. Index page
entries are implicitly locked by locking the data page. No
transactional locks are held on index pages.
Datarows Locking: In datarows locking, row-level locks are acquired on
individual rows on data pages. Index rows and pages are not locked.
When a row needs to be changed on a data page, a non transactional
latch is acquired on the page. The latch is held while the physical
change is made to the data page, and then the latch is released. The
lock on the data row is held until the end of the transaction. The
index rows are updated, using latches on the index page, but are not
locked. Index entries are implicitly locked by acquiring a lock on the
data row.
I would suggest, go for Allpages Locking for Master tables or those
tables which are not updated frequently; Datapages Locking for those
tables which are updated regularly but not very frequently and Datarows
Locking for those table which are frequently updated and always has
hot-spot.
======================================================
Clustered Index: The datarows will be arranged as per the clustered
index in a table. Each INSERT or UPDATE will cause rearrangement of
datarows in table. Hence Master tables are a clear candidates of
clustered index. A table with high volume of INSERTs and UPDATEs can
have a clustered index if it ensures that new rows are always
appended..., like clustered index on IDENTITY column.
Guidelines for clustered indexes:
* Most allpages-locked tables should have clustered indexes or use
partitions to reduce contention on the last page of heaps. In a
high-transaction environment, the locking on the last page severely
limits throughput.
* If your environment requires a lot of inserts, do not place the
clustered index key on a steadily increasing value such as an IDENTITY
column. Choose a key that places inserts on random pages to minimize
lock contention while remaining useful in many queries. Often, the
primary key does not meet this condition. This problem is less severe
on data-only-locked tables, but is a major source of lock contention on
allpages-locked tables.
* Clustered indexes provide very good performance when the key matches
the search argument in range queries, such as:
where colvalue >= 5 and colvalue < 10
In allpages-locked tables, rows are maintained in key order and pages
are linked in order, providing very fast performance for queries using
a clustered index.
In data-only-locked tables, rows are in key order after the index is
created, but the clustering can decline over time.
* Other good choices for clustered index keys are columns used in order
by clauses and in joins.
* If possible, do not include frequently updated columns as keys in
clustered indexes on allpages-locked tables.
* When the keys are updated, the rows must be moved from the current
location to a new page. Also, if the index is clustered, but not
unique, updates are done in deferred mode.
Non-Clustered Index: When you consider adding nonclustered indexes, you
must weigh the improvement in retrieval time against the increase in
data modification time. In addition, you need to consider these
questions:
>> How much space will the indexes use?
>> How volatile is the candidate column?
>> How selective are the index keys? Would a scan be better?
>> Are there a lot of duplicate values?
Because of data modification overhead, add nonclustered indexes only
when your testing shows that they are helpful.
Performance price for data modification : Each nonclustered index needs
to be updated, for all locking schemes - For each insert into the table
and For each delete from the table
An update to the table that changes part of an index's key requires
updating just that index.
For tables that use allpages locking, all indexes need to be updated:
>> For any update that changes the location of a row by updating a
clustered index key so that the row moves to another page
>> For every row affected by a data page split
For allpages-locked tables, exclusive locks are held on affected index
pages for the duration of the transaction, increasing lock contention
as well as processing overhead.
Some applications experience unacceptable performance impacts with only
three or four indexes on tables that experience heavy data
modification. Other applications can perform well with many more
tables.
Hope this helps. Please read Sybase docs on Performance Tunning and
Indexing.
Regards,
__________________________
Pravin Bhagat
bravegag wrote:
Hi all,
I have researched a bit before asking this question but have not found
actual accurate answers. Let me first describe the facts:
We have a large table with a wide set of columns i.e. around 30
columns. There are many many single inserts into this table, the major
bottleneck of the application are the insertion times, after analyzing
the tables monSysSQLText, monSysStatement, monSysPlanText. The primary
key is not a virtual key e.g. IDENTITY but instead more a real key
descriptive value i.e. a concatenated string that includes date, dpto,
order info, etc, still of course unique.
My questions for the scenario above are:
- how does Sybase implements the clustered index?
- assigning a bucket (hashing) for every distinct value of the
clustered column?
- should I prefer a column that contains 15 different values evenly
distributed hence 15 different buckets instead of one that contains
hundreds or always distinct values?
- how many buckets or insertion points are reserved when defining a
cluster index based on a column with unique values?
The second table is one containing a few columns i.e. primary key, a
Text column, a timestamp and a few more. Would the clustered index
column selection criteria would/should be the same as above or
different given the Text field?
Now a more general few questions I have not found answers:
1-. Can you have function based indexes in Sybase? this would specially
help using e.g. a function that takes a timestamp and maps it to hours,
this would fit specially well for the first scenario above as the table
only contains many thousand rows for a day and the day after is swept
i.e. optimize the many thousands of inserts and the queries.
2-. Is it the clustered index Sybase's implementation equivalent to
Oracle's range or hash partitioning hence understand what is the
criteria used by Sybase to create the actual buckets or insertion
points e.g. how is it specially for the unique valued columns?
Many thanks in advance,
Best Regards,
Me
.
- Follow-Ups:
- Re: using clustered index to optimize inserts ...
- From: Carl Kayser
- Re: using clustered index to optimize inserts ...
- References:
- using clustered index to optimize inserts ...
- From: bravegag
- using clustered index to optimize inserts ...
- Prev by Date: Re: Query and Isolation Level
- Next by Date: Sybase Developer for Singapore
- Previous by thread: Re: using clustered index to optimize inserts ...
- Next by thread: Re: using clustered index to optimize inserts ...
- Index(es):
Relevant Pages
|
|