Re: Cache-Size vs Performance



MitchAlsup@xxxxxxx writes:
So for normal PC/workstation-like applications you will see a
logarithmic decrease in the miss rate as the cache size grows
logarithmically. I leave it as an exercise to figure out what rate this
curve moves at.

For normal big database applications the miss rate decreases similarly
but at a much lower rate (typically 1/4 to 1/5 as fast as the
PC/workstation rate) and starts to decrease at a different place.


in big database applications ... there is the processor cache (used to
compensate for memory access latency) and there is the database cache
... where the database uses real storage to compensate for disk record
access latency.

the major database vendors tend to have very detailed models of their
internal operation and processing ... and tend to work with server
vendors to make sure that processor cache sizes are sufficient for
efficient dbms execution.

the database cache side may show a little more variety ... with
different databases requiring different amounts of real storage to
cache the (disk) records used in transactions.

for a little drift ... i've posted before about the evolution in
system real storage sizes and the effect that had on the uptake of
(emerging?) RDBMS technology.

in the 70s there was some discord between some of the "60s" physical
database people in stl/bldg90 and the relational/sql system/r people
in sjr/bldg28 ... misc. past posts mentioning system/r
http://www.garlic.com/~lynn/subtopic.html#systemr

the physical databases tended to have record pointers as part of data
and exposed as part of the normal programming paradigm ... i.e.
application fetched some record, operated on that record and then had
direct pointer to one or more other records.

the stl people somewhat argued that the relational/sql paradigm
abstracted away the direct record pointers by using the relational
schema and large indexes inside the dbms implementation. for
many databases, the indexes doubled the physical disk
requirements (compared to the 60s physical database implementation)
and significantly increased the number of disk accesses needed to
retrieve a record (physically processing the index before getting to
the pointers to the desired records).

on the other hand, with physical pointers no longer exposed in the
standard database paradigm, relational significantly reduced the
administrative and human overhead compared to the 60s paradigm.

one of the things that started to tip the balance in the 80s was that
1) disk space became significantly cheaper, so the size of the index
became less of a cost issue and 2) the amount of real memory increased
significantly ... which allowed "caching" of much of the relational
index (eliminating the significant disk i/o penalty of processing the
index to find a specific record or records). And then with further
increases in real storage sizes, not only could relational indexes be
cached ... but frequently much of the actual database records.
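
as a rough illustration of the index penalty and what caching the index buys, here is a minimal sketch. it assumes a tree-structured index with a fixed fanout per index page, which is an assumption for illustration only and not a description of any particular dbms; the names index_levels, disk_reads_per_lookup, fanout and cached_levels are all hypothetical.

```python
def index_levels(num_records, fanout):
    """Number of index levels needed so that fanout**levels >= num_records.

    Assumes a balanced tree index with a fixed fanout per index page
    (an illustrative assumption, not any specific dbms design).
    """
    levels = 1
    capacity = fanout
    while capacity < num_records:
        capacity *= fanout
        levels += 1
    return levels


def disk_reads_per_lookup(num_records, fanout, cached_levels=0):
    """Disk reads to fetch one record through the index.

    Each index level not held in real storage costs one disk read,
    plus one read for the data record itself. A 60s-style direct
    record pointer would correspond to a single read.
    """
    levels = index_levels(num_records, fanout)
    uncached = max(levels - cached_levels, 0)
    return uncached + 1


if __name__ == "__main__":
    n, f = 1_000_000, 100
    levels = index_levels(n, f)
    for cached in range(levels + 1):
        print(cached, "levels cached ->",
              disk_reads_per_lookup(n, f, cached), "disk reads")
```

under these assumptions, each index level that fits in real storage removes one disk read per lookup, and with the whole index cached a lookup costs the same single read as following a direct record pointer.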

however, some of these databases may have cache operations that have a
wide variation. bank accounts might see very little database cache
benefit ... i.e. if you make an ATM withdrawal ... the probability may
be very low that there will be another hit on the same bank account
while the record is still in the database cache (over a broad range of
database cache sizes). significant changes in real storage for
database caching may see little benefit until it is nearly as large as
the whole database.
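
the bank account case can be sketched with a small simulation. this is a minimal sketch that assumes every account is equally likely to be touched on each transaction (uniform, independent references) and that the database cache uses LRU replacement; both are assumptions for illustration, and the function name and parameters are hypothetical. real workloads will usually have some skew.

```python
import random
from collections import OrderedDict


def lru_hit_rate(num_records, cache_size, num_accesses, seed=0):
    """Simulate an LRU record cache under uniform random references.

    Returns the fraction of accesses that found the record already
    in the cache. The uniform reference pattern is an assumption
    standing in for "very low probability of a repeat hit".
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for _ in range(num_accesses):
        key = rng.randrange(num_records)
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / num_accesses


if __name__ == "__main__":
    for size in (10, 100, 500, 1000):
        print(size, round(lru_hit_rate(1000, size, 50_000), 3))
```

with no locality in the reference pattern, the hit rate simply tracks the fraction of the database that fits in the cache, so the miss rate stays high until the cache approaches the size of the whole database.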