Re: Newbie question about db normalization theory: redundant keys OK?



On Dec 25, 12:30 am, Sampo Syreeni <de...@xxxxxx> wrote:
On 2007-12-23, David BL wrote:

I've never thought much of multiphase commit protocols. Instead I'm a
fan of persistent message queues.

I don't quite see them as solving the same problem. If your queue is
such that all of the updates posted in it are always applied, then in a
distributed environment it would seem that you need something like a
multiphase commit to get agreement on what to insert into the queue in
the first place. Or if some of the queued transactions can be rolled
back, then you need distributed agreement when the queued transaction is
finally executed. Finally, it obviously solves the problem if the
transactions are serialized before insertion into the queue, but then we
lose in concurrency and the queue easily becomes a central hot spot.

So, could you perhaps elaborate a bit?

Consider the following hypothetical example: there are two
geographically separated databases storing account balances. There is
no replication - each DB is recording independent accounts.

Suppose we want a transfer between accounts needing a distributed
transaction. The two phase commit protocol will involve a "prepare
to commit" phase, and each DB will need to lock the relevant account
balances. The locks aren't released until the participants
acknowledge the commit message (assuming the presumed rollback
protocol). Network failures could make this take a very long time.
ie network failure can upset the autonomy of the databases. How bad
is that!

Instead consider that we avoid TPC entirely, and use persistent
message queues...

Suppose we want to transfer from account balance Ax on machine X into
account balance Ay on machine Y.

Let X have a persistent message queue Qx that records one way messages
to be sent from X to Y. Each message represents a deposit to be
performed on an account on Y. Y records a persistent sequence number
Nx for the number of messages it has processed from Qx. A middleware
layer allows Y to pump messages from Qx and it uses Nx to ensure it
processes each message exactly once.

Similarly Y has Qy for messages to be sent from Y to X and X records
sequence number Ny.

A successful transfer involves the following steps

1. On X, a local transaction does the withdrawal on Ax and pushes
message onto Qx that represents a deposit that needs to be performed
on Ay. Ax and Qx are updated atomically on X.

2. Y reads message from Qx then opens a transaction that atomically
updates Ay and Nx.

Now consider that Y needs to cancel the transfer. Eg this could occur
because by the time Y processes the deposit message the relevant
account has been closed. Then we simply introduce a message to
cancel the transfer, which in a fashion can be compared to a presumed
commit TPC. In fact a cancel of a transfer is nothing other than a
deposit message to undo the original withdrawal.

Note that when we account for the outstanding (ie not yet processed by
receiver) deposit messages, the total money in the system is conserved
by the local transactions. Note as well that local transactions are
performed without any network I/O whatsoever.

The durability requirement on these transactions can be relaxed. For
example when they commit they can release locks before flushing the
log, making the locks less sensitive to file I/O. Accounts are locked
for extremely short periods. It is a vastly superior approach to
TPC.

IMO the whole idea of a "distributed transaction" is disgusting. It
seems tragic for a system to use them if they can be avoided. I
wonder if they always can be (when one considers the "bigger picture"
and one isn't constrained by an existing legacy system).

.



Relevant Pages

  • RE: Carry forward balance of account
    ... your database contains hundreds of thousands of transactions. ... CurrentFiscalYear field in my Company table... ... Use the Income and Expense balances to calculate a net profit or loss ... Update the G/L AccumulatedProfitOrLoss account accordingly. ...
    (microsoft.public.access.modulesdaovba)
  • RE: Display queried records with Null values (null recordcount)
    ... tables related to Assets and Transactions tables in the query, ... since AssetID from tblAssets is a foreign ... The query would return multiple rows per account, ...
    (microsoft.public.access.gettingstarted)
  • Re: Design Question - Accounts/Transactions
    ... The indexed view can't return anything about individual transactions. ... reason is that the indexed view contains only one row per account. ... you want an index that will help this query. ... One of the most important thing to get right is the clustered index. ...
    (microsoft.public.sqlserver.programming)
  • Re: T.J. Maxx data theft worse than first reported
    ... No, my intellectually deficient friend, retailers only need store the authorization code they receive from the card issuer". ... with just account number, amount and date. ... http://www.garlic.com/~lynn/2007c.html#15 Securing financial transactions a high priority for 2007 ...
    (bit.listserv.ibm-main)
  • Re: Advanced Client install nightmare
    ... I can succesfully connect to the remote registry using the account. ... MVP Windows Server System - SMS ... I have deleted and re-created the client install account and the client ... Slowing down the queue processing ...
    (microsoft.public.sms.admin)

Loading