Re: How do i write Set based queries and avoid a cursor?
- From: Hugo Kornelis <hugo@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Fri, 03 Oct 2008 23:55:17 +0200
On Wed, 1 Oct 2008 14:10:39 -0700 (PDT), --CELKO-- wrote:
Based on previous replies to similar questions, I expect Joe Celko to tell you to use a column active_date that stores the date (or even date and time) that the active status was achieved, and that is NULL if whatever is modeled in the table is not active. <<
No, based on previous replies, you should have said I would advise a
pair of columns to show duration of that status, more like this:
CREATE TABLE Foobar
(..
foo_status CHAR(5) NOT NULL
CHECK(foo_status IN (..)),
start_date DATE NOT NULL,
end_date DATE, -- null is current
Hi Joe,
True. You've got me there!
(And he'll also blast you for using "field" instead of "column"). <<
Yes. This is important if you are ever going to think in sets and not
sequential files.
Trust me on this, Joe. It *IS* possible to use the terms "field" and
"record" and still think in sets.
"That which we call a rose, by amny other name would smell as sweet!"
-- Shakespeare, "Romeo and Juliet", 1594
Of course, Joe will not care whether the business is actually interested at all in what time the widget became active. <<
Can you think of a business that would not have such an interest when
it involves a product or service they are libel for? Business is very
temporal.
Who says that the original poster was modelling a product or service
they are libel for? Assumptions, my friend! They (together with good
intentions) pave the path to hell!
There are various situation where all information needed is whether some
FooBar is or is not active. To give just one example - when you have to
dispatch jobs to workers, you need a list of idle workers. You do not
need to know any history of when they were idle and when they were
active - you just need to send the task to an idle worker.
Nor will he care that it might involve lots of extra work for the data entry department, maybe even severe changes in some business process, to track and enter this information (that the business doesn't even need). <<
This usually involves adding a DEFAULT CURRENT_TIMESTAMP to a table
declaration or a line in a stored procedure.
That does not help one bit when someone flips the "active" status. A
DEFAULT doesn't do anything when you run an UPDATE.
In fact, he'll also be blind to the fact that this will without any doubt result in fake dates being stored, as that's what always happens when some clowns from the IT department who are completely out of touch with the real work on the business floor dictate the storage of some uninteresting attribute. <<
The time of an event for which the company is libel might not be
considered an "uninteresting attribute" by management or the lawyers.
I think we already covered that. If the company were libel for the time
of the event, the table would already have been designed with a column
to store that time. Since it's not, we can safely assume that the
company isn't libel for said time (or, if you prefer to be cynical, that
the managers that pay the DBA's bills think they're not).
Why would "fake dates" be stored?
Because that's what happens when idiots from the IT department add
mandatory attributes that the business never asked for. The data entry
people are presented a screen that demands some input that they can't
supply, so they just enter something and move on - after all, they got
their work to do!
And when a few years down the road management invents a new business requirement that does involve tracking when the active state was reached, the poor chap tasked with the database will see this column, will of course assume that it holds the data required, and will then produce reports based on incorrect data. <<
1) Why do you assume that if you record a fact (especially when the DB
does it for you with CURRENT_DATE), it has to be wrong? Weird.
So you've never been in a situation where changes were entered in
advance, or after the fact?
2) "Errors using inadequate data are much less than those using no
data at all." --Charles Babbage
If Charles said that, he was wrong.
I think it'll be 20 years ago now that I took over maintenance for an
existing application (PL/1 on a mainframe, using an IMS/DB database;
aaah, the memories!). After a few days of studying the documentation and
comparing it to the actual source code, I knew of a way to greatly
improve the quality of the documentation in just a few minutes: I binned
all of the documentation! It was seriously outdated. The code had moved
on but the docs had not. If I had kept the documentation, I would have
wasted lots of my time, and run the risk of making incorrect changes by
relying on the outdated documentation. Binning the documentation ensured
that both I and all my successors should use the source as the only
trustworthy source of information, until finally budget was approved to
re-document the thing (which, of course, never happened).
Wouldn't you much rather have a design with just the is_active column and no active_date, so that you know to tell upper management that this new requirement involves changes to the database and can only be tracked for new widgets? <<
No, hell no! In the real world, new requirements almost always have a
temporal component. See another posting of mine about the drinking
age changing in the US. Before the laws changed, Sally was a
perfectly bartender when I hired her at 18. The next week, I have to
take her off the bar and put a wrist band on her (to mark underage
servers). If I "flip her flag" without the history, I have destroyed
information -- was I engaged in criminal activity with a minor or did
I overpay her for the bartending she could not have been doing?
You get to pick your examples, I get to pick mine.
Imagine a DB app that calls a webservice to check address against postal
code (I believe you call that ZIP code). Since webservices are not 100%
reliable, a list of available webservices is kept in a table. When an
address has to be checked, one of them is called. If the call results in
an error or timeout, the webservice is marked as not in use, and another
webservice is called instead. Once or twice a month, an employee will
check webservices marked "not in use", find out if the problem is
permanent or temporal and mark the service as "in use" once a temporal
problem is resolved, or remove it completely if the problem is
permanent. Management decides not to store when exactly these services
started or stopped being in use, as they are just using some service
provided for free by some third parties.
Three months later, it turns out that one of the services used isn't
free after all, but requires a monthly fee. If I had followed your
advice, I might see a status of "in use" from Jan 1st until March 13th,
"not in use" from March 13th until March 16th, and "in use" from March
16th until now. That's almost three full months use - quite a hefty
bill. But in reality, the service was never called until March 13th,
when some other service failed. This service failed too (turns out one
parameter was misconfigured), so it was also given status "not in use".
Three days later, the parameter was repaired and the status was set to
"in use" once more, after which we actually started using it.
With your table, management would probably get a report saying that the
service has been used for three months, with a three day outage.
Grudgingly, they'd pay the bill for three months (with a small discount
for the outage).
With my table, I'd have to report to management that there is no easy
way to figure out the actual use of that particular service. Not very
good, but at least better than the lies you'd tell them. And I'd then
have to try to dig up the information on the actual use in some other
way. Maybe we'd eventually even end up examining the logs from the
service provider. But we would not pay for the months that the service
was not actually used.
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
.
- Follow-Ups:
- Re: How do i write Set based queries and avoid a cursor?
- From: --CELKO--
- Re: How do i write Set based queries and avoid a cursor?
- From: DA Morgan
- Re: How do i write Set based queries and avoid a cursor?
- References:
- Re: How do i write Set based queries and avoid a cursor?
- From: --CELKO--
- Re: How do i write Set based queries and avoid a cursor?
- Prev by Date: Re: SSMS - setting root of Object Explorer
- Next by Date: Re: How do i write Set based queries and avoid a cursor?
- Previous by thread: Re: How do i write Set based queries and avoid a cursor?
- Next by thread: Re: How do i write Set based queries and avoid a cursor?
- Index(es):
Relevant Pages
|