Re: sql user defined function called on every row



Steve Richter wrote:
On Jan 9, 10:36 am, Graybeard <graybe...@xxxxxxxxxxxxxx> wrote:
While I agree with your complaint, how big a deal is it really? This
UDF is prob just substring to get the date flipped. I would add to
the UDF an IF parm = ' ' return '0001-01-01' so the UDF can handle
the date flip as desired and let it call. This should be a
DETERMINISTIC call anyway - does it really slow down performance that
much ?

apparantly calling functions does not slow the query down by much. ( I
had assumed otherwise. ) I just ran a test that did a group by on the
function and then a 2nd time w/o the function ( group by the input to
the function )

I suspect you're not running it across enough rows to note the potential time difference. I created a table with 300,000 rows of a six-byte m/d/y date field. Half the rows had a valid m/d/y date; the other half contained blanks. I amended the function not to choke on the blank dates - it now returns date('0001-01-01') for the blanks, as you were trying to do in your CASE statement:

create function mylib.cvtdatemdy (mdy char(6))
returns date language sql deterministic
begin
if mdy <> ' ' then
return date('20'|| right(mdy,2)|| '-' || left(mdy,2) ||
'-' || substr(mdy,3,2));
else
return date('0001-01-01');
end if;
end;

I then made several successive runs of two queries to build a temporary table of DATE values. The first query did what you first attempted to do (even though it now no longer needs the CASE statement, due to the function's correctly handling blanks):

declare global temporary table datestuff as
(select mdy_col,
case mdy_col
when ' ' then date('0001-01-01')
else mylib.cvtdatemdy(mdy_col) end thedate
from mylib.mydatetbl) with data


The second query implemented the inelegant solution I wrote up yesterday, using the UNION:

declare global temporary table datestuff as
(select mdy_col, mylib.cvtdatemdy(mdy_col) thedate
from mylib.mydatetbl
where mdy_col <> ' '
union all
select mdy_col, date('0001-01-01') thedate
from mylib.mydatetbl
where mdy_col = ' ') with data


The first one, which calls the function for all 300,000 rows in the table, consistently took approximately twice as long, as measured by CPU milliseconds, to run. The reason is that the function is implemented on the System i as an ILE C service program, and there is overhead in calling it.

As the involuntarily assigned SQL guru at my company (not that I know all that much about SQL and DB2; just more than anyone else there), that's why I always tell the developers not to use user-defined functions for batch jobs that are going to churn through hundreds of thousands, or possibly millions, of rows of data. For an interactive application, in which a user is going to fill up a 10 row subfile, the extra overhead from calling the UDF is probably negligible; but not for long-running batch jobs.
.



Relevant Pages

  • Re: noise words, @@ERROR, and stop and resume indexing
    ... This can be resolved without making the UDF "infallible", ... Removing the noise words, as well as not passing empty or null strings ... SQL Full Text Search Blog ... if this query causes an ignored-word error ...
    (microsoft.public.sqlserver.fulltext)
  • Re: Dynamisches Füllen einer Return Table in SQL Funktion
    ... da die Erstellung einer UDF zwingend ... Gruß Thomas ... >kann man mit dynamischen SQL nicht ansprechen, ...
    (microsoft.public.de.sqlserver)
  • Re: "Error Converting varchar to int" when executing SQL Server udf
    ... >I have the following SQL Server udf: ... > create function dbo.udfIsExistingUser (@UserNodeId int = NULL) ... stored procedure -- probably by older version components that do not know ... that may work for you: since the return of your UDF is essentially ...
    (microsoft.public.vb.database.ado)
  • Re: Need Help Converting Decimal Date Values to Date Values
    ... convert these decimals to date values? ... If you're doing this in SQL, as your inquiry suggests, then you could either write a user-defined function (UDF) to do it, or you could implement the same code ... But if, on the other hand, you were using the UDF in a batch job that read through a substantial number of rows of data, you would notice a significant performance hit. ...
    (comp.sys.ibm.as400.misc)
  • Re: getdate() in UDF
    ... You can use openrowset to get the date inside of a UDF: ... RETURNS datetime ... DECLARE @dt datetime ... 'SELECT getdateAS thedate') ...
    (microsoft.public.sqlserver.programming)