Re: utl_smtp Hanging when opening a connection
- From: Ian M <noemailherethanks@xxxxxxxx>
- Date: Fri, 30 Nov 2007 01:29:31 +0100
joel garry wrote:
On Nov 29, 1:24 pm, Ian M <noemailheretha...@xxxxxxxx> wrote:
collins.pa...@xxxxxxxxxxxxxx wrote:
We are using utl_smtp to send emails from our database. We currently
run Oracle 9.2 on RHEL4, with Postfix as our mailserver.
Which 9.2? Some platform-specific bugs have been fixed in the later
patches. One always has to wonder what wasn't found. I'm sure you've
seen metalink Note:390852.1, in the realm of silly code idiosyncrasies
one could easily miss.
Lately, we have noticed that the process creating the email is
hanging, which often requires the database to be taken down to fully
kill off the session. (This is on our test system, fortunately).
Have you tried killing the session from the OS level, rather than db?
PMON may then be more accommodating than SMON. How different is your
test system from your production?
Having looked at a number of posts on various forums, I have seen
that this is not an uncommon problem. Oracle have included a timeout
parameter in utl_smtp.open_connection, but this is not implemented for
write processes (in version 9.1 to 10.2, although more may be the
same), and from comments, I have seen that this timeout functionality
does not apply to opening the connection itself.
Has anybody come up with a solution to programmatically cause the open
connection to time out if a connection is not established in a
reasonable time, and if so, can you please help me?
Many thanks in anticipation,
Paul
Hi Paul,
I am not sure about RHEL4 but I had a similar situation a few years back
on a HP box with frequent external network problems.
To reduce the impact of this I amended the server's TCP settings (I think
it was tcp_ip_abort_cinterval, though I'm not 100% sure). This caused the
failed open connection attempts to close much faster, which was useful
for server scanning failures etc.
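For the Linux side of the question, the closest knob is the net.ipv4.tcp_syn_retries sysctl rather than an ndd setting. As a rough sketch, the helper below (syn_timeout is a hypothetical name, not an existing tool) estimates how long an unanswered connect will block, assuming each SYN retransmit doubles the previous wait. The 3-second initial timeout is an assumption that matches older 2.6 kernels; newer kernels start at 1 second, and the kernel's upper cap on the retransmit interval is ignored here.

```shell
# syn_timeout RETRIES INITIAL_RTO
# Hypothetical helper: estimate the seconds an unanswered TCP connect
# blocks, assuming the wait doubles after every SYN (re)transmission.
# RETRIES     value of net.ipv4.tcp_syn_retries
# INITIAL_RTO initial retransmit timeout in seconds (assumed, kernel specific)
syn_timeout() {
    retries=$1
    rto=$2
    total=0
    i=0
    # One wait after the initial SYN plus one after each retry.
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        i=$((i + 1))
    done
    echo "$total"
}
```

Something like `syn_timeout "$(sysctl -n net.ipv4.tcp_syn_retries)" 3` gives the estimate for the running system; lowering the sysctl shortens the wait for every outgoing connection on the box, not only the mail traffic.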
The details of those sorts of things tend to be very platform and OS
version specific. This tcp twiddle was no longer needed when the
system was upgraded to hp-ux 11i, for example:
http://groups.google.com/group/comp.databases.oracle.server/msg/b8f8ae3a5be16fa1?dmode=source
Not long ago I saw an hp-ux box that isn't meant to talk to the
outside world. So resolv.conf didn't have anything in it. So
sendmail started up with the wrong domain. So when the raid
monitoring software tried to send local mail (with a domain specified)
to root to say a disk had failed, it would just get stuck in mqueue.
And then sendmail would put more mqueue files out there to say it had
tried and failed to send a message. Then more to say it's been trying
for 5 days. The five day window would allow about 32000 messages to
hang around there (or was that an inode issue?), plus about 255
sendmail and 255 rmail processes. When I tried to rid the mqueue of
the files, that allowed the processes to take over all the processors,
not very nice to telnet. I eventually got all that sorted out, added
the dns server reference to resolv.conf, killed/restarted sendmail,
database was able to continue, all was well with the world except for
a hot-swappable disk. And except everything that depended on not
having a domain specified was broken. Fortunately that quickly blew
up a process that was configured to send me mail from another machine
when the standby log transport failed, so I noticed it before anything
else messed up.
Moral: Check everything, even on a system that appears to be working
and no one complains about and has monitoring software that notifies
you when things go wrong.
jg
--
@home.com is bogus.
mo' money, mo' money, mo'money! http://www.internetnews.com/bus-news/article.php/3712566
Hi Joel,
I found your comment on 11i interesting and, frankly, surprising, as I was not aware of a change here. I decided to test it, and the results may surprise you.
# uname -a
HP-UX XXXXXXXX B.11.11 U 9000/800.....
I checked the current setting:
# /usr/bin/ndd -get /dev/tcp tcp_ip_abort_cinterval
75000
The IP 123.123.123.123 is blocked by a firewall, so it behaves like any network-down problem.
S=$SECONDS;telnet 123.123.123.123 1521;echo "$(($SECONDS-$S)) Seconds."
....
75 Seconds.
So the connection sits in SYN_SENT for 75 seconds before the attempt is aborted.
I then set this value to 10 seconds.
# /usr/bin/ndd -set /dev/tcp tcp_ip_abort_cinterval 10000
S=$SECONDS;telnet 123.123.123.123 1521;echo "$(($SECONDS-$S)) Seconds."
....
10 Seconds.
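Where changing a system-wide TCP setting is not an option, a script can impose its own limit on a single connection attempt. The following is a minimal sketch of that idea, not a fix for utl_smtp itself: run_with_deadline is a hypothetical helper, and it assumes a POSIX shell plus a date that supports +%s (under ksh, $SECONDS as used in the test above would do the same job).

```shell
# run_with_deadline LIMIT COMMAND [ARGS...]
# Hypothetical helper: run COMMAND, kill it if it is still running after
# LIMIT seconds, and report the elapsed time like the test above.
run_with_deadline() {
    limit=$1
    shift
    start=$(date +%s)

    "$@" &                          # start the command in the background
    cmd_pid=$!

    # Watchdog: after LIMIT seconds, kill the command if it is still alive.
    # Its output is discarded so it never holds the caller's stdout open.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    dog_pid=$!

    status=0
    wait "$cmd_pid" || status=$?    # non-zero if the command failed or was killed

    # The command has ended; cancel the watchdog if it is still waiting.
    kill "$dog_pid" 2>/dev/null || true
    wait "$dog_pid" 2>/dev/null || true

    echo "$(( $(date +%s) - start )) Seconds."
    return "$status"
}
```

For example, `run_with_deadline 10 telnet 123.123.123.123 1521` should give up after roughly 10 seconds regardless of the tcp_ip_abort_cinterval value. This only bounds a probe run from the shell; a session already blocked inside the database is not affected by it.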
Obviously I agree completely on the platform and OS specific comment; the moral is indeed to check everything.
Regards, Ian.