Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: hpuxrac <johnbhurley@xxxxxxxxxxxxx>
- Date: Tue, 11 Mar 2008 16:14:53 -0700 (PDT)
On Mar 11, 9:59 am, Carlos <miotromailcar...@xxxxxxxxxxxx> wrote:
On 11 mar, 09:43, Carlos <miotromailcar...@xxxxxxxxxxxx> wrote:
On 10 mar, 18:27, DA Morgan <damor...@xxxxxxxxx> wrote:
Carlos wrote:
On 10 mar, 14:08, DA Morgan <damor...@xxxxxxxxx> wrote:
Carlos wrote:
On 10 mar, 13:21, DA Morgan <damor...@xxxxxxxxx> wrote:A few additional thoughts.
Carlos wrote:Hello Daniel:
Hello all:Some of what you've written seems self-conflicting.
I'm facing a strange problem on a 10.2.0.3 Oracle on Oracle Enterprise
Linux 5 server.
This morning someone was issuing a SELECT from a windows client via
ODBC and suddenly all the server communications broke down: even no
ping from or to that server worked.
Tried stop&start network but no avail. The server appears to be dead
at communications until reboot.
So I tried to repeat the SELECT from sqlplus on a windows client to a
OS file with SPOOL and all the communications on the server are gone
once again. Ping only answers to 127.0.0.1 or its own IP address, and
Linux is telling me that the network card is OK. Stop&start network
didn't fix the problem either.
Client sqlnet.log shows:
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version
10.2.0.1.0 - Production
Time: 10-MAR-2008 12:11:59
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: Message 12535 not found; No message file for
product=NETWORK, facility=TNS
ns secondary err code: 12560
nt main err code: 505
TNS-00505: Message 505 not found; No message file for
product=NETWORK, facility=TNS
nt secondary err code: 60
nt OS err code: 0
Client address: <unknown>
The select is a two column (varchar2(9), char(1)) select of 11 million
rows aprox.
Could anybody tell me where to search next? (I didn't find anything at
metalink).
TIA.
Cheers.
Carlos.
You start off telling us you've 10.2.0.3 and then everything
thereafter indicates 10.2.0.1.
Then you say "dead until reboot." Does that mean it is now
working fine or not?
How about queries using Oracle's native interface SQLNET
rather than ODBC?
Ping only answers to 127.0.0.1, I presume on that machine,
have you verified that other addresses exist?
Have you performed an ipconfig /renew?
Perform an ipconfig /all ... what does it show?
Are you using DHCP?
Can you ping the server from elsewhere on the network?
What are your firewall settings?
What tool is connecting using ODBC?
--
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damor...@xxxxxxxxxxxxxxxx (replace x with u to respond)
Puget Sound Oracle Users Groupwww.psoug.org
You start off telling us you've 10.2.0.3 and then everythingYup: 10.2.0.3 is the server on EL5. 10.2.0.1 was the windows client
thereafter indicates 10.2.0.1.
from which I did the test.
Then you say "dead until reboot." Does that mean it is nowWorking fine after reboot... until I issue the SELECT again.
working fine or not?
How about queries using Oracle's native interface SQLNETThis is what I did on the windows 10.2.0.1 client with sqlplus.
rather than ODBC?
Ping only answers to 127.0.0.1, I presume on that machine,Yes. If I ping the server network IP address from the server itself,
have you verified that other addresses exist?
it answers OK, but no answer from any other IP in the network. If I
ping the server from another machine there is no answer.
Nope.Are you using DHCP?
Can you ping the server from elsewhere on the network?Nope
What are your firewall settings?No firewall. Isolated environment.
What tool is connecting using ODBC?Visual FoxPro. But, as said before, sqlplus bring the server
communications to its knees too.
I'll try the ipconfig (after repeating the whole process from startup
the server again).
Thank you very much.
Cheers.
Carlos.
1. Upgrade the client to 10.2.0.3
2. Post the SQL statement: I'd like to see it
3. Capture the statement and see if it contains any non-visible
characters when Fox sends it through
4. What ODBC driver? Try changing drivers
Is it only this one SQL statement or can more than one statement
do it? What happens if you take the problem statement and
reconstruct it from scratch in SQL*Plus starting with the
simplest SELECT single_column FROM single_table and then
incrementally add back its full functionality? When does it
break?
--
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damor...@xxxxxxxxxxxxxxxx (replace x with u to respond)
Puget Sound Oracle Users Groupwww.psoug.org
Daniel:
I am doing the test from sqlplus (no ODBC this time)
2. Post the SQL statement: I'd like to see it
select /*+ full(a) */
a.col1,
a.col2
from schema.table_name a;
It starts displaying/spooling and then breaks with:
...
955800207 T
ERROR:
ORA-03135: la conexión ha perdido contacto
10078290 filas seleccionadas.
ERROR:
ORA-03114: no conectado a ORACLE
(There are 11339771 rows in the table).
From here on, there is no ping from or to the server.
ifconfig on the server shows:
RX packets:674109 errors:0 dropped:0 overruns:0 frame:0
TX packets:674109 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:53271778 (50.8 MiB) TX bytes:259126296 (247.1 MiB)
Interrupt:169
And it's time to reboot again...
Cheers.
Carlos.
Aha:
https://metalink.oracle.com/metalink/plsql/f?p=130:14:114792536629517....
The title is: Subject: ORA-03113 Or ORA-03135 Reported for Failover
with Network Outage
The doc id is: Note:455771.1
The date is: 10-MAR-2008
That's today.
The bug number is:
Bug 5758870 Select Type TAF fails with ORA-3113 If Network Cable Pulled off
The solution is:
Problem is fixed from 11g onwards and in 10.2.0.4 Patch set release.
One off patchs for some platforms are available.
Apply one-off Patch 5758870 for your platform, if available.
Are you on a RAC cluster?
--
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damor...@xxxxxxxxxxxxxxxx (replace x with u to respond)
Puget Sound Oracle Users Groupwww.psoug.org
Daniel:
First of all: thank you very much for your attention and time.
Are you on a RAC cluster?
Nope. Single Instance.
I've taken a look at the metalink note that you kindly pointed, but no
TAF here.
I'm beggining to suspect of a faulty network card. I've noticed that
the problem occurs when there is a long information transfer from
server to client, but not based on size, but on the time the transfer
takes. I checked with FTP and big files, but it takes not too much
time to transfer (the network card and switch are 1000Mbps) and the
error does not arise. But when one sqlplus session takes too long to
pull out a query the error appears in a random fashion: sometimes at
10M rows, sometimes at 7M rows, even at 700000 rows.
The server itself works OK as far as its 'local' network activity
goes: the listener is up & running and the enterprise manager works
too (from a local browser). The same query from a local sqlplus (with
listener connection, no bequeath) works too.
Michael:
This can be tested by moving the system cable to a different switch port
Yes. I thought of that too (switch issues). But the network boy swears
everything is OK. I'll try to convince him to test on other switch.
Thank you all for the suggestions and hints.
Cheers.
Carlos.
Solved (or so it seems)...
After a lot of tests, I discovered that the problem always appeared
when the client and the server were connected to the same switch
(!!!). When they were connected to different switches the query wenk
OK (I think it's because there where 100mbps sections between them: I
suspect of some kind of overflow issue at the network card).
I searched for a linux driver for the network card (Marvell Yukon).
After downloading it, I moved my way through the installation process
(compiling with the Linux source files) and finally 'modprobed' the
driver...
...and the problem has not appeared again since then (fingers
crossed)!
Cheers.
Carlos
We are just starting to take a look at linux.
Modprobed the driver ... what does that mean?
.
- Follow-Ups:
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: joel garry
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- References:
- Client sqlplus SELECT + SPOOL knocks down server network
- From: Carlos
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: DA Morgan
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: Carlos
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: DA Morgan
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: Carlos
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: DA Morgan
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: Carlos
- Re: Client sqlplus SELECT + SPOOL knocks down server network
- From: Carlos
- Client sqlplus SELECT + SPOOL knocks down server network
- Prev by Date: Re: Oracle on Windows vs Solaris or Linux ?
- Next by Date: Re: Oracle on Solaris or Linux
- Previous by thread: Re: Client sqlplus SELECT + SPOOL knocks down server network
- Next by thread: Re: Client sqlplus SELECT + SPOOL knocks down server network
- Index(es):
Relevant Pages
|
Loading