Author |
Message
|
gfrench |
Posted: Wed May 18, 2011 8:34 am Post subject: TCPIP Nodes - Adapter Error 'invalid argument' |
|
|
 Acolyte
Joined: 10 Feb 2002 Posts: 71
|
Googled everywhere. Deploying a flow on 6.1.0.9 with a TCPIPClient Output and TCPIPClient Receive. It works on Windows, it works on Linux. When deployed on Solaris 10 I get the following error:-
Quote: |
May 18 17:24:16 myhost WebSphere Broker v6109[20591]: [ID 702911 user.info] (D1BRKR1B.ExGrp_BS_Queue_02)[54]BIP3450E: An adapter error occurred during the processing of a message. The adapter error message is 'Invalid argument '. : D1BRKR1B.69fd0171-2d01-0000-0080-9547da6bafd0: /build/S610_P/src/DataFlowEngine/NativeTrace/ImbNativeTrace.cpp: 742: MbErrorHandler.throwableToMbException: :
|
If I remove the TCPIP nodes and redeploy, it works. The only slight clue was something to do with file descriptors, but we have a limit of 20,000 set for the userid running the broker. I've tried with and without a configurable service. Same error. Anyone got any thoughts? |
|
lancelotlinc |
Posted: Wed May 18, 2011 9:09 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
Does your bar file contain the XSDZIP? There is a bug in 6.1.x.x where sometimes the XSDZIP is missing or the content of the XSDZIP does not include the right file. _________________ http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |
|
lancelotlinc |
Posted: Wed May 18, 2011 9:24 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
As I remember, there are two files in the XSDZIP: a big-endian and a little-endian representation. It works on your Windows and Linux systems because the right endianness is present, but it does not work on your Solaris system because that needs the other one. |
|
mqjeff |
Posted: Wed May 18, 2011 9:28 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You'd only see an XSDZIP if you were processing XML messages over TCP, using the XMLNSC domain.
I'd consider it more likely that gfrench is using MRM, likely CWF, to process the data coming in over the wire, and thus would have a .dict file and not an XSDZIP.
It is otherwise worth ensuring that the Toolkit building the bar is at the 6.1.0.9 level as well.
It's also worth reviewing a service trace taken during the failure to see whether it contains any complaints suggesting that the Broker cannot access the right network adapter in the Solaris TCP/IP configuration, or something similar. |
|
smdavies99 |
Posted: Wed May 18, 2011 11:07 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
mqjeff wrote: |
You'd only see an XSDZIP if you were processing XML messages over TCP, using the XMLNSC domain.
I'd consider it more likely that gfrench is using MRM, likely CWF, to process the data coming in over the wire, and thus would have a .dict file and not an XSDZIP.
|
mqjeff is right.
There are two different message sets involved. XMLNSC on the input side and CWF on the TCP side.
Graham & I will do some more investigation tomorrow. _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
|
lancelotlinc |
Posted: Wed May 18, 2011 11:29 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
You may like to open a PMR asking why your flow works in little-endian environments and not in big-endian environments. |
|
gfrench |
Posted: Wed May 18, 2011 11:58 pm Post subject: |
|
|
 Acolyte
Joined: 10 Feb 2002 Posts: 71
|
Thanks for all the ideas.
We've had a play removing bits and reconfiguring the endianness of the message set and nodes, with little success.
In desperation we resorted to trying the samples. Same error. Finally we deployed just the TCPIPServerSimulation.msgflow on its own, which uses delimited messages, delimited by x'00'. Same error.
Code: |
May 19 08:54:16 ebiz-dev-esb-broker WebSphere Broker v6109[19477]: [ID 702911 user.info] (D1BRKR1B.ExGrp_BS_Queue_03)[41]BIP2113E: Message broker internal error: diagnostic information 'ServerConnectionManager:runConnectionCreation', '8091', 'Invalid argument'. : D1BRKR1B.1d4bb0c5-2f01-0000-0080-84c8404acc86: /build/S610_P/src/DataFlowEngine/NativeTrace/ImbNativeTrace.cpp: 742: ServerConnectionManager.runConnectionCreation: :
May 19 08:54:16 ebiz-dev-esb-broker WebSphere Broker v6109[19477]: [ID 702911 user.info] (D1BRKR1B.ExGrp_BS_Queue_03)[41]BIP3450E: An adapter error occurred during the processing of a message. The adapter error message is 'Invalid argument '. : D1BRKR1B.1d4bb0c5-2f01-0000-0080-84c8404acc86: /build/S610_P/src/DataFlowEngine/NativeTrace/ImbNativeTrace.cpp: 742: MbErrorHandler.throwableToMbException: :
|
Will look at the service trace and network adapters next and see what we can find... |
|
gfrench |
Posted: Thu May 19, 2011 1:06 am Post subject: |
|
|
 Acolyte
Joined: 10 Feb 2002 Posts: 71
|
Looks like the box has one network adapter. The service trace shows the following error
This seems to point to a file descriptor issue with Solaris 10 and our limit. At first glance the limits look OK:
Code: |
$ ulimit -n
20000
$ ulimit -Hn
unlimited |
However, the manuals state that 65,536 is the default, so I'm not sure why ours is set to 20,000 or what the connection manager requires. Our SA has changed the system to
Code: |
$ ulimit -Hn
unlimited
$ ulimit -n
65536
|
and still we see the problem... Anyone else have any thoughts? |
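One thing that may be worth ruling out at this point is whether the limits shown by ulimit in a login shell are the ones a JVM started under the broker's userid actually runs with, since a process inherits its limits from whatever started it. The following is only a minimal sketch, not anything shipped with the broker. The class name FdLimitCheck is made up for illustration, and it assumes a Sun/Oracle-derived JVM that exposes com.sun.management.UnixOperatingSystemMXBean. It reports on its own JVM, not on the DataFlowEngine.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper: prints the file descriptor counts as seen by this JVM.
class FdLimitCheck {

    /** Returns {open, max} descriptor counts, or null if this JVM does not expose them. */
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: the Unix-specific bean is only available on Sun/Oracle-derived JVMs.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("This JVM does not expose Unix file descriptor counts.");
        } else {
            System.out.println("open descriptors: " + counts[0]);
            System.out.println("max descriptors:  " + counts[1]);
        }
    }
}
```

If the maximum printed here differs from what ulimit -n shows in the shell, that would suggest the limit is being changed somewhere between login and JVM start-up. If it matches, the limit itself is probably being picked up as expected.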
|
gfrench |
Posted: Fri May 20, 2011 12:18 am Post subject: |
|
|
 Acolyte
Joined: 10 Feb 2002 Posts: 71
|
Despair!
Having researched 'java.io.IOException: Invalid argument at sun.nio.ch.DevPollArrayWrapper' on Solaris 10, I am almost certain this is a problem with file descriptors. It's widely reported with Java networking on Solaris. We are running 64-bit and uname reports:-
SunOS ebiz-dev-esb-broker 5.10 Generic_137111-04 sun4v sparc SUNW,Sun-Fire-T200
I've tried the following values for the soft (S) and hard (H) file descriptor limits:-
S=20,000 H=unlimited
S=256 H=65,536
S=20,000 H=128,000
restarting the broker each time, but I always get the same error. Anyone got any other suggestions for me to try?
Thanks |
|
davecrighton |
Posted: Sun May 22, 2011 6:05 am Post subject: |
|
|
Novice
Joined: 13 Jun 2007 Posts: 12
|
There are a number of related bug reports for this in Oracle's bug tracker for the JRE, and the issue is supposedly resolved. You can follow all the related links from here:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6322825
Unfortunately, we've had another customer hit this same issue even though they should have had all those bug fixes in their JRE level. They were able to recreate the problem outside the broker, so they took it up with Oracle support. They never got back to us on the outcome, so I don't know whether there is now a resolution. |
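For anyone wanting to try a recreate outside the broker, something along these lines might be a starting point. It is a minimal sketch under assumptions, not the other customer's test case: the class name SelectorSmokeTest is invented, and it simply opens a java.nio Selector, registers a listening socket on a free loopback port, connects to it and accepts once. The exception quoted earlier names sun.nio.ch.DevPollArrayWrapper, which belongs to the selector implementation on Solaris, so opening and using a selector seems a reasonable place to look.

```java
import java.io.Closeable;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical standalone check: exercises a selector-based accept on loopback.
class SelectorSmokeTest {

    /** Returns "accepted" on success, otherwise a description of what went wrong. */
    static String run() {
        Selector selector = null;
        ServerSocketChannel server = null;
        SocketChannel client = null;
        SocketChannel accepted = null;
        try {
            InetAddress loopback = InetAddress.getByName("127.0.0.1");
            selector = Selector.open();
            server = ServerSocketChannel.open();
            server.configureBlocking(false);
            // Port 0 asks the OS for any free port.
            server.socket().bind(new InetSocketAddress(loopback, 0));
            server.register(selector, SelectionKey.OP_ACCEPT);

            int port = server.socket().getLocalPort();
            client = SocketChannel.open(new InetSocketAddress(loopback, port));

            // Wait up to five seconds for the pending connection to show up.
            if (selector.select(5000) == 0) {
                return "timed out waiting for the connection";
            }
            accepted = server.accept();
            return accepted != null ? "accepted" : "selector woke up but accept() returned null";
        } catch (IOException e) {
            return "failed: " + e;
        } finally {
            closeQuietly(accepted);
            closeQuietly(client);
            closeQuietly(server);
            if (selector != null) {
                try { selector.close(); } catch (IOException ignored) { }
            }
        }
    }

    private static void closeQuietly(Closeable c) {
        if (c != null) {
            try { c.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Run under the broker's userid with the same JRE and ulimit settings, a "failed: java.io.IOException: Invalid argument" result would suggest the problem can be reproduced without the broker. A clean "accepted" would not rule the broker case out, since the broker's connection manager may use the selector differently.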
|
smdavies99 |
Posted: Sun May 22, 2011 7:13 am Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
Dave,
Thanks for the pointer. This is very much in line with what Graham has found.
We can finally get the broker to start!
Ulimit Hard = 256000
Ulimit Soft = 65536
But the downside is that everything is so slow.
Deployments take 5+ times longer and, more often than not, they time out.
I guess the virtual size of the DataFlowEngine processes is a whole lot greater all round now.
We could probably create another broker on the same system under a different user with the right ulimits. That might very well be a workaround. We only need one flow using TCP/IP in our current setup.
It could very well be that we are going to use this as the carrot to move to V7 on Linux. |
|
davecrighton |
Posted: Sun May 22, 2011 8:00 am Post subject: |
|
|
Novice
Joined: 13 Jun 2007 Posts: 12
|
It's going back a bit, but from what I remember the basic reason is that on Solaris every socket has an associated file descriptor, and when you use a selector to do asynchronous I/O you end up with one file descriptor for each of the 64k possible connections you could have on the socket. Basically, the TCP/IP stack writes into a file so that the thread waiting to accept() knows it has a request for its socket rather than for some other client on the same port.
Normally these get cleared out transparently, but for some reason, under some conditions on Solaris, it looks like this doesn't happen. As I said, we never really got to the bottom of it, as the customer pursued it through Oracle support.
While this may result in a large number of file descriptors, I'm not sure why it should cause a large rise in virtual memory size. Are you able to determine whether this is mostly JVM heap or native memory?
I'm guessing the timeouts are due to socket operations being impacted? Are there any associated exceptions other than the timeout message? |
|