SDR channel times out and starts immediately when F5 involve
jaswant.bea979 (Novice, joined 15 Mar 2018, posts: 16)
Posted: Thu Mar 15, 2018 7:37 am
Hi Folks,
I'm in the middle of migrating a standalone MQ 7.5 setup to a multi-instance queue manager on MQ 8.0. All servers, old and new, run Linux.
Old Flow:
Application (using MQ client, CCDT, SVRCONN) > QueueManager (CS2691) IP:PORT
SDR Channel (CS2691.S1.CS691) > another F5 > Active QueueManager (CS691) IP:PORT -- (this hasn't changed)
New Flow:
Application (using MQ client, CCDT, SVRCONN) > LoadBalancer F5 > Active QueueManager (CS2691) IP:PORT
SDR Channel (CS2691.S1.CS691) > another F5 > Active QueueManager (CS691) IP:PORT -- (this hasn't changed)
DISCINT is zero for the SDR channel.
dis CHANNEL(CS2691.S1.CS691) CHLTYPE(SDR)
2 : dis CHANNEL(CS2691.S1.CS691) CHLTYPE(SDR)
AMQ8414: Display Channel details.
CHANNEL(CS2691.S1.CS691) CHLTYPE(SDR)
ALTDATE(2018-03-14) ALTTIME(09.36.27)
BATCHHB(0) BATCHINT(0)
BATCHLIM(5000) BATCHSZ(500)
CERTLABL( ) COMPHDR(NONE)
COMPMSG(NONE) CONNAME(x.x.x.x(yyyy))
CONVERT(NO) DESCR( )
DISCINT(0) HBINT(300)
KAINT(AUTO) LOCLADDR( )
LONGRTY(999999999) LONGTMR(1200)
MAXMSGL(4194304) MCANAME( )
MCATYPE(PROCESS) MCAUSER( )
MODENAME( ) MONCHL(QMGR)
MSGDATA( ) MSGEXIT( )
NPMSPEED(FAST) PASSWORD( )
PROPCTL(COMPAT) RCVDATA( )
RCVEXIT( ) RESETSEQ(NO)
SCYDATA( ) SCYEXIT( )
SENDDATA( ) SENDEXIT( )
SEQWRAP(999999999) SHORTRTY(10)
SHORTTMR(60) SSLCIPH( )
SSLPEER( ) STATCHL(QMGR)
TPNAME( ) TRPTYPE(TCP)
USEDLQ(YES) USERID( )
XMITQ(xxxxxxxxxxxx)
From logs:
-------------------------------------------------------------------------------
03/15/2018 08:21:39 AM - Process(30650.1) User(mqm) Program(runmqchl)
Host(mwpayivlsu101.pre.us.santanderus.corp) Installation(Installation1)
VRMF(8.0.0.5) QMgr(CS2691)
AMQ9206: Error sending data to host 180.24.80.24(50020).
EXPLANATION:
An error occurred sending data over TCP/IP to 180.24.80.24(50020). This may be
due to a communications failure.
ACTION:
The return code from the TCP/IP(write) call was 104 X('68'). Record these
values and tell your systems administrator.
----- amqccita.c : 3166 -------------------------------------------------------
03/15/2018 08:21:39 AM - Process(30650.1) User(mqm) Program(runmqchl)
Host(mwpayivlsu101.pre.us.santanderus.corp) Installation(Installation1)
VRMF(8.0.0.5) QMgr(CS2691)
AMQ9999: Channel 'CS2691.S1.CS691' to host '180.24.80.24(50020)' ended
abnormally.
EXPLANATION:
The channel program running under process ID 30650 for channel
'CS2691.S1.CS691' ended abnormally. The host name is '180.24.80.24(50020)';
in some cases the host name cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
----- amqrccca.c : 1090 -------------------------------------------------------
03/15/2018 08:26:39 AM - Process(30968.1) User(mqm) Program(runmqchl)
Host(mwpayivlsu101.pre.us.santanderus.corp) Installation(Installation1)
VRMF(8.0.0.5) QMgr(CS2691)
AMQ9002: Channel 'CS2691.S1.CS691' is starting.
EXPLANATION:
Channel 'CS2691.S1.CS691' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
03/15/2018 08:46:39 AM - Process(30968.1) User(mqm) Program(runmqchl)
Host(mwpayivlsu101.pre.us.santanderus.corp) Installation(Installation1)
VRMF(8.0.0.5) QMgr(CS2691)
AMQ9206: Error sending data to host 180.24.80.24(50020).
EXPLANATION:
An error occurred sending data over TCP/IP to 180.24.80.24(50020). This may be
due to a communications failure.
ACTION:
The return code from the TCP/IP(write) call was 104 X('68'). Record these
values and tell your systems administrator.
----- amqccita.c : 3166 -------------------------------------------------------
03/15/2018 08:46:39 AM - Process(30968.1) User(mqm) Program(runmqchl)
Host(mwpayivlsu101.pre.us.santanderus.corp) Installation(Installation1)
VRMF(8.0.0.5) QMgr(CS2691)
AMQ9999: Channel 'CS2691.S1.CS691' to host '180.24.80.24(50020)' ended
abnormally.
Questions:
Could you help me and suggest possible reasons why this channel goes down and then starts again immediately?
I talked to my network team, and they said their TCP idle timeout is 600 seconds. That is greater than HBINT (300), so the heartbeats should keep the LB from terminating the connection.
Thank you,
JSingh
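As a quick sanity check on the timings in the log excerpt above, the gaps between the three distinct timestamps can be computed and compared with HBINT(300) and the 600-second idle timeout quoted by the network team. This is only a rough sketch in Python: the event labels and the helper name `gaps_in_seconds` are illustrative, and the timestamps were copied by hand from the log.

```python
from datetime import datetime

# Timestamps copied from the error log excerpt above; the labels are shorthand.
EVENTS = [
    ("03/15/2018 08:21:39 AM", "AMQ9999 channel ended abnormally"),
    ("03/15/2018 08:26:39 AM", "AMQ9002 channel is starting"),
    ("03/15/2018 08:46:39 AM", "AMQ9206 error sending data"),
]

HBINT_SECONDS = 300            # HBINT(300) from the channel definition
LB_IDLE_TIMEOUT_SECONDS = 600  # idle timeout quoted by the network team


def gaps_in_seconds(events):
    """Return the seconds elapsed between each pair of consecutive events."""
    times = [datetime.strptime(ts, "%m/%d/%Y %I:%M:%S %p") for ts, _ in events]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    gaps = gaps_in_seconds(EVENTS)
    for (_, before), (_, after), gap in zip(EVENTS, EVENTS[1:], gaps):
        print(f"{before} -> {after}: {gap}s "
              f"({gap / HBINT_SECONDS:.1f} x HBINT, "
              f"{gap / LB_IDLE_TIMEOUT_SECONDS:.1f} x LB idle timeout)")
```

With these inputs the channel restarts 300 seconds after the first failure and fails again 1200 seconds after that start, so both gaps are whole multiples of HBINT. That pattern alone does not show which side dropped the connection.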
fjb_saper (Grand High Poobah, joined 18 Nov 2003, posts: 20756, location: LI,NY)
Posted: Thu Mar 15, 2018 8:27 pm
So you're making a connection from the US to Japan, and you have set the disconnect interval to 0. This is quite brave. For a connection that is transcontinental and obviously subject to noise and errors, I would strongly recommend a disconnect interval you can both live with. It helps keep the connection healthy.
Have fun
jaswant.bea979 (Novice, joined 15 Mar 2018, posts: 16)
Posted: Fri Mar 16, 2018 5:33 am
Thank you for your reply. Initially DISCINT was 6000 (the default). I noticed this error and reached out to the network team. The response was the usual one: no issue at the network level. So I had to change it to zero.
Is there a way to capture logs or a trace on the MQ side to find out why the connection is being dropped, or who terminated it?
I'm not sure what this means: TCP/IP(write) call was 104 X('68').
Also, I don't understand why the channel starts immediately after it "ends abnormally".
I kept checking MSGS for this channel and GETTIME for the attached transmission queue, and it doesn't seem to matter whether messages flow or not. For instance, the channel went down at 10:30 am EST and ended abnormally again at 11:45 am EST, and no messages went through in between (so no triggering), yet it came back up automatically. Even when messages flow, I still get this error.
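On the `104 X('68')` question: the message gives the return code of the failed TCP write in decimal and in hex, and on Linux errno 104 is ECONNRESET ("Connection reset by peer"). The sketch below decodes such a pair. It assumes Linux errno numbering (other platforms number these errors differently), the table is a small hand-written subset, and `describe_return_code` is a hypothetical helper, not part of MQ.

```python
# A few Linux errno values that commonly show up on TCP sockets.
# Assumption: Linux numbering; this is a hand-written subset, not a full table.
LINUX_ERRNO = {
    32: ("EPIPE", "Broken pipe"),
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
    111: ("ECONNREFUSED", "Connection refused"),
}


def describe_return_code(decimal_code, hex_text):
    """Check that the decimal and hex forms agree, then name the errno."""
    if int(hex_text, 16) != decimal_code:
        raise ValueError(f"X('{hex_text}') does not match {decimal_code}")
    name, text = LINUX_ERRNO.get(decimal_code, ("UNKNOWN", "not in this table"))
    return f"{decimal_code} (0x{decimal_code:02X}): {name} - {text}"


if __name__ == "__main__":
    # Values taken from the AMQ9206 message in the first post.
    print(describe_return_code(104, "68"))
```

ECONNRESET means the local TCP stack received a reset for the connection. It does not by itself say whether the remote queue manager or a device in the path, such as a load balancer, sent that reset.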
jaswant.bea979 (Novice, joined 15 Mar 2018, posts: 16)
Posted: Fri Mar 16, 2018 5:37 am
I kept checking MSGS for this channel and GETTIME for the attached transmission queue, and it doesn't seem to matter whether messages flow or not. For instance, the channel went down (ended abnormally) at 10:30 am EST and started again immediately, then went down again at 11:45 am EST and started again. No messages went through during this time (so no triggering). Even when messages flow, I still get this error.
jaswant.bea979 (Novice, joined 15 Mar 2018, posts: 16)
Posted: Wed Mar 21, 2018 5:37 am
Using the trace file, IBM Support found that the RCVR side was ending the connection because of a missing heartbeat.
Command for trace:
strmqtrc -m QMGR_NAME -t detail -t all
I then checked with the LB team, and they enabled keepalive on the LB that sits in front of the RCVR channel's queue manager, which resolved the issue.
Thanks.
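For readers unfamiliar with what "keepalive" means at the TCP level, here is a minimal Python sketch that turns it on for a single socket. It only illustrates the concept: it is not the F5 setting the LB team changed and not an MQ configuration. The helper name `enable_keepalive` and the timer values are assumptions chosen for illustration (an idle time below the 600-second LB timeout mentioned earlier), and the `TCP_KEEP*` options are Linux-specific.

```python
import socket


def enable_keepalive(sock, idle_seconds=240, interval_seconds=30, probes=5):
    """Turn on TCP keepalive for one socket; return True if it is enabled.

    The timer values are illustrative assumptions, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket timers exist on Linux; skip them where unavailable.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("keepalive enabled:", enable_keepalive(s))
    finally:
        s.close()
```

With keepalive on, the TCP stack sends small probes on an otherwise idle connection, so intermediate devices see traffic and a dead peer is detected sooner.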