Author |
Message
|
VJ |
Posted: Sun Apr 24, 2011 8:36 pm Post subject: Channel goes to Retrying state and Message Seq number error |
|
|
Newbie
Joined: 24 Nov 2010 Posts: 5
|
I am establishing a channel connection between Linux and a mainframe.
Linux is 5.5 and MQ is 7.0.1.3.
Linux is the sender and the mainframe is the receiver. There is no SSL configured between them.
When I start the channel it runs, but after some time I get an error in the log and then the channel goes into the Retrying state.
LOG:
Quote: |
-----
----- amqrmrsa.c : 533 --------------------------------------------------------
04/25/2011 09:01:16 AM - Process(15604.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9002: Channel 'QM1.TO.QM2' is starting.
EXPLANATION:
Channel 'QM1.TO.QM2' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
04/25/2011 10:32:12 AM - Process(15604.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9209: Connection to host 'xx.xx.xx.xx(yyyy)' closed.
EXPLANATION:
An error occurred receiving data from 'xx.xx.xx.xx(yyyy)' over TCP/IP. The
connection to the remote host has unexpectedly terminated.
ACTION:
Tell the systems administrator.
----- amqccita.c : 3473 -------------------------------------------------------
04/25/2011 10:32:12 AM - Process(15604.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'QM1.TO.QM2' ended abnormally.
ACTION:
Look at previous error messages for channel program 'QM1.TO.QM2' in the
error files to determine the cause of the failure.
----- amqrccca.c : 921 --------------------------------------------------------
04/25/2011 10:32:21 AM - Process(32536.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9002: Channel 'QM1.TO.QM2' is starting.
EXPLANATION:
Channel 'QM1.TO.QM2' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
04/25/2011 10:32:22 AM - Process(32536.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9526: Message sequence number error for channel 'QM1.TO.QM2'.
EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 109 has been sent when sequence number
101 was expected. The remote host is 'xx.xx.xx.xx(yyyy)'.
ACTION:
Determine the cause of the inconsistency. It could be that the synchronization
information has become damaged, or has been backed out to a previous version.
If the situation cannot be resolved, the sequence number can be manually reset
at the sending end of the channel using the RESET CHANNEL command.
----- cmqxrfpt.c : 448 --------------------------------------------------------
04/25/2011 10:32:22 AM - Process(32536.1) User(mqm) Program(runmqchl)
Host(HOST1)
AMQ9506: Message receipt confirmation failed.
EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 109 has been sent when sequence number
101 was expected. The remote host is '10.20.1.35(1515)'.
ACTION:
Determine the cause of the inconsistency. It could be that the synchronization
information has become damaged, or has been backed out to a previous version.
If the situation cannot be resolved, the sequence number can be manually reset
at the sending end of the channel using the RESET CHANNEL command.
----- cmqxrfpt.c : 448 --------------------------------------------------------
04/25/2011 10:32:22 AM - Process(32536.1) User(mqm) Program(runmqchl)
Host(otdsnl01)
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'OTDS1P.TO.CSQQ' ended abnormally.
ACTION:
Look at previous error messages for channel program 'OTDS1P.TO.CSQQ' in the
error files to determine the cause of the failure.
----- amqrccca.c : 921 --------------------------------------------------------
04/25/2011 10:33:21 AM - Process(968.1) User(mqm) Program(runmqchl)
Host(otdsnl01)
AMQ9002: Channel 'OTDS1P.TO.CSQQ' is starting.
EXPLANATION:
Channel 'OTDS1P.TO.CSQQ' is starting.
ACTION:
None.
-------------------------------------------------------------------------------
04/25/2011 10:33:22 AM - Process(968.1) User(mqm) Program(runmqchl)
Host(otdsnl01)
AMQ9526: Message sequence number error for channel 'OTDS1P.TO.CSQQ'.
EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 109 has been sent when sequence number
101 was expected. The remote host is '10.20.1.35(1515)'.
ACTION:
Determine the cause of the inconsistency. It could be that the synchronization
information has become damaged, or has been backed out to a previous version.
If the situation cannot be resolved, the sequence number can be manually reset
at the sending end of the channel using the RESET CHANNEL command.
----- cmqxrfpt.c : 448 --------------------------------------------------------
|
I can see uncommitted messages in the transmission queue. I resolve the channel and then reset it, after which the channel comes back to the Running state. After some time the same problem recurs.
Your advice is required on this. |
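To see how far apart the two ends are each time the channel fails, the sequence numbers can be pulled out of the AMQ9526 entries. This is a minimal Python sketch, assuming the error log uses the same wording as the extract above. The function name `sequence_mismatches` is made up for illustration, and the text is matched after collapsing the line wraps.

```python
import re

# Matches the AMQ9526/AMQ9506 explanation text once line wraps are collapsed.
_MISMATCH = re.compile(
    r"sequence number (\d+) has been sent when sequence number (\d+) was expected"
)


def sequence_mismatches(log_text):
    """Return (sent, expected) pairs found in an MQ error log extract."""
    flattened = " ".join(log_text.split())  # undo the fixed-width wrapping
    return [(int(sent), int(expected))
            for sent, expected in _MISMATCH.findall(flattened)]


# Sample entry copied from the log quoted above.
sample = """AMQ9526: Message sequence number error for channel 'QM1.TO.QM2'.
EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 109 has been sent when sequence number
101 was expected. The remote host is 'xx.xx.xx.xx(yyyy)'."""

for sent, expected in sequence_mismatches(sample):
    print(f"sender at {sent}, receiver expects {expected}, gap {sent - expected}")
```

Running it over the whole error log after each failure would show whether the gap stays the same or changes between failures.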
|
fjb_saper |
Posted: Sun Apr 24, 2011 9:47 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
OK, so apparently your channel is not only in retry mode, it is also in doubt.
Looks like you have messages that did not make it.
Have you tried RESOLVE CHANNEL(your channel name) ACTION(BACKOUT)? (from memory)...
What happens if you start the channel after that?
You should also talk to your network people. Either the network is unreliable or it is subject to interference. Usually these are the main reasons for a channel to go in doubt.
Also make sure that your receiver only services one channel with the same name, coming from a single host. Changing hosts can have strange influences on channel sequence numbers (as in multi-instance sender??)
Have fun  _________________ MQ & Broker admin |
|
VJ |
Posted: Sun Apr 24, 2011 10:13 pm Post subject: |
|
|
Newbie
Joined: 24 Nov 2010 Posts: 5
|
Hi fjb_saper, thanks for the quick reply.
I tried to resolve the channel with ACTION(BACKOUT). Even after that I could still see uncommitted messages in the transmission queue, so I resolved the channel with ACTION(COMMIT).
Even though I resolve the channel, I need to reset it after that. If I don't reset it, it doesn't come to the Running state when I start it; it only goes to the Retrying state, because I am getting the sequence number error. So we need to reset the channel on both the ends before starting it. |
|
exerk |
Posted: Mon Apr 25, 2011 1:52 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
VJ wrote: |
So we need to reset the channel on both the ends before starting it. |
No, you don't. Resetting the SDR causes the RCVR to reset also; you can just reset the RCVR to the number stated in the log, or the SDR to the number stated in the log (but that is just the same as resetting to the initial value). All of that is moot however - you have an ongoing problem you need to identify the root cause of, and fix. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
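For reference, the two reset options described here can be written out as MQSC. The Python sketch below only builds the command text and does not talk to a queue manager. The helper name is hypothetical, the channel name and the number 109 are taken from the log earlier in the thread, and resetting the sender to 1 is an assumption (1 is the default for SEQNUM).

```python
def reset_channel_mqsc(channel, seqnum=1):
    """Build the MQSC RESET CHANNEL command text (hypothetical helper)."""
    if not 1 <= seqnum <= 999_999_999:
        raise ValueError("SEQNUM must be between 1 and 999999999")
    return f"RESET CHANNEL({channel}) SEQNUM({seqnum})"


# Option 1: run on the sender (Linux). The receiver end is reset as well
# the next time the channel starts.
print(reset_channel_mqsc("QM1.TO.QM2"))

# Option 2: run on the receiver only, moving it to the number the sender
# is already at according to the AMQ9526 message.
print(reset_channel_mqsc("QM1.TO.QM2", 109))
```

Either command only realigns the counters; it does nothing about whatever keeps breaking the connection.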
|
fjb_saper |
Posted: Mon Apr 25, 2011 11:47 am Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
VJ wrote: |
Hi fjb_saper, thanks for the quick reply.
I tried to resolve the channel with ACTION(BACKOUT). Even after that I could still see uncommitted messages in the transmission queue, so I resolved the channel with ACTION(COMMIT).
Even though I resolve the channel, I need to reset it after that. If I don't reset it, it doesn't come to the Running state when I start it; it only goes to the Retrying state, because I am getting the sequence number error. So we need to reset the channel on both the ends before starting it. |
Did you read up in the interconnection manual on how to solve the problem of an "in doubt" channel?
You should not use the resolve actions lightly.
The only reason I suggested the backout was that your sender's sequence number was slightly above the receiver's, and the difference could be accounted for by the in-doubt messages...
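The usual procedure for an in-doubt channel, as the MQ documentation describes it, is to compare two logical unit of work IDs before picking an action: CURLUWID from `DISPLAY CHSTATUS(QM1.TO.QM2) SAVED CURLUWID` on the sender, and LSTLUWID from `DISPLAY CHSTATUS(QM1.TO.QM2) SAVED LSTLUWID` on the receiver. If they match, the receiver already committed the batch and the sender should commit. If they differ, the sender should back out. Below is a rough Python sketch of that decision, assuming both ends support those commands. The function name and the sample LUWID values are made up, and it only prints the command.

```python
def resolve_action(sender_curluwid, receiver_lstluwid):
    """Pick the RESOLVE CHANNEL action by comparing LUWIDs from both ends."""
    # Same LUWID: the receiver committed the batch the sender is unsure about.
    if sender_curluwid.strip().upper() == receiver_lstluwid.strip().upper():
        return "COMMIT"
    # Different LUWID: the receiver never committed it, so the sender keeps
    # the messages on the transmission queue and sends them again.
    return "BACKOUT"


channel = "QM1.TO.QM2"
# Hypothetical values copied by hand from the two DISPLAY CHSTATUS outputs.
action = resolve_action("0F4E2A1B00000065", "0F4E2A1B00000064")
print(f"RESOLVE CHANNEL({channel}) ACTION({action})")
```

Choosing the wrong action can duplicate or lose the in-doubt batch, which is why the comparison comes first.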
As my esteemed colleague suggested, it is time for you to call for help investigating the source (likely network problem).  _________________ MQ & Broker admin |
|
VJ |
Posted: Wed May 04, 2011 12:26 am Post subject: |
|
|
Newbie
Joined: 24 Nov 2010 Posts: 5
|
My network team says there is no problem in the network and that it is stable, but I am still facing the issue. I have raised a PMR. They have given 3 options: 1) A network problem (which is ruled out now). 2) Changing the name of the channel (because there may be more than one sender channel request for one receiver channel). 3) Deleting and recreating both the sender and receiver channels.
My problem is that I cannot implement options 2 and 3, because the requester channel is running on the mainframe, which is running MQ 1.2 (very, very old), and the mainframe people do not want to change it because they do not know the impact. I am still trying to push for it. I will let you know the outcome. |
|
bruce2359 |
Posted: Wed May 04, 2011 5:38 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
VJ wrote: |
My network team says there is no problem in the network and that it is stable, but I am still facing the issue. I have raised a PMR. They have given 3 options: 1) A network problem (which is ruled out now). |
If I had a dollar for every instance where the network support team told me that it wasn't a network problem - and it turned out that it was... _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
Vitor |
Posted: Wed May 04, 2011 5:49 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
bruce2359 wrote: |
If I had a dollar for every instance where the network support team told me that it wasn't a network problem - and it turned out that it was... |
We'd both be rich!
According to the network team, it's never a network problem. Until after some effort on your part (possibly including physical or psychological torture) they admit to:
Quote: |
a trivial change which couldn't possibly cause the problems you're seeing and doesn't affect that part of the network anyway. |
and yet, when this change is reversed, the problem goes away...  _________________ Honesty is the best policy.
Insanity is the best defence. |
|
PeterPotkay |
Posted: Wed May 04, 2011 8:32 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
MQ 1.2!? That is the oldest version I have ever seen mentioned here. _________________ Peter Potkay
Keep Calm and MQ On |
|
bruce2359 |
Posted: Wed May 04, 2011 9:16 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
PeterPotkay wrote: |
MQ 1.2!? That is the oldest version I have ever seen mentioned here. |
Ah... the memories. 1.2 was my first. It was gentle, it was kind. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|