Author |
Message
|
Tibor |
Posted: Mon Jan 15, 2007 6:31 pm Post subject: solved: Channel is working but sending AMQ9526 per minute |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
We replaced the hardware under an AIX box (AIX v5.2 ML8) for the broker qmgr and upgraded MQ from CSD11 to CSD13 because of an internal error. Since then the listener works almost correctly, but one channel periodically writes this to AMQERR01.LOG:
Code: |
-------------------------------------------------------------------------------
01/16/07 03:08:03
AMQ9526: Message sequence number error for channel 'FSQP.RINO'.
EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 25620057 has been sent when sequence
number 30341 was expected.
ACTION:
Determine the cause of the inconsistency. It could be that the synchronization
information has become damaged, or has been backed out to a previous version.
If the situation cannot be resolved, the sequence number can be manually reset
at the sending end of the channel using the RESET CHANNEL command.
----- amqrmtra.c : 3039 -------------------------------------------------------
01/16/07 03:08:03
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'FSQP.RINO' ended abnormally.
ACTION:
Look at previous error messages for channel program 'FSQP.RINO' in the error
files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
|
I have already tried all the known workarounds (resetting the channel, restarting the channel and listener processes) without any success.
Does anyone have a good idea?
Thanks in advance,
Tibor
Last edited by Tibor on Tue Jan 16, 2007 7:30 am; edited 1 time in total |
|
Back to top |
|
 |
Mr Butcher |
Posted: Mon Jan 15, 2007 10:51 pm Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
What exactly was your reset channel command? What is the error message after the reset channel?
A channel out of sequence is not an internal error in most cases. It happens, e.g., if a queue manager was newly created, a backup was restored, or a switch to a backup machine (with a separate queue manager installation, or... or...) was performed.
In that case, sender and receiver no longer have the same sequence number, so one must reset (in most cases the sender resets because this also resets the receiver, but you can also set the receiver to the proper number).
Now, what did you reset, and what new sequence number did you set? _________________ Regards, Butcher |
|
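The reset described above is normally issued as an MQSC command at the sending end. The sketch below only builds that command text; the helper name `reset_channel_mqsc` is hypothetical, and the channel name and sequence number are the ones mentioned in this thread.

```python
def reset_channel_mqsc(channel, seqnum=1):
    """Build the MQSC text for resetting a channel's sequence number."""
    # MQSC accepts SEQNUM values from 1 through 999999999.
    if not 1 <= seqnum <= 999_999_999:
        raise ValueError(f"SEQNUM out of range: {seqnum}")
    # Quoting the name preserves its case and any special characters.
    return f"RESET CHANNEL('{channel}') SEQNUM({seqnum})"


command = reset_channel_mqsc("FSQP.RINO")
print(command)
```

The resulting line can be typed into a runmqsc session on the queue manager that owns the sender channel.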
|
 |
Tibor |
Posted: Tue Jan 16, 2007 12:01 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
Mr Butcher wrote: |
What exactly was your reset channel command? What is the error message after the reset channel? |
I did it through both the GUI and runmqsc (the seqnum was 1), but the error message didn't change: the expected message sequence number stayed at the magic value 25620057. I have already browsed SYSTEM.CHANNEL.SYNCQ looking for a duplicated entry.
Moreover, the channel appears to be in the running state at first sight, but if I query the channel status frequently, it shows a retrying/indoubt state at the time of the error message.
And the most mysterious thing: both queue managers are on the same box, and there is another channel between them which works without any problem ... |
|
|
 |
Mr Butcher |
Posted: Tue Jan 16, 2007 1:09 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
Did you reset on the sender end or on the receiver end? Please try to reset on the sender. 25620057 is not the expected but the received number, so to me it looks like you reset the receiver.
Making the channels work again is one thing (and should be handled by the reset); finding out why the channels went out of sequence is a different story. You wrote that you updated to CSD13 because the channels were out of sync, so it must have been the result of some action or problem before the update (and only you may know what that was). _________________ Regards, Butcher |
|
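To make the sent-versus-expected distinction concrete, here is a minimal sketch that pulls both numbers out of the AMQ9526 explanation text quoted earlier in the thread. The function name `parse_amq9526` is hypothetical, and the pattern assumes the English wording shown in the log above.

```python
import re

# Pattern for the EXPLANATION text of AMQ9526 as it appears in AMQERR01.LOG.
# \s+ absorbs the line wrapping used in the log file.
AMQ9526_NUMBERS = re.compile(
    r"sequence\s+number\s+(\d+)\s+has\s+been\s+sent\s+when\s+sequence\s+"
    r"number\s+(\d+)\s+was\s+expected"
)


def parse_amq9526(explanation):
    """Return (sent, expected) from an AMQ9526 explanation, or None."""
    match = AMQ9526_NUMBERS.search(explanation)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_text = """The local and remote queue managers do not agree on the next message sequence
number. A message with sequence number 25620057 has been sent when sequence
number 30341 was expected."""

sent, expected = parse_amq9526(log_text)
print(f"sent: {sent}")
print(f"expected: {expected}")
```

Read this way, 25620057 is the number that arrived and 30341 is the number the logging side was waiting for.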
|
 |
Tibor |
Posted: Tue Jan 16, 2007 3:07 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
Mr Butcher wrote: |
You wrote that you updated to CSD13 because the channels were out of sync, so it must have been the result of some action or problem before the update (and only you may know what that was). |
I updated to CSD13 because of an internal error (IY85622). And yes, I reset both ends of the channel |
|
|
 |
Tibor |
Posted: Tue Jan 16, 2007 3:40 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
Breaking news: the original channel pair (FSQP.RINO) was deleted and a new sender-receiver pair (FSQP.RINO.2) was created, which is why I now get different messages (every minute):
Code: |
----- amqrmrsa.c : 467 --------------------------------------------------------
01/16/07 12:01:32
AMQ9519: Channel 'FSQP.RINO' not found.
EXPLANATION:
The requested operation failed because the program could not find a definition
of channel 'FSQP.RINO'.
ACTION:
Check that the name is specified correctly and the channel definition is
available.
----- amqrcdfa.c : 1085 -------------------------------------------------------
01/16/07 12:01:32
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'FSQP.RINO' ended abnormally.
ACTION:
Look at previous error messages for channel program 'FSQP.RINO' in the error
files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
|
And it was over after 10 repeats... dummy MQ |
|
|
 |
Mr Butcher |
Posted: Tue Jan 16, 2007 5:18 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
Maybe you get that message now every 20 minutes? Maybe another queue manager is trying to access that receiver channel?
I don't think this is something you can blame MQ for... Someone (a sender or server) is still trying to access the deleted receiver channel.
That would also explain the sequence error. You said you reset both ends, but it looks like there is a third end? _________________ Regards, Butcher |
|
|
 |
Tibor |
Posted: Tue Jan 16, 2007 7:29 am Post subject: solved: Channel is working but sending AMQ9526... |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
Mr Butcher wrote: |
That would also explain the sequence error. You said you reset both ends, but it looks like there is a third end? |
Thanks all, now it's working fine. It was a "man-in-the-middle attack": the sysadmin accidentally restarted the queue manager on the old AIX box... and I have to take back my blaming of MQ.
Anyway, you were the light in the dark, Mr Butcher.
One more thing: if anyone is interested in the internal error IY85622, in our case it was caused by missing SSL keyring files!
Tibor |
|
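The resolution can be illustrated with a toy model of why resetting both known ends did not help. This is a deliberately simplified sketch, not MQ's actual channel protocol: the `Receiver` class and the "new box"/"old box" labels are hypothetical, and only the two sequence numbers come from the log in the first post.

```python
class Receiver:
    """Toy receiver that only tracks the next expected sequence number."""

    def __init__(self, expected):
        self.expected = expected

    def accept(self, sender_name, seqno):
        if seqno != self.expected:
            return (f"{sender_name}: sequence error, {seqno} sent "
                    f"when {self.expected} was expected")
        self.expected += 1
        return f"{sender_name}: message {seqno} accepted"


receiver = Receiver(expected=30341)

# The intended sender agrees with the receiver after the reset.
current_sender_next = 30341
# The forgotten queue manager still holds its own, much older counter.
stale_sender_next = 25620057

events = []
for _ in range(2):
    events.append(receiver.accept("new box", current_sender_next))
    current_sender_next += 1
    # The stale sender retries; resetting the other two ends never touches it.
    events.append(receiver.accept("old box", stale_sender_next))

for line in events:
    print(line)
```

In this model the intended pair keeps working while the stale sender fails on every retry, which matches the symptom of a channel that looks healthy yet logs a sequence error at regular intervals.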
|
 |
|