Author |
Message
|
tkaravind |
Posted: Tue Mar 30, 2004 3:19 am Post subject: Sender channel fails to run after system restart ... |
|
|
Acolyte
Joined: 24 Jul 2001 Posts: 62
|
Dear All,
We have a Mainframe-AIX connection through MQSeries.
The sender side is on the mainframe and the AIX box is on the receiving end.
It so happened that when we restarted the AIX server, the sender was unable to connect to the receiving end for a long time. The mainframe team even reported that the channel was in a RETRYING state.
This was still the case 40-50 minutes after the AIX queue manager was up and running.
When we intervened and started the channel manually, however, it started running!
No symptom was reported in the MQ error log either (e.g. message sequence number mismatch, max_resource exceeded, etc.). The log is full of nothing but "channel started" / "channel ended normally" messages.
So we are still in the dark about the root cause and how to avoid it in future.
However, I did see an FDC generated at around that time. I have copy-pasted the first few lines of it:
+-----------------------------------------------------------------------------+
| |
| MQSeries First Failure Symptom Report |
| ===================================== |
| |
| Date/Time :- Saturday March 27 23:33:07 GMT 2004 |
| Host Name :- inbkp01 (AIX 4.3) |
| PIDS :- 5765B73 |
| LVLS :- 520 |
| Product Long Name :- MQSeries for AIX |
| Vendor :- IBM |
| Probe Id :- XC338001 |
| Application Name :- MQM |
| Component :- xehAsySignalHandler |
| Build Date :- May 30 2001 |
| CMVC level :- p520-CSD01G |
| Build Type :- IKAP - (Production) |
| UserID :- 00000203 (mqsi) |
| Program Name :- runmqchi |
| Process :- 00006766 |
| Thread :- 00000002 |
| Major Errorcode :- xecE_W_UNEXPECTED_ASYNC_SIGNAL |
| Minor Errorcode :- OK |
| Probe Type :- MSGAMQ6209 |
| Probe Severity :- 3 |
| Probe Description :- AMQ6209: An unexpected asynchronous signal (1) has |
| been received and ignored. |
| Arith1 :- 1 1 |
| Arith2 :- 6766 1a6e |
| |
+-----------------------------------------------------------------------------+
MQM Function Stack
xehAsySignalMonitor
xehHandleAsySignal
xcsFFST
MQM Trace History
{ xppInitialiseDestructorRegistrations
} xppInitialiseDestructorRegistrations rc=OK
{ xehAsySignalMonitor
-{ xppPostAsySigMonThread
-} xppPostAsySigMonThread rc=OK
-{ xehHandleAsySignal
--{ xcsRequestThreadMutexSem
--} xcsRequestThreadMutexSem rc=OK
--{ xcsFFST
==========>>>>>
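For reading a header like the one above, a minimal sketch in Python follows. It assumes every field sits on one row of the form `| Name :- value |`, and the `parse_fdc_header` helper and the shortened `FDC_HEADER` sample are illustrative names, not part of any MQ tooling. It only cross-checks values already present in the report: Arith2 appears to be the process id in decimal and hex, and Arith1 appears to be the signal number quoted in the probe description.

```python
import re
import signal

# A few rows copied from the report above (not the full header).
FDC_HEADER = """\
| Probe Id :- XC338001 |
| Program Name :- runmqchi |
| Process :- 00006766 |
| Major Errorcode :- xecE_W_UNEXPECTED_ASYNC_SIGNAL |
| Arith1 :- 1 1 |
| Arith2 :- 6766 1a6e |
"""

def parse_fdc_header(text):
    """Map each '| Name :- value |' row to a dict entry; other rows are skipped."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\|\s*(.+?)\s*:-\s*(.*?)\s*\|?\s*$", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields

fields = parse_fdc_header(FDC_HEADER)
dec, hx = fields["Arith2"].split()          # decimal and hex forms of one value
sig_no = int(fields["Arith1"].split()[0])   # assumed to be the signal number

print(fields["Probe Id"], fields["Program Name"])
# Arith2 matches the Process field: 6766 decimal is 0x1a6e.
print(int(dec) == int(hx, 16) == int(fields["Process"]))
# On a POSIX host, signal 1 is the hangup signal.
print(signal.Signals(sig_no).name)
```

On a POSIX system the last line names signal 1 as SIGHUP, which would fit a report written by `runmqchi` while the machine was going down, though the report itself does not say who sent the signal.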
Has anyone faced this situation before? Any help is greatly appreciated.
Thanks & Regards,
Aravind |
gunter |
Posted: Tue Mar 30, 2004 3:59 am Post subject: |
|
|
Partisan
Joined: 21 Jan 2004 Posts: 307 Location: Germany, Frankfurt
|
There should be an entry in the error log of the sending queue manager. If a channel is retrying, there was a "channel stopped by error" beforehand. Usually there is nothing to find on the receiving queue manager, because the sender could not connect to it.
_________________
Gunter Jeschawitz
IBM Certified System Administrator - Websphere MQ, 5.3 |
tkaravind |
Posted: Wed Mar 31, 2004 4:37 am Post subject: |
|
|
Acolyte
Joined: 24 Jul 2001 Posts: 62
|
Hi,
There is nothing reported on the sender side. There are multiple "Channel Started" messages, which shows that the channel was retrying for a long period of time. No other error is flagged there.
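How often a sender retries depends on its retry attributes. As a rough sketch only, assuming the channel still carries what I understand to be the default values (SHORTRTY 10, SHORTTMR 60 seconds, LONGRTY 999999999, LONGTMR 1200 seconds), the attempt times can be worked out as below. The function names are made up for illustration, and the real values on this channel were not posted.

```python
def retry_offsets(short_rty=10, short_tmr=60, long_rty=999_999_999, long_tmr=1200):
    """Yield seconds elapsed since the channel went into retry, one value per attempt."""
    elapsed = 0
    for _ in range(short_rty):      # short retries: frequent, limited count
        elapsed += short_tmr
        yield elapsed
    for _ in range(long_rty):       # long retries: sparse, practically unlimited
        elapsed += long_tmr
        yield elapsed

def attempts_within(seconds, **attrs):
    """Return the attempt offsets (in seconds) that fall inside the given window."""
    result = []
    for t in retry_offsets(**attrs):
        if t > seconds:
            break
        result.append(t)
    return result

# Attempt times, in minutes, during the first hour of retrying.
print([t // 60 for t in attempts_within(3600)])
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30, 50]
```

Under those assumed values the attempts are 20 minutes apart once the first ten are used up, so a gap between the receiver coming back and the next attempt is expected. It does not by itself account for attempts that connect and then end normally.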
Over the same period, on the receiver side (AIX), I have corresponding pairs of "Channel started" immediately followed by "Channel ended normally".
At the point where we manually stopped and started the channel, I get "Channel Active" on the mainframe and "Channel Started" on the receiver. After that everything was normal.
Does this mean that the receiver was somehow ending when it was supposed to stay in the started state?
Thanks & Regards,
Aravind |
gunter |
Posted: Sat Apr 03, 2004 12:00 am Post subject: |
|
|
Partisan
Joined: 21 Jan 2004 Posts: 307 Location: Germany, Frankfurt
|
I remember we had a similar problem with a channel between the mainframe and Windows or Solaris, with different channel states at the two ends.
We could start the channel only after a STOP FORCE on the mainframe. I believe the solution was to set AdoptNewMCA.
But there was always a message in the error log on the mainframe.
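To make the idea behind AdoptNewMCA concrete, here is a toy model in Python. It is not MQ code: the `Receiver` class, the channel name `MF.TO.AIX` and the address are all hypothetical, and the real matching rules are set in the queue manager configuration. It only shows the situation the option is meant for, where the receiving end still believes an old instance of the channel is running when the sender tries again.

```python
class Receiver:
    """Toy model of a receiving queue manager's table of running channel instances."""

    def __init__(self, adopt_new_mca=False):
        self.adopt_new_mca = adopt_new_mca
        self.running = {}  # channel name -> partner address of the instance believed running

    def inbound_start(self, channel, remote_addr):
        """Handle a start request from a sender and report what happens to it."""
        if channel not in self.running:
            self.running[channel] = remote_addr
            return "started"
        if self.adopt_new_mca and self.running[channel] == remote_addr:
            # Same channel from the same partner: treat the old instance as
            # orphaned, end it, and let the new request take its place.
            self.running[channel] = remote_addr
            return "adopted"
        # The old instance is still considered active, so the new one is refused.
        return "rejected"

for adopt in (False, True):
    rcv = Receiver(adopt_new_mca=adopt)
    rcv.inbound_start("MF.TO.AIX", "10.0.0.5")   # original connection
    # The connection breaks without the receiver noticing; the sender retries.
    print(adopt, rcv.inbound_start("MF.TO.AIX", "10.0.0.5"))
# -> False rejected
# -> True adopted
```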
If possible, check the system channel event queue; each command and each state change causes an entry with all the information needed.
_________________
Gunter Jeschawitz
IBM Certified System Administrator - Websphere MQ, 5.3 |
xxx |
Posted: Sun Apr 04, 2004 9:50 pm Post subject: |
|
|
Centurion
Joined: 13 Oct 2003 Posts: 137
|
I believe TCP keepalive is not set on the receiver side; as a result, it did not accept a new channel connection.
Once you restarted the channel manually, it worked.
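For anyone unsure what "keepalive set" means underneath, the sketch below shows the plain socket option in Python. It is an illustration only: MQ turns this on through its own configuration (as far as I know, the KeepAlive setting in the TCP section of qm.ini on UNIX), not through application code, and `make_keepalive_socket` is just a name chosen for the example.

```python
import socket

def make_keepalive_socket():
    """Create a TCP socket with SO_KEEPALIVE switched on.

    With the option on, the operating system probes an idle connection and
    eventually reports a dead partner, so the local end can be closed.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock

sock = make_keepalive_socket()
# Read the option back; a non-zero value means keepalive is enabled.
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print("keepalive enabled:", enabled)
sock.close()
```

How long the system waits before probing is an operating system setting and is often measured in hours by default, so the option alone may not clear a dead connection quickly.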
tkaravind |
Posted: Mon Apr 05, 2004 5:42 am Post subject: |
|
|
Acolyte
Joined: 24 Jul 2001 Posts: 62
|
Hi,
Thanks for the information.
But when I restart the AIX server (receiver), wouldn't the receiver channel process end anyway (i.e. get killed abnormally)? In that case, even without these options (AdoptNewMCA/KeepAlive) set, the sender should be able to reconnect, because it is still retrying!
I am assuming that these options are needed only when the sender terminates abnormally (say, due to a network failure) and the receiver is not even aware of it.
In our case, however, this situation does not arise: the receiver ends anyway because the AIX server itself is brought down, so it is ready to accept new incoming connections from the sender queue manager.
Please correct me if I am wrong! This problem still baffles me!
Thanks & Regards,
Aravind |
kbaesung |
Posted: Thu Apr 08, 2004 5:58 pm Post subject: |
|
|
Newbie
Joined: 27 Jan 2004 Posts: 3
|
hi~
Probe Id XC338001 is an informational FDC.
Cause: it occurs when an MQ application program ends abnormally.
_________________
i'm mq engeener