j1 |
Posted: Fri Mar 31, 2006 8:03 am Post subject: dspmq not giving any output. Qmgr comes up only intermitentl |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
Hi, we have a strange problem with a Solaris box running MQ 5.3 CSD 11. We recently added some cluster sender and receiver channels, and now the qmgr seems to be going down. dspmq shows NO output. If all mqm processes are killed and the qmgr is started up, it stays up for about 15 minutes before going down again. Sounds like some kind of load issue. The error logs are not much help either. Any ideas? This is urgent.
Quote: |
03/31/06 10:28:19
AMQ9508: Program cannot connect to the queue manager.
EXPLANATION:
The connection attempt to queue manager 'xxxxxxxx' failed with reason code
2059.
ACTION:
Ensure that the queue manager is available and operational.
----- amqrmsaa.c : 427 --------------------------------------------------------
03/31/06 10:28:19
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program xxxxxxxxxxxx' ended abnormally.
ACTION:
Look at previous error messages for channel program xxxxxxxxxx' in the
error files to determine the cause of the failure.
----- amqrmrsa.c : 467 --------------------------------------------------------
03/31/06 10:29:12
AMQ9508: Program cannot connect to the queue manager.
EXPLANATION:
The connection attempt to queue manager 'xxxxxxxxxxx' failed with reason code
2059.
ACTION:
Ensure that the queue manager is available and operational.
----- amqrmsaa.c : 427 --------------------------------------------------------
|
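The log excerpt above repeats the same few message IDs. As a rough aid when an error log looks unhelpful, the following sketch tallies how often each AMQ message ID appears in a log file. The function name `amq_code_counts` is hypothetical, the log path is supplied by the caller, and the sketch assumes each message starts on a line beginning with the ID followed by a colon, as in the excerpt above.

```shell
# Tally AMQ message IDs in a WebSphere MQ error log (a sketch, not an IBM tool).
# amq_code_counts is a hypothetical helper; $1 is the log file to read.
amq_code_counts() {
    awk '/^AMQ[0-9]+:/ {
             code = $1
             sub(/:.*/, "", code)   # strip the colon and anything after it
             count[code]++
         }
         END { for (c in count) print count[c], c }' "$1" |
        sort -rn                    # most frequent ID first
}
```

For example, `amq_code_counts /var/mqm/errors/AMQERR01.LOG` would summarize the system-wide error log, assuming the default Unix data path. A POSIX awk is needed; on Solaris that may mean nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk.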
mvic |
Posted: Fri Mar 31, 2006 8:10 am Post subject: Re: dspmq not giving any output. Qmgr comes up only intermit |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
sk1 wrote: |
The error logs are not much help too... Any ideas. this is urgent |
If urgent, call Support. This forum is user-to-user, we can't do 24x7 service.
So, on with the investigation. What is the output from the following?
Code: |
$ cd /var/mqm/errors
$ /opt/mqm/bin/ffstsummary |
j1 |
Posted: Fri Mar 31, 2006 8:18 am Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
ffst summary : No output
Error Log : Some kind of issue with the listener you think ?
Quote: |
03/31/06 11:07:05
AMQ9202: Remote host xxxxxxxxxxxxxx(111111) (1111)' not
available, retry later.
EXPLANATION:
The attempt to allocate a conversation using TCP/IP to host
xxxxxxxxxxxm (111111111111111) (1111)' was not successful. However
the error may be a transitory one and it may be possible to successfully
allocate a TCP/IP conversation later.
ACTION:
Try the connection again later. If the failure persists, record the error
values and contact your systems administrator. The return code from TCP/IP is
146 (X'92'). The reason for the failure may be that this host cannot reach the
destination host. It may also be possible that the listening program at host
xxxxxxxxxxxxxx(111111111111) (1111)' was not running. If this is
the case, perform the relevant operations to start the TCP/IP listening
program, and try again.
|
PeterPotkay |
Posted: Fri Mar 31, 2006 8:25 am Post subject: |
|
|
 Poobah
Joined: 15 May 2001 Posts: 7722
|
What are the contents of /var/mqm/errors? _________________ Peter Potkay
Keep Calm and MQ On |
mvic |
Posted: Fri Mar 31, 2006 9:10 am Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
sk1 wrote: |
ffst summary : No output |
This probably means /var/mqm/errors has no .FDC files. Peter is right to ask the explicit question though - what files are in there?
Quote: |
Error Log : Some kind of issue with the listener you think ? |
Possibly. It's a network / comms issue for sure. But as for the queue manager itself, I don't see any cause for concern, and would query whether you need to be stopping and starting the queue manager. These network-related error log entries do not appear to show the queue manager is failing.
I looked up in an old document of mine (can't get access to a Solaris machine at the moment) and find that Errno 146 is ECONNREFUSED on Solaris. Very likely this indicates a firewall setup or other network problem.
If you have a network admin group, report this to them - I think it will help if you supply the error log entries to them, and point out that 146 is ECONNREFUSED. |
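The two numbers in the AMQ9202 entry, 146 and X'92', are the same return code in decimal and hexadecimal. A small sketch of that conversion, plus a lookup of the symbolic name in a C header, follows. The helper names `tcp_rc_hex` and `errno_name` are hypothetical, and the default header path /usr/include/sys/errno.h is an assumption about the Solaris box. Errno numbering differs between platforms, so the lookup is only meaningful on the machine that logged the error.

```shell
# Convert a decimal TCP/IP return code to the X'..' form used in MQ messages.
tcp_rc_hex() {
    printf "X'%X'\n" "$1"
}

# Look up the symbolic name for an errno value in a C header.
# $1 = errno number, $2 = header file (assumed default: Solaris sys/errno.h).
errno_name() {
    hdr="${2:-/usr/include/sys/errno.h}"
    awk -v n="$1" '$1 == "#define" && $3 == n { print $2 }' "$hdr"
}
```

Running `tcp_rc_hex 146` prints X'92', which matches the value in the log entry.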
j1 |
Posted: Fri Mar 31, 2006 10:00 am Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
There are tons of FDC files and I will go through some of them soon. I just did a file system restore and that seems to have solved the problem. It appears this box was recently upgraded from CSD 5 to CSD 11. Are all the CSDs cumulative? Could this be the issue? |
wschutz |
Posted: Fri Mar 31, 2006 10:47 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
No, you should be okay going straight to CSD 11 from CSD 5.
If there are a "ton" of FDC files in /var/mqm/errors, then ffstsummary should have produced output. Are you sure you were in that directory when you ran it?
Did your restore regress you to CSD 5? _________________ -wayne |
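One quick cross-check on whether ffstsummary had anything to summarize is to count the FDC files directly. The sketch below does that for a given directory. `count_fdcs` is a hypothetical helper name, and the default of /var/mqm/errors is simply the directory used in this thread.

```shell
# Count *.FDC files in an MQ errors directory (hypothetical helper).
# $1 = directory to inspect; defaults to /var/mqm/errors as used in this thread.
count_fdcs() {
    dir="${1:-/var/mqm/errors}"
    n=0
    for f in "$dir"/*.FDC; do
        # If nothing matches, the pattern stays literal and fails the -f test.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

A non-zero count combined with empty ffstsummary output would suggest the tool itself is not working, rather than that there is nothing to report.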
j1 |
Posted: Fri Mar 31, 2006 10:53 am Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
I was in the right directory, but my guess is that, just as dspmq gave no output even though a qmgr existed, ffstsummary gave no output either. The timestamps on all the FDC files are from today and yesterday.
The backup did regress to CSD 5. |
wschutz |
Posted: Fri Mar 31, 2006 10:58 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
So after the file restore, did dspmq start working again?
And which directory was restored: only /opt/mqm, or also /var/mqm? _________________ -wayne |
j1 |
Posted: Fri Mar 31, 2006 11:11 am Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
For example, here is an excerpt from one of the FDCs. Like someone said, I guess I should be calling Support right about now. The friendly folks at mqseries.net are usually better!
Quote: |
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Friday March 31 07:28:25 EST 2006 |
| Host Name :- xxxxxxxxx (SunOS xx) |
| PIDS :- 5724B4103 |
| LVLS :- 530.11 CSD11 |
| Product Long Name :- WebSphere MQ for Sun Solaris |
| Vendor :- IBM |
| Probe Id :- XC130003 |
| Application Name :- MQM |
| Component :- xehExceptionHandler |
| Build Date :- Aug 2 2005 |
| CMVC level :- p530-11-L050802 |
| Build Type :- IKAP - (xxxxxxxx) |
| UserID :- 00001002 (mqm) |
| Program Name :- amqzlaa0_nd |
| Process :- 00000004 |
| Thread :- 00000001 |
| QueueManager :- xxxxxxxxxxxxx|
| Major Errorcode :- STOP |
| Minor Errorcode :- OK |
| Probe Type :- HALT6109 |
| Probe Severity :- 1 |
| Probe Description :- AMQ6109: An internal WebSphere MQ error has occurred. |
| FDCSequenceNumber :- 0 |
| Arith1 :- 10 a |
| Comment1 :- SIGBUS |
| |
|
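When ffstsummary produces nothing, individual header fields can still be pulled out of an FDC file by hand. The following is a minimal sketch, assuming the `Field :- value` header layout shown in the excerpt above. `fdc_field` is a hypothetical helper, not part of WebSphere MQ, and it needs a POSIX awk (on Solaris that may mean nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
# Print the value of one header field from an FDC file (hypothetical helper).
# $1 = field name exactly as it appears in the header, $2 = FDC file.
fdc_field() {
    awk -v key="$1" '
        {
            i = index($0, ":-")
            if (i == 0) next                      # not a "name :- value" line
            name  = substr($0, 1, i - 1)
            value = substr($0, i + 2)
            gsub(/^[| \t]+|[ \t]+$/, "", name)    # trim box border and blanks
            gsub(/^[ \t]+|[| \t]+$/, "", value)
            if (name == key) { print value; exit }  # first match only
        }' "$2"
}
```

Looping this over every FDC file for fields such as `Probe Id`, `Program Name` and `Comment1`, then piping the result through `sort | uniq -c`, gives a rough picture of which failures dominate.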
j1 |
Posted: Fri Mar 31, 2006 11:13 am Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
Both /opt/mqm and /var/mqm were restored. |
wschutz |
Posted: Fri Mar 31, 2006 11:20 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
|
mvic |
Posted: Fri Mar 31, 2006 12:19 pm Post subject: |
|
|
 Jedi
Joined: 09 Mar 2004 Posts: 2080
|
A thought: Could the file backup/restore problems be related to the observed MQ problems?
Now that the files in /var/mqm have been restored, what is the output from
Code: |
$ cd /var/mqm/errors
$ /opt/mqm/bin/ffstsummary |
j1 |
Posted: Fri Mar 31, 2006 1:44 pm Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
/var/mqm/errors contains only old FDCs.
/opt/mqm/bin/ffstsummary hangs for a while, until I hit Ctrl+C.
I have raised a PMR with IBM and will post the root cause once we have it. |
j1 |
Posted: Fri Mar 31, 2006 1:45 pm Post subject: |
|
|
 Centurion
Joined: 23 Jun 2003 Posts: 139
|
Actually, the last time I ran /opt/mqm/bin/ffstsummary, it just returned 0 lines and exited. |