Trace/Logging of issued commands
smeunier
Posted: Tue Apr 17, 2012 11:29 am    Post subject: Trace/Logging of issued commands

Partisan

Joined: 19 Aug 2002
Posts: 305
Location: Green Mountains of Vermont

I have an environment with 4 servers (2 z/OS and 2 UNIX) in a cluster. The 2 UNIX servers contain the full repositories. A SUSPEND QMGR was issued from one of the z/OS servers, which it promptly performed. A few days later the RESUME QMGR command was issued. Other than a command-accepted message in the SYSLOG, there was no indication of whether it worked. A week later, the second z/OS server was suspended... all hell broke loose, as effectively both z/OS servers were now suspended.

Obviously there was some due diligence that should have been performed by the practitioner: when the previous qmgr was RESUMED, they should have confirmed its state. My task now is to find out why the RESUME command did not complete or failed.

I have looked at the UNIX MQ logs but do not see any errors. Whether it failed or succeeded (apparently not), would this have been logged somewhere? In which logs: the z/OS MQ logs or some other location?

Any idea where I could find the success/failure of the RESUME command?
bruce2359
Posted: Tue Apr 17, 2012 12:21 pm

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

Why did you/they do the SUSPEND?

What changes to qmgr(s) took place on the suspended qmgr?

Did the RESUME take place on the exact same qmgr where you did the SUSPEND?

What version/release of mq?
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
smeunier
Posted: Tue Apr 17, 2012 12:41 pm

Partisan

Joined: 19 Aug 2002
Posts: 305
Location: Green Mountains of Vermont

The suspend was issued because that server was coming down for maintenance. It is part of a dual z/OS logistics system instance. Suspending is a procedural step to ensure that no transaction is caught in flight to the server coming offline while the secondary assumes the full workload. The resume was issued from the same server as the suspend. No work was done on the QMGR itself, only on the OS: security fixes and general support upgrades to z/OS. An IPL followed.

MQ V7.0.1 is installed
PeterPotkay
Posted: Tue Apr 17, 2012 3:38 pm

Poobah

Joined: 15 May 2001
Posts: 7717

Whenever I suspended or resumed QMs in clusters, I would use the DISPLAY CLUSQMGR command on the suspended QM to see what the SUSPEND attribute in the command output said. I am not aware of any logs where this info is captured, so our scripts to suspend or resume the QM always included the DISPLAY CLUSQMGR command to validate what happened.
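
For anyone wanting to script that validation step, here is a minimal sketch in Python. It only parses text that has already been captured from a DISPLAY CLUSQMGR(*) SUSPEND command; how the output is captured is left out. The sample text, the queue manager names and the cluster name are made up for illustration and only approximate distributed runmqsc output. z/OS console output is laid out differently, so treat the pattern as an assumption to adapt.

```python
import re

# Illustrative text only: approximates runmqsc output on distributed platforms.
# Names (ZOS1, ZOS2, LOGISTICS) are hypothetical.
SAMPLE_OUTPUT = """\
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(ZOS1)                          CHANNEL(TO.ZOS1)
   CLUSTER(LOGISTICS)                      SUSPEND(YES)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(ZOS2)                          CHANNEL(TO.ZOS2)
   CLUSTER(LOGISTICS)                      SUSPEND(NO)
"""


def parse_suspend_states(display_output):
    """Map each CLUSQMGR name in the text to True if any record shows SUSPEND(YES)."""
    states = {}
    current = None
    # Walk the attributes in the order they appear; each SUSPEND value
    # belongs to the most recent CLUSQMGR name seen before it.
    for name, value in re.findall(r"\b(CLUSQMGR|SUSPEND)\(([^)]*)\)", display_output):
        if name == "CLUSQMGR":
            current = value.strip()
        elif current is not None:
            suspended = value.strip().upper() == "YES"
            states[current] = states.get(current, False) or suspended
    return states


def still_suspended(display_output):
    """Return the sorted names of queue managers that still show as suspended."""
    return sorted(q for q, s in parse_suspend_states(display_output).items() if s)


if __name__ == "__main__":
    print(still_suspended(SAMPLE_OUTPUT))
```

A suspend or resume script could fail loudly when still_suspended() returns a non-empty list after a RESUME, instead of relying on the command-accepted message.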

I have opened a PMR in the past where they confirmed it is safe to issue the RESUME command for a QM not suspended, and safe to issue the SUSPEND command to a QM already suspended.

Your comment that all hell broke loose once the 2nd QM was suspended is perplexing to me. A suspended QM is not going to refuse traffic unless you use the FORCE option on the SUSPEND command. But I must admit I never tried suspending all the QMs in a cluster that hosted a particular queue. If the only destinations for a message in a cluster are on suspended QMs, I don't know what happens. Define 'all hell'. Were messages going to DLQs, and if so what were the reason codes in the dead letter header? Were apps getting failed MQPUTs, and if so what was the reason code?


Perhaps this offers a clue where to look:
http://publib.boulder.ibm.com/infocenter/wmqv7/v7r0/topic/com.ibm.mq.csqzaj.doc/sc12930_.htm
Quote:

On z/OS, if you define CLUSTER or CLUSNL:
The command fails if the channel initiator has not been started.
Any errors are reported to the console on the system where the channel initiator is running; they are not reported to the system that issued the command.
On z/OS, you cannot issue RESUME QMGR CLUSTER(clustername) or RESUME QMGR FACILITY commands from CSQINP2.


It's a good idea to have test queues defined on all QMs in a cluster and to have scripts that pump dummy messages to these queues to validate that all QMs in the cluster get their expected messages after maintenance.
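
The checking half of such a probe script could look roughly like the Python sketch below. It assumes the dummy messages have already been sent and that something else has counted how many arrived on the test queue at each QM. The function name, the QM names and the counts are all hypothetical.

```python
def missing_probe_targets(expected_qmgrs, probe_arrivals):
    """Return the queue managers that received no probe message.

    expected_qmgrs: names of every queue manager hosting the test queue.
    probe_arrivals: mapping of queue manager name -> probe messages counted there.
    """
    return sorted(q for q in expected_qmgrs if probe_arrivals.get(q, 0) == 0)


if __name__ == "__main__":
    expected = ["ZOS1", "ZOS2"]   # hypothetical queue manager names
    arrivals = {"ZOS2": 20}       # hypothetical counts collected after sending 20 probes
    print(missing_probe_targets(expected, arrivals))
```

Any name in the result is a QM that is not being chosen as a destination, which is worth investigating before the next QM is taken out of the cluster.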
_________________
Peter Potkay
Keep Calm and MQ On
fjb_saper
Posted: Tue Apr 17, 2012 8:07 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

If I want to verify that a queue manager is correctly in the resumed state, I usually do 2 checks:

Code:
dis clusqmgr(<qmgrname>) suspend


Once on the qmgr that was suspended,
Once on each of the FRs.

If it does not show up correctly on the FRs, troubleshoot the communications.

If it does not show up correctly on one of the FRs, troubleshoot communications between the FRs...
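
The decision logic in those checks can be sketched in Python as below. This is only an illustration: the SUSPEND values are assumed to have been read by hand or by another script from the display command on each queue manager, and the function name and message wording are made up.

```python
def diagnose_suspend_views(qmgr, own_view, fr_views):
    """Suggest next steps from the SUSPEND values seen in different places.

    own_view: "YES" or "NO" as displayed on the queue manager itself.
    fr_views: mapping of full repository name -> "YES" or "NO" displayed there.
    """
    findings = []
    if own_view == "YES":
        findings.append(f"{qmgr} still shows SUSPEND(YES) locally: RESUME did not take effect")
    # Full repositories whose view differs from the queue manager's own view.
    stale = sorted(fr for fr, view in fr_views.items() if view != own_view)
    if stale and len(stale) == len(fr_views):
        findings.append(f"no full repository agrees with {qmgr}: check channels between {qmgr} and the FRs")
    elif stale:
        findings.append("full repositories disagree (" + ", ".join(stale) + " out of step): check FR-to-FR channels")
    return findings or ["all views agree on SUSPEND(NO)"]


if __name__ == "__main__":
    # Hypothetical names: ZOS1 was resumed, but FR2 has not heard about it yet.
    for line in diagnose_suspend_views("ZOS1", "NO", {"FR1": "NO", "FR2": "YES"}):
        print(line)
```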

And as an added precaution, check through monitoring that the cluster queues are seeing traffic (a selected sample of cluster queues).

Never had surprises this way...

Have fun
_________________
MQ & Broker admin
smeunier
Posted: Wed Apr 18, 2012 6:01 am

Partisan

Joined: 19 Aug 2002
Posts: 305
Location: Green Mountains of Vermont

First, thanks for all the replies. I think the consensus is what it should have been: check, check, then check again if it is a critical path. DISPLAY CLUSQMGR(*) SUSPEND will be added to the checkout process. This we knew we would have to do. You can only be lucky so many times before you run out.

Let me explain the "all hell" statement.

The destination queues in the cluster are only defined on the z/OS servers, as this is the endpoint processing. With one QMGR already suspended from the cluster and presumably resumed, the second QMGR was suspended from the cluster, thus effectively removing all the endpoint QMGRs from the cluster. The "all hell" refers to the fact that these are real-time logistics transactions which, if not processed within 60 seconds, are stale. They have message expiry for self-cleanup, and the sending application has a thread timer that reports failure. Since these are logistics transactions at a very high volume, the factory had essentially stalled. Time is money.

This procedure is one that we have executed effectively for over 5 years, and it gives us a high degree of flexibility. This time it showed a flaw in our procedure. It's never too late to LEARN!!!!!

Thanks again.
PeterPotkay
Posted: Wed Apr 18, 2012 8:17 am

Poobah

Joined: 15 May 2001
Posts: 7717

Were messages going to DLQs - what were the reason codes in the dead letter header? Were apps getting failed MQPUTs - what was the reason code?
_________________
Peter Potkay
Keep Calm and MQ On
fjb_saper
Posted: Wed Apr 18, 2012 3:56 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

All Hell broke loose:

The manual says (somewhere) that if all destinations are on suspended qmgrs, it is as if none of the destinations were on a suspended qmgr.

So my guess is that he had a lot of transactions that expired on the queue without being processed, as he probably stopped the application before checking (monitoring) that no more transactions were flowing through the qmgr...
_________________
MQ & Broker admin
mqjeff
Posted: Wed Apr 18, 2012 5:28 pm

Grand Master

Joined: 25 Jun 2008
Posts: 17447

fjb_saper wrote:
All Hell broke loose:

The manual says (somewhere) that if all destinations are on suspended qmgrs, it is as if none of the destinations were on a suspended qmgr.



Suspended just means "make the best effort to ignore this queue manager, unless it's the only queue manager we can talk to!".
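
As a toy illustration of that rule, here is a Python sketch of the choice as described in this thread. It is a deliberately simplified model, not the real cluster workload algorithm, which weighs other factors as well; the function name and QM names are made up.

```python
def eligible_destinations(instances):
    """Pick candidate destinations under the 'best effort to ignore' rule.

    instances: mapping of queue manager name -> True if that queue manager is suspended.
    Non-suspended queue managers are preferred; if every instance is suspended,
    all of them remain eligible rather than leaving nowhere to send.
    """
    active = sorted(q for q, suspended in instances.items() if not suspended)
    return active if active else sorted(instances)


if __name__ == "__main__":
    # Hypothetical names: one z/OS qmgr suspended, then both suspended.
    print(eligible_destinations({"ZOS1": True, "ZOS2": False}))
    print(eligible_destinations({"ZOS1": True, "ZOS2": True}))
```

Under this model, suspending both endpoint QMs does not by itself stop messages from being routed to them.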