George Carey
Posted: Mon Jan 29, 2007 9:57 am    Post subject: MQ HA configuration on Solaris
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

Can anyone confirm that the following configuration is valid for the environment described? If it is not considered valid, please detail the reasons why.
Environment: Solaris server A and Solaris server B share filesystems on a NetApp RAID array, NFS-mounted on both systems as /var/mqm and /opt/mqm. Standard WebSphere MQ install images are then installed on both machines (the second install overwrites the files of the first - so what).
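For concreteness, the mounts on both servers would amount to something like the following /etc/vfstab entries (filer name and export paths invented here purely for illustration):

# /etc/vfstab on both server A and server B (illustrative names)
netapp:/vol/mq/varmqm  -  /var/mqm  nfs  -  yes  rw,hard,intr
netapp:/vol/mq/optmqm  -  /opt/mqm  nfs  -  yes  rw,hard,intr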
Queue manager HATEST.QM and local queue AL.Q are successfully created on server A and messages are put to AL.Q. Server A is shut down, the same qmgr HATEST.QM is started on server B, and the messages are successfully read from AL.Q. This can be ping-ponged to one's heart's content.
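In command terms the ping-pong amounts to something like this (a sketch using the standard control commands and samples; nothing beyond a stock install is assumed):

# on server A: create and start the qmgr, define the queue, put test messages
crtmqm HATEST.QM
strmqm HATEST.QM
echo "DEFINE QLOCAL(AL.Q)" | runmqsc HATEST.QM
amqsput AL.Q HATEST.QM

# controlled shutdown on server A
endmqm -i HATEST.QM

# on server B: start the same qmgr from the shared /var/mqm and read back
strmqm HATEST.QM
amqsget AL.Q HATEST.QM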
Note: no special scripts (MC91) are run, no third-party software (Veritas, etc.), nothing special done at all, and I have a working HA configuration.
N.B. I add that the connections to the qmgr are local bindings using the sample amqsput and amqsget modules, but that is all the actual applications will need.
It seems intuitively obvious that this would work, but every IBM document gives convoluted explanations about cross-linking directories, etc., with 50-odd pages in MC91 on how to set up an HA configuration. With the caveat noted about local bindings, is there any valid reason for NOT using this HA configuration?!
Thanks for any (thoughtful, non-superficial) responses.
_________________
"Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from the book "Insight", subtitled "A Study of Human Understanding")
Back to top

George Carey
Posted: Mon Jan 29, 2007 2:26 pm    Post subject: Thanks for any (thoughtful, non-superficial) responses.
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

OK, I will now take superficial and non-thoughtful responses.
Back to top

jefflowrey
Posted: Mon Jan 29, 2007 2:41 pm
Grand Poobah | Joined: 16 Oct 2002 | Posts: 19981

You'll likely run into issues with file locks and shmem files.
_________________
I am *not* the model of the modern major general.
Back to top

George Carey
Posted: Mon Jan 29, 2007 3:02 pm
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

I'm not seeing any issues there, as I ping-pong between server A and server B with a controlled shutdown of the qmgr on one server before starting it on the other.
Also, for an uncontrolled shutdown (i.e. a system failure), one could, as part of the startup process of the qmgr on the backup server, just put in the standard 'ipcrm -m ...', etc. commands to clear all that.
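Something along these lines would do it (a rough sketch only; it assumes the mqm user owns all the MQ IPC resources and that no other qmgrs run on the box):

# Solaris ipcs prints the ID in column 2 and the OWNER in column 5
for id in `ipcs -m | awk '$5 == "mqm" { print $2 }'`; do ipcrm -m $id; done
for id in `ipcs -s | awk '$5 == "mqm" { print $2 }'`; do ipcrm -s $id; done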
Back to top

Vitor
Posted: Tue Jan 30, 2007 12:25 am
Grand High Poobah | Joined: 11 Nov 2005 | Posts: 26093 | Location: Texas, USA

For what it's worth....
Before we went to Veritas we used a similar setup (machine A, machine B, shared disk), but had /opt/mqm installed locally on each machine with only /var/mqm on the shared disk. The "so what" comment in your original post caused our Unix admins to reply "we don't know, and we don't want to find out".
I've run your scenario past them (I am not now, nor have I ever been, a Unix admin type) and their first comment was "how do you clean up reliably after an uncontrolled shutdown?". Your idea of a script got a cool response, which I have learned to associate with "possible, but hard work, and it may need us to intervene manually".
Their 2 cents not mine. Level of superficiality and thought in response unknown at this time.
_________________
Honesty is the best policy.
Insanity is the best defence.
Back to top

George Carey
Posted: Tue Jan 30, 2007 7:27 am
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

Thanks for the feedback, Vitor, but if anything my configuration eliminates the hard work. I have done my share of Unix administration; it is a trivial task to clear any shared semaphores on reboot. In fact, here is the recommended procedure from IBM:
Recommended failover procedure
The following are the recommendations from PMR 67163.7TD.000 for a failover procedure that reduces the (small) chance of WMQ inadvertently attaching to pre-existing IPC resources. Consider these guidelines alongside those provided in the SupportPac.
1) On the machine on which QMGR is originally running: after killing QMGR, all IPC resources owned by QMGR should be removed. This happens automatically if the whole machine failed and is being rebooted. However, if the machine is not rebooted (e.g. in a test scenario, or where a queue manager has failed but the machine continues to run), then the resources should be removed manually. If no other qmgrs are running, you should ipcrm all IPC resources owned by mqm. If other qmgrs are running, you can use the amqiclen utility to remove just QMGR's IPC resources.
2) Immediately before strmqm on the machine that QMGR fails over to, ensure the IPC-corresponding files under /var/mqm/qmgrs/QMGR are removed. As amqiclen is currently implemented, it can't be used here, because it won't always remove those files if the actual IPC resource doesn't exist (and you wouldn't expect them to exist). Therefore, for now, do the following:
rm -f /[data_directory]/qmgrs/[qmgr_name]/*sem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/shmem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@app/*sem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@app/shmem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@app/spipe/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@ipcc/*sem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@ipcc/shmem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@ipcc/spipe/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@qmpersist/*sem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@qmpersist/shmem/*
rm -f /[data_directory]/qmgrs/[qmgr_name]/@qmpersist/spipe/*
This will remove all of QMGR's IPC-corresponding files. Do this before allowing amqcrsta processes to start up for the failed-over qmgr.
So besides the ipcrm -m ... commands, the above commands are recommended as well before starting the backup queue manager. I am not saying anything different. Maybe your Unix admins were not aware of these recommendations?
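Putting IBM's two steps together, the sort of pre-start wrapper I have in mind for the backup server looks like this. This is my sketch, not IBM's; note that a qmgr's directory name can differ from its name (MQ maps '.' to '!', so HATEST.QM lives under HATEST!QM), and it assumes no other qmgrs share the box:

#!/bin/sh
# pre-strmqm cleanup on the backup node (sketch)
QMGR=HATEST.QM
QMDIR=/var/mqm/qmgrs/HATEST!QM

# step 1: remove leftover IPC resources owned by mqm
for id in `ipcs -m | awk '$5 == "mqm" { print $2 }'`; do ipcrm -m $id; done
for id in `ipcs -s | awk '$5 == "mqm" { print $2 }'`; do ipcrm -s $id; done

# step 2: remove the IPC-corresponding files listed in the PMR guidance
# (rm -f silently tolerates directories that do not exist, e.g. a top-level spipe)
for d in $QMDIR $QMDIR/@app $QMDIR/@ipcc $QMDIR/@qmpersist; do
    rm -f $d/*sem/* $d/shmem/* $d/spipe/*
done

strmqm $QMGR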
Back to top

Vitor
Posted: Tue Jan 30, 2007 7:45 am
Grand High Poobah | Joined: 11 Nov 2005 | Posts: 26093 | Location: Texas, USA

Personally I'll take your word for everything.
I suspect the issue would have been the running of a script to clear things out. The admins here traditionally take a dim view of things that have to be done sometimes and sometimes not.
But I don't know. Since we got Veritas working I've ceased to care (though in all honesty I didn't care much to start with). I just design the stuff, not keep it going.
Back to top

ashoon
Posted: Tue Jan 30, 2007 8:00 am    Post subject: I've run into one problem
Master | Joined: 26 Oct 2004 | Posts: 235

I've run into one problem doing this...
It happened when I accidentally started the QMgr on the second machine while the first was running, and that fudged it completely... but like you said - that won't happen in your environment.
Another problem I ran into was with NFS - MQ would take forever to start and stop. While I'm not sure of the root cause, we switched to mounting a local SAN and that fixed the issue, though it could have been the Linux environment.
Finally, I like separating the MQ libraries from the runtime data so I can upgrade one machine while the second is running.
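In mount terms that separation is just the following (an illustrative line; filer name invented - compare the all-NFS layout earlier in the thread):

# /opt/mqm installed on local disk on each machine; only /var/mqm from the filer
netapp:/vol/mq/varmqm  -  /var/mqm  nfs  -  yes  rw,hard,intr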
HTH
Back to top

George Carey
Posted: Tue Jan 30, 2007 8:25 am
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

Your last comment, ashoon, about upgrades (or, more generally, planned maintenance) has validity. One does not want to update the libraries in the /opt/mqm filesystem while they are in use, which they would be in my configuration. But if these servers are just two of many being load-balanced across, then taking one out while the others carried the load would be part of the maintenance ops anyway.
Back to top

jefflowrey
Posted: Tue Jan 30, 2007 8:27 am
Grand Poobah | Joined: 16 Oct 2002 | Posts: 19981

Also, of course, I'm fairly sure this is an unsupported configuration.
Back to top

George Carey
Posted: Tue Jan 30, 2007 9:13 am
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

I don't believe that is true ... and to the exact point of my posting: give me valid reasons as to why it would not be.
One would think that if a configuration has no arguably invalid aspects and it works, then it is not an invalid configuration.
Back to top

Toronto_MQ
Posted: Tue Jan 30, 2007 9:18 am
Master | Joined: 10 Jul 2002 | Posts: 263 | Location: read my name

He didn't say it was "invalid", he said it was "unsupported". And from experience, I agree. There's a reason why MC91 was written; otherwise IBM's HA recommendation would be as simple as what you put forward. If you want this information, you may want to look into opening a question with IBM support.
Cheers
Steve
Back to top

George Carey
Posted: Tue Jan 30, 2007 9:45 am
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

<"He didn't say it was "invalid", he said it was "unsupported". ...>
I didn't say he said it was invalid, you did. I said: "I don't believe that is true .. "
I have opened this discussion with IBM support and have read their statement of supported configuration which is just a generic statement about when they will support MQ in HA environments and they have sent it up the chain to next support level for response, themselves. I am trying to get a valid set of reasons as to why this is a valid or is an invalid configuration. It is not supported does not fit that !!
I think I will go back to my original request for thoughtful and non-superficial responses. _________________ "Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from book titled "Insight" subtitled "A Study of Human Understanding") |
Back to top

George Carey
Posted: Tue Jan 30, 2007 11:25 am
Knight | Joined: 29 Jan 2007 | Posts: 500 | Location: DC

<"I am trying to get a valid set of reasons as to why this is a valid or is an invalid configuration. It is not supported does not fit that !! ">
Editorial re-wording:
I am trying to get a valid set of TECHNICAL reasons as to why this is an invalid configuration. It is not supported does not fit that !! _________________ "Truth is ... grasping the virtually unconditioned",
Bernard F. Lonergan S.J.
(from book titled "Insight" subtitled "A Study of Human Understanding") |
Back to top