Author |
Message
|
issac |
Posted: Wed Jan 12, 2011 6:54 pm Post subject: How to auto-reset sequence number? (We have 200+ chls...) |
|
|
 Disciple
Joined: 02 Oct 2008 Posts: 158 Location: Shanghai
|
Hello,
I know it's a rigid question; please don't question my intention. I know how to set up an HA environment for WMQ, and I know that auto-resetting sequence numbers might lead to data loss.
My scenario: in our environment, we have over 200 channels in one QMGR. One day we hit IZ21977 and the whole cluster went down (because the gateway stopped redirecting messages). We tried switching to the standby AIX host, but IZ21977 made that useless. The gateway is connected to bank branches all over China. It took 5 people 3 hours to recreate the QMGR, reset all our channels, and get everything working again.
We have since upgraded our WMQ to 6.0.2.10, but we can't afford to hit another bug like this. We need a way to at least quickly reset the sequence numbers for all the channels if we have to recreate the QMGR.
Please help, thanks in advance. _________________ Bazinga! |
|
 |
fjb_saper |
Posted: Wed Jan 12, 2011 8:18 pm Post subject: Re: How to auto-reset sequence number? (We have 200+ chls... |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
issac wrote: |
We have since upgraded our WMQ to 6.0.2.10, but we can't afford to hit another bug like this. We need a way to at least quickly reset the sequence numbers for all the channels if we have to recreate the QMGR.
Please help, thanks in advance. |
Really simple. Use a script for runmqsc or mqsc...  _________________ MQ & Broker admin |
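A minimal sketch of what such a script could look like, written in Python. It only builds the MQSC text, which can then be piped into `runmqsc`. The helper name `build_reset_script`, the channel names and the queue manager name `QM1` are made up for illustration; `RESET CHANNEL(...) SEQNUM(...)` is the standard MQSC command.

```python
def build_reset_script(channels, seqnum=1):
    """Return MQSC text that resets every named channel to `seqnum`."""
    if seqnum < 1:
        raise ValueError("sequence numbers start at 1")
    # One RESET CHANNEL command per line; quotes preserve the name's case.
    return "".join(
        f"RESET CHANNEL('{name}') SEQNUM({seqnum})\n" for name in channels
    )


if __name__ == "__main__":
    # Hypothetical channel names. Pipe the output into runmqsc, e.g.
    #   python reset_channels.py | runmqsc QM1
    print(build_reset_script(["GW.TO.BRANCH001", "GW.TO.BRANCH002"]), end="")
```

Keeping the generation step separate from the execution step means the MQSC text can be reviewed before anything is run against a production queue manager.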
|
 |
santnmq |
Posted: Wed Jan 12, 2011 8:57 pm Post subject: |
|
|
Centurion
Joined: 11 Jan 2011 Posts: 125
|
At what point/condition do you want to auto-reset the sequence number? |
|
 |
issac |
Posted: Wed Jan 12, 2011 9:48 pm Post subject: I want seq no to auto-reset when I issue some command |
|
|
 Disciple
Joined: 02 Oct 2008 Posts: 158 Location: Shanghai
|
Thanks for your participation. I want seq nos to be auto-reset once I issue a command... _________________ Bazinga! |
|
 |
bruce2359 |
Posted: Wed Jan 12, 2011 10:08 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Quote: |
We need a way to at least quickly reset seq for all the channels if we have to recreate the QMGR.
I want seq nos to be auto-reset once I issue a command... |
A contradiction, yes? You want to automatically reset sequence number when you enter a command? _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Last edited by bruce2359 on Wed Jan 12, 2011 10:09 pm; edited 1 time in total |
|
 |
santnmq |
Posted: Wed Jan 12, 2011 10:09 pm Post subject: |
|
|
Centurion
Joined: 11 Jan 2011 Posts: 125
|
A simple solution to your problem is to create a script that contains the command you want to execute plus the additional MQSC commands to reset the sequence numbers.
Use this script in place of your command and you will get the desired result. |
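As a rough illustration of that wrapper idea, the Python sketch below runs a command first and then feeds the reset commands to `runmqsc` on stdin. The function name `run_then_reset`, the `runner` parameter (there only so the function can be exercised without MQ installed) and all names in the usage note are assumptions, not an existing tool.

```python
import subprocess


def run_then_reset(command, qmgr, channels, runner=subprocess.run):
    """Run `command`, then reset each channel on `qmgr` to sequence number 1.

    Returns whatever runmqsc printed, so the caller can check the results.
    """
    # Step 1: the command you wanted to run anyway (fails loudly on error).
    runner(command, check=True)
    # Step 2: one RESET CHANNEL per channel, passed to runmqsc on stdin.
    mqsc = "".join(f"RESET CHANNEL('{name}') SEQNUM(1)\n" for name in channels)
    done = runner(["runmqsc", qmgr], input=mqsc, capture_output=True, text=True)
    return done.stdout
```

A call might look like `run_then_reset(["./rebuild_qmgr.sh"], "QM1", ["GW.TO.BRANCH001"])`, where the script and object names are placeholders.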
|
 |
John89011 |
Posted: Wed Jan 12, 2011 11:01 pm Post subject: |
|
|
Voyager
Joined: 15 Apr 2009 Posts: 94
|
This can get quite ugly. Is your script going to look up the corresponding sequence number for each of the 200+ channels, or are you going to set it to 1, at which point you'll have to reset it to 1 on the other end as well? Just something to keep in mind... |
|
 |
zpat |
Posted: Wed Jan 12, 2011 11:03 pm Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
If you reset the sender end, the receiver end is reset to the same value automatically the next time the channel starts. |
|
 |
santnmq |
Posted: Wed Jan 12, 2011 11:08 pm Post subject: |
|
|
Centurion
Joined: 11 Jan 2011 Posts: 125
|
Creating a script is a one-time effort to get this resolved, but at the same time you can look for other feasible options.
The main constraint here is that you want to reset the sequence numbers at the time you execute your particular command. I will keep this in mind as well. |
|
 |
santnmq |
Posted: Wed Jan 12, 2011 11:26 pm Post subject: |
|
|
Centurion
Joined: 11 Jan 2011 Posts: 125
|
Additionally, if you use some advanced scripting features, you may not need to list all 200+ channels in the script.
Give some thought to generating the channel list as well. |
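One way to avoid hard-coding the channel list, sketched in Python: capture the output of `DISPLAY CHANNEL(*)` from `runmqsc` and pull out the names of the channels of one type. The sample text in `SAMPLE` is an assumption about what that output looks like on a distributed platform, and `channels_of_type` is a hypothetical helper, so check both against your own system.

```python
import re

# Assumed shape of each row:  CHANNEL(name)   CHLTYPE(type)
_ROW = re.compile(r"CHANNEL\(([^)]+)\)\s+CHLTYPE\((\w+)\)")

SAMPLE = """\
AMQ8414: Display Channel details.
   CHANNEL(GW.TO.BRANCH001)                CHLTYPE(SDR)
AMQ8414: Display Channel details.
   CHANNEL(BRANCH001.TO.GW)                CHLTYPE(RCVR)
AMQ8414: Display Channel details.
   CHANNEL(SYSTEM.DEF.SENDER)              CHLTYPE(SDR)
"""


def channels_of_type(display_output, chltype="SDR"):
    """Return the non-SYSTEM channel names of the given type."""
    return [
        name.strip()
        for name, kind in _ROW.findall(display_output)
        if kind == chltype and not name.strip().startswith("SYSTEM.")
    ]
```

The resulting list can be handed to whatever generates the `RESET CHANNEL` commands, so new channels are picked up without editing the script.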
|
 |
exerk |
Posted: Thu Jan 13, 2011 12:13 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Your script will have to parse the queue manager error log for each channel name that is in retry due to a mismatched sequence number, and those channels may be both SDR and RCVR. It will have to parse the "...expected sequence number..." part of the entry and issue a RESET command against that channel, or in the case of SDR channels just reset to the initial value. Bear in mind that if you do the latter, the receiving end's admins may interpret it as an error occurrence, so you need to inform them of what you are going to do (in advance preferably, so they can correlate their potential 'error' with your actions), or reset your end to the number expected at their end. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
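A minimal Python sketch of that log-parsing step. It assumes the sequence error is logged as AMQ9526 with wording like the `SAMPLE` text below; the exact wording and line wrapping can differ by version and platform, so treat the pattern as a starting point. `sequence_errors` is a hypothetical helper name.

```python
import re

# Whitespace is matched loosely because error log entries wrap across lines.
_SEQ_ERROR = re.compile(
    r"AMQ9526:.*?channel\s+'([^']+)'"
    r".*?sequence\s+number\s+(\d+)\s+has\s+been\s+sent"
    r"\s+when\s+sequence\s+number\s+(\d+)\s+was\s+expected",
    re.DOTALL,
)

SAMPLE = """\
AMQ9526: Message sequence number error for channel 'GW.TO.BRANCH001'.

EXPLANATION:
The local and remote queue managers do not agree on the next message sequence
number.  A message with sequence number 1 has been sent when sequence number 57
was expected.
"""


def sequence_errors(log_text):
    """Return (channel, sent, expected) for each sequence error in the log."""
    return [
        (name.strip(), int(sent), int(expected))
        for name, sent, expected in _SEQ_ERROR.findall(log_text)
    ]
```

Which of the two numbers goes into `RESET CHANNEL ... SEQNUM(n)` depends on which end you are fixing, so the sketch returns both and leaves that decision to the caller.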
|
 |
Mr Butcher |
Posted: Thu Jan 13, 2011 2:08 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
He is looking for a solution after the queue manager was re-created.
That means every channel used before is out of sync.
So as a first step I would reset every sender channel.
For receivers it is more difficult. The sequence error only shows up when a message is received, not when the channel becomes "running".
Since you do not know when the next message will arrive on your receiver, you either coordinate with your partners and have them reset their sender, or have them send a message so you know which sequence number to reset your receiver to.
If you use a script to do that automatically, that script must run permanently, as you never know when the channel will be started and when the first message will arrive and cause the sequence error.
For situations like that we have set up a loopback queue with our customers. If we send a message to that queue at the customer end, it is sent back to us. Using this "feature" we can verify the sender and receiver channels' sequence numbers at any time. If the customer uses DISCINT / triggering, this will also start the connection from the customer to us.
In case of sequence errors, we can fix both channels without needing to contact the customer or wait for any activity on the customer's side. We use this for initial connection setup as well as after migrations, software updates, network changes and so on, since we can exchange messages without any applications being active.
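The post does not say how the loopback queue is defined, so the following Python sketch shows just one possible setup: a remote queue definition at the customer end that points straight back at a local queue on our side. All object names are invented, and it assumes each transmission queue is named after the remote queue manager.

```python
def loopback_definitions(our_qmgr, partner_qmgr, reply_queue="LOOPBACK.REPLY"):
    """Return the MQSC definitions for both ends of a loopback test path."""
    bounce = f"LOOPBACK.TO.{our_qmgr}"  # hypothetical name at the partner end
    ours = [
        # Where the returned test message finally lands.
        f"DEFINE QLOCAL('{reply_queue}')",
        # Where we put the test message; it travels over our sender channel.
        f"DEFINE QREMOTE('LOOPBACK.VIA.{partner_qmgr}') RNAME('{bounce}') "
        f"RQMNAME('{partner_qmgr}') XMITQ('{partner_qmgr}')",
    ]
    partner = [
        # Resolves straight back to our reply queue over their sender channel.
        f"DEFINE QREMOTE('{bounce}') RNAME('{reply_queue}') "
        f"RQMNAME('{our_qmgr}') XMITQ('{our_qmgr}')",
    ]
    return {"ours": ours, "partner": partner}
```

With definitions like these, one test message exercises both the outbound and the inbound channel, so a sequence mismatch on either side shows up without waiting for application traffic.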
Of course I agree it's always good to contact the customer, but in our case the first priority is to get things working again and then tell the customer. _________________ Regards, Butcher |
|
 |
exerk |
Posted: Thu Jan 13, 2011 5:32 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Mr Butcher wrote: |
He is looking for a solution after the queue manager was re-created.
That means every channel used before is out of sync.
So as a first step I would reset every sender channel... |
I stand by my earlier statement in regard to what the receiving end's admins may think of that. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
 |
Mr Butcher |
Posted: Thu Jan 13, 2011 5:52 am Post subject: |
|
|
 Padawan
Joined: 23 May 2005 Posts: 1716
|
I would not care, as a reset was required in that situation. What I do not like are (inexperienced) administrators who are faced with a non-running channel and just do a reset because it might help, without checking whether there actually is a sequence number problem.
I also cannot think of any situation where I would have a script, monitoring or similar that is based on the real sequence number. As long as there is no sequence number mismatch, I don't care about the actual number.
And if one of my customers resets his sender because he had to rebuild his queue manager, that is fine with me.
Of course I like to be notified that there was a problem if this is a production issue (as I may have received alerts), but not as the first step, as I see the requirements of the one who has the problem: he has to be back online fast.
You can't call 200 customers one after the other if you have such a problem, e.g. on a trading system. Even if it only takes 1 minute per customer, you need more than 3 hours until the last one is running.
I am sure the remote administrator is glad too if he can report that there was an issue that was fixed fast, no matter whether it was his problem or not. Or maybe it's better to answer "they called me to help fix the issue, but I was out for lunch"?
I see your point. I would maybe say the same if that's just 1 connection, but I can't do that for 200 customers on a trading system. _________________ Regards, Butcher |
|
 |
exerk |
Posted: Thu Jan 13, 2011 6:13 am Post subject: |
|
|
 Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Mr Butcher wrote: |
...I see your point. I would maybe say the same if that's just 1 connection, but I can't do that for 200 customers on a trading system. |
Hence why I stated that the contact should be up-front, i.e. "...we have had this situation, and if it happens again and we have to rebuild, we will reset all SDR channels to the default at our end, so please be aware of that process...". I too would not want the latency of a 'yes' from others to get things up and running, unless they had specifically requested that contact. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
 |
|