saviobarr |
Posted: Tue Mar 13, 2018 6:24 am Post subject: Message Flow monitoring |
|
|
Centurion
Joined: 21 Oct 2014 Posts: 100 Location: Sao Paulo, Brazil
|
Hey everyone,
I need to monitor a message flow to know whether it is running. If it is stopped, I have to take some action.
Is there a built-in facility in IIB that provides this capability?
Have a great day
Savio Barros _________________ Go as far as you can go. Then go farther! |
|
abhi_thri |
Posted: Tue Mar 13, 2018 6:52 am Post subject: |
|
|
 Knight
Joined: 17 Jul 2017 Posts: 516 Location: UK
|
Hi, you can use the mqsilist command to check the state of a flow or application.
The command below lists all message flows on the node:
mqsilist <brokername> -r | grep "Message flow" | sort -k 8
You can also check a specific integration server with the '-e' flag:
mqsilist <brokername> -e <integration server>
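A minimal sketch of turning that output into a scripted check, in case it helps. It assumes each flow is reported on a line containing Message flow 'NAME' together with the words "is running" or "is stopped"; the exact wording differs between versions, so compare it with what your own mqsilist prints. The function name flow_state and the node and flow names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# flow_state NAME: read mqsilist output on stdin and print the state of one flow.
# ASSUMPTION: each flow appears on a line containing "Message flow 'NAME'"
# plus "is running" or "is stopped"; check your version's real output.
flow_state() {
    awk -v flow="$1" '
        # \047 is a single quote, so this matches the quoted flow name exactly.
        index($0, "Message flow \047" flow "\047") {
            if ($0 ~ /is running/)      { state = "running" }
            else if ($0 ~ /is stopped/) { state = "stopped" }
        }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Example use (hypothetical node and flow names):
#   mqsilist MYNODE -r | flow_state OrderFlow
```

If the same flow name is deployed to more than one integration server, this reports the last match, so narrow the listing with '-e' in that case.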
|
saviobarr |
Posted: Tue Mar 13, 2018 8:19 am Post subject: |
abhi_thri wrote: |
Hi, you can use the mqsilist command to check the state of a flow or application.
The command below lists all message flows on the node:
mqsilist <brokername> -r | grep "Message flow" | sort -k 8
You can also check a specific integration server with the '-e' flag:
mqsilist <brokername> -e <integration server> |
Hi,
Thanks for replying. I am considering mqsilist for this check, but it is a passive action (someone needs to run the command).
|
Vitor |
Posted: Tue Mar 13, 2018 9:15 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
saviobarr wrote: |
(someone needs to run the command) |
Like cron or some other piece of software?
In what cases do you think a flow would not be in execution / stopped? Without the broker producing a BIP message that doesn't need anyone to run anything? Why not check for that?
What action do you intend to take? How would you know to take the action, i.e. in the ideal (and probably non-existent) world where IIB has the built in resource you ask about, what would this resource do? _________________ Honesty is the best policy.
Insanity is the best defence. |
|
saviobarr |
Posted: Tue Mar 13, 2018 9:33 am Post subject: |
Hi,
Thanks for replying
Quote: |
In what cases do you think a flow would not be in execution / stopped? |
It's a client requirement.
Quote: |
Without the broker producing a BIP message that doesn't need anyone to run anything? Why not check for that? |
It works fine, but it is a passive action
Quote: |
What action do you intend to take? |
mqsistartmsgflow
Quote: |
How would you know to take the action |
That is my question: does IIB have some resource that does the job?
Quote: |
i.e. in the ideal (and probably non-existent) world where IIB has the built in resource you ask about, what would this resource do? |
What I configure it to do
I understand that IIB has no such built-in resource, so this requirement can be addressed with a script that runs the mqsilist command (among other actions).
Regards,
|
Vitor |
Posted: Tue Mar 13, 2018 9:51 am Post subject: |
saviobarr wrote: |
Quote: |
In what cases do you think a flow would not be in execution / stopped? |
It's a client requirement. |
Ok, that's neither an answer to my question nor a technical requirement.
saviobarr wrote: |
Quote: |
Without the broker producing a BIP message that doesn't need anyone to run anything? Why not check for that? |
It works fine, but it is a passive action |
No, it isn't. What's passive about parsing the log with Splunk (or similar) and cutting a ticket?
saviobarr wrote: |
Quote: |
What action do you intend to take? |
mqsistartmsgflow |
This is what I mean about it not being a technical requirement. So what your client wants is an automatic facility to check if the flow is not running and issue an mqsistartmsgflow if it's not. And when the flow stops running 30 seconds later for the same reason it stopped originally, automatically detect this fact and issue another mqsistartmsgflow. Which 30 seconds later will be followed by another automatically generated mqsistartmsgflow, then another, then another, etc.
What your client fails to grasp (and which you've clearly failed to convey to them) is that if a flow is configured to be running, the IBM software will restart it if it fails. The responsibility of the user is to find out why it failed and fix the root cause, not blindly restart it to fail again.
Also if you have this mechanism, how will you ever stop the flow to patch the broker? A stopped flow either means regularly scheduled maintenance or someone has mqsibroker authority who shouldn't.
saviobarr wrote: |
Quote: |
i.e. in the ideal (and probably non-existent) world where IIB has the built in resource you ask about, what would this resource do? |
What I configure it to do |
Manically issue an mqsistartmsgflow until the flow stays up?
You can do this with a cron. You don't actually need to check anything; put an mqsistartmsgflow command in the cron and have it issue the command every 10 seconds. An mqsistartmsgflow has no effect on a running flow, and nothing you've said here says anything about doing anything between the flow stopping and the flow being restarted (like investigation).
Now what we do (and obviously neither you nor your clients need this level of sophistication) is parse the log. If the parser sees a BIP message saying that the broker's had to restart the flow (or indeed the EG in which it runs) we cut a ticket to have the problem investigated. If it happens again within a configured time frame, the ticket is escalated. If the parser sees a BIP message saying the flow's been stopped manually, the ticket system (which happens to be our change control system) checks for an open change record covering the outage. If it sees one, it cuts a low priority ticket for someone to verify the flow restarted after the change (when we have more money, we'll automate cancelling the ticket when the BIP message about the flow starting turns up). If there's no change record, it cuts a priority 1 ticket and someone gets me out of bed.
Then bad things happen. Bad, bad, bad things.
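The triage rules described above could be sketched roughly as follows. This is not the actual setup described in the post, only an illustration: it assumes some earlier step has already turned the relevant BIP messages into lines of the form "EPOCH_SECONDS EVENT FLOW", and the event names AUTO_RESTART and MANUAL_STOP, the function name triage, and the yes/no change-record flag are all hypothetical.

```shell
#!/bin/sh
# triage WINDOW_SECONDS CHANGE_OPEN: read "EPOCH_SECONDS EVENT FLOW" lines on
# stdin and print one ticket decision per event.
# ASSUMPTIONS: EVENT is AUTO_RESTART or MANUAL_STOP, already derived from the
# BIP messages in your log; CHANGE_OPEN is "yes" when change control has an
# open record covering the outage. None of these names come from IIB itself.
triage() {
    awk -v window="$1" -v change_open="$2" '
        $2 == "AUTO_RESTART" {
            # Escalate when the same flow was restarted again inside the window.
            if (($3 in last) && ($1 - last[$3] <= window)) { print $3, "escalate" }
            else { print $3, "investigate" }
            last[$3] = $1
        }
        $2 == "MANUAL_STOP" {
            # A manual stop is only routine when a change record covers it.
            if (change_open == "yes") { print $3, "low-priority-verify" }
            else { print $3, "priority-1" }
        }
    '
}
```

In a real setup the change-record lookup would be a call into the ticketing system per event rather than a single flag.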
|
saviobarr |
Posted: Tue Mar 13, 2018 10:20 am Post subject: |
Quote: |
Ok, that's neither an answer to my question nor a technical requirement. |
It is a non-technical requirement.
Quote: |
No it isn't. What's passive about parsing the log with splunk (or similar) and cutting a ticket? |
Good idea. I will consider Splunk
Quote: |
This is what I mean about it not being a technical requirement. So what your client wants is an automatic facility to check if the flow is not running and issue an mqsistartmsgflow if it's not. |
The short answer is yes
Quote: |
And when the flow stops running 30 seconds later for the same reason it stopped originally, automatically detect this fact and issue another mqsistartmsgflow. Which 30 seconds later will be followed by another automatically generated mqsistartmsgflow, then another, then another, etc. |
Retry logic to control how many times it is started in a given time frame
Quote: |
What your client fails to grasp (and which you've clearly failed to convey to them) is that if a flow is configured to be running, the IBM software will restart it if it fails. |
Good observation
Quote: |
The responsibility of the user is to find out why it failed and fix the root cause, not blindly restart it to fail again. |
Yes, I agree, but to investigate the root cause the user first needs to be notified. Starting the message flow was just a loose example; issuing a notification fits the scenario well.
Quote: |
Also if you have this mechanism, how will you ever stop the flow to patch the broker? A stopped flow either means regularly scheduled maintenance |
Stop the robot before the maintenance.
Quote: |
or someone has mqsibroker authority who shouldn't |
Kill the guy who gave the broker authority and the one who got it.
Quote: |
Manically issue an mqsistartmsgflow until the flow stays up? |
Retry logic
Quote: |
You can do this with a cron. You don't actually need to check anything; put an mqsistartmsgflow command in the cron and have it issue the command every 10 seconds. An mqsistartmsgflow has no effect on a running flow, and nothing you've said here says anything about doing anything between the flow stopping and the flow being restarted (like investigation). |
I actually intend to trigger an investigation once the user gets notified
Quote: |
Now what we do (and obviously neither you nor your clients need this level of sophistication) is parse the log. If the parser sees a BIP message saying that the broker's had to restart the flow (or indeed the EG in which it runs) we cut a ticket to have the problem investigated. If it happens again within a configured time frame, the ticket is escalated. If the parser sees a BIP message saying the flow's been stopped manually, the ticket system (which happens to be our change control system) checks for an open change record covering the outage. If it sees one, it cuts a low priority ticket for someone to verify the flow restarted after the change (when we have more money, we'll automate cancelling the ticket when the BIP message about the flow starting turns up ). |
I think it would be overkill, considering the requirement and the budget.
Quote: |
If there's no change record, it cuts a priority 1 ticket and someone gets me out of bed.
Then bad things happen. Bad, bad, bad things |
Nobody wants that
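A rough sketch of the kind of retry limit mentioned above, assuming restart attempts are recorded as epoch timestamps in a small state file. The function name may_restart, the file path, and the node, server and flow names in the usage comment are hypothetical; `date +%s` is widely available but not guaranteed on every platform.

```shell
#!/bin/sh
# may_restart STATE_FILE MAX_ATTEMPTS WINDOW_SECONDS [NOW]
# Returns 0 and records the attempt if fewer than MAX_ATTEMPTS restarts were
# made in the last WINDOW_SECONDS; returns 1 otherwise (time to notify a human).
# ASSUMPTION: STATE_FILE holds one epoch timestamp per previous attempt.
may_restart() {
    state_file=$1; max=$2; window=$3
    now=${4:-$(date +%s)}
    [ -f "$state_file" ] || : > "$state_file"
    # Keep only the attempts that are still inside the window.
    awk -v now="$now" -v window="$window" 'now - $1 < window' "$state_file" > "$state_file.tmp"
    mv "$state_file.tmp" "$state_file"
    # Arithmetic expansion strips any padding that wc adds on some systems.
    count=$(( $(wc -l < "$state_file") ))
    if [ "$count" -lt "$max" ]; then
        echo "$now" >> "$state_file"
        return 0
    fi
    return 1
}

# Example use from cron (hypothetical node, server and flow names):
#   if may_restart /tmp/OrderFlow.attempts 3 3600; then
#       mqsistartmsgflow MYNODE -e default -m OrderFlow
#   else
#       echo "OrderFlow restart limit reached, needs investigation" >&2
#   fi
```

When the limit is reached the script stops restarting and hands over to a person, which keeps the root-cause investigation in the loop.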
I appreciate your time and your valuable considerations. Thank you!
Regards,
Savio
|
Vitor |
Posted: Tue Mar 13, 2018 10:30 am Post subject: |
saviobarr wrote: |
Quote: |
or someone has mqsibroker authority who shouldn't |
Kill the guy who gave the broker authority and the one who got it |
Vitor wrote: |
Then bad things happen. Bad, bad, bad things |
After they face me, their bruised and bleeding remains are carried to Cyber Security and anything left is dropped into the pit of Operational Risk Management. At this point, death is a mercy they can only beg for.
|
abhi_thri |
Posted: Wed Mar 14, 2018 1:22 am Post subject: |
Hi Savio, in case you are not aware: if an integration server (execution group) crashes for some reason, the startDataFlowEngine process (v8/v9) or one of the bip parent processes (v10) will attempt to restart it.
You can verify this by killing an EG explicitly; it should be restarted automatically, and you will see entries in the syslog showing this.
|
Vitor |
Posted: Wed Mar 14, 2018 5:09 am Post subject: |
abhi_thri wrote: |
Hi Savio, in case you are not aware: if an integration server (execution group) crashes for some reason, the startDataFlowEngine process (v8/v9) or one of the bip parent processes (v10) will attempt to restart it.
You can verify this by killing an EG explicitly; it should be restarted automatically, and you will see entries in the syslog showing this. |
I think I made these points above.
|
abhi_thri |
Posted: Wed Mar 14, 2018 5:35 am Post subject: |
Please ignore me, Vitor... I guess I was on too much caffeine today.
|
Vitor |
Posted: Wed Mar 14, 2018 7:08 am Post subject: |
abhi_thri wrote: |
too much caffeine |
No such thing.
|
gbaddeley |
Posted: Wed Mar 14, 2018 4:03 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
|
Vitor |
Posted: Thu Mar 15, 2018 4:56 am Post subject: |
Propaganda put about by the tea growers association.
|
zpat |
Posted: Thu Mar 15, 2018 7:37 am Post subject: |
|
|
 Jedi Council
Joined: 19 May 2001 Posts: 5866 Location: UK
|
Whatever means you use, it would help to allow for exceptions, for example read a file containing any flow names that are NOT to be auto-restarted.
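A small sketch of such an exception list, assuming a plain text file with one flow name per line where blank lines and lines starting with "#" are ignored. The file format and the function name is_excluded are made up for illustration.

```shell
#!/bin/sh
# is_excluded FLOW EXCLUDE_FILE: succeed if FLOW is listed in EXCLUDE_FILE.
# ASSUMPTION: one flow name per line; "#" comment lines and blank lines are
# skipped. This file format is hypothetical, not something IIB defines.
is_excluded() {
    [ -n "$1" ] || return 1
    [ -f "$2" ] || return 1
    # -F: fixed string, -x: whole line must match, -q: no output.
    grep -v '^[[:space:]]*#' "$2" | grep -Fxq -- "$1"
}

# Example use (hypothetical path and flow name):
#   is_excluded OrderFlow /etc/iib/no_autorestart.txt || echo "OrderFlow may be restarted"
```

Matching whole lines keeps a listed name such as BillingFlow from also excluding a flow called Billing.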
However, unless stopped manually, the flows will keep running - I've never seen a need for this sort of monitoring. You could use the administration API (the CMP API) to do it from Java.
If the flows read queues, you can monitor whether the queue is open for input using your MQ monitoring. _________________ Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error. |
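For the queue check, one possible sketch is to read the IPPROCS value (the number of handles that have the queue open for input) from the output of an MQSC DISPLAY QSTATUS command. The function name input_handles and the queue manager and queue names in the usage comment are hypothetical, and the parsing assumes the value appears as IPPROCS(n) in the output.

```shell
#!/bin/sh
# input_handles: read runmqsc output on stdin and print the first IPPROCS
# value found, or nothing if the attribute is not present.
# ASSUMPTION: the value is printed in the form IPPROCS(n).
input_handles() {
    sed -n 's/.*IPPROCS(\([0-9][0-9]*\)).*/\1/p' | head -n 1
}

# Example use (hypothetical queue manager and queue names):
#   echo "DISPLAY QSTATUS(ORDERS.IN) TYPE(QUEUE) IPPROCS" | runmqsc QM1 | input_handles
```

A value of 0 suggests nothing is reading the queue; a value above 0 only shows that some application has it open for input, which is not necessarily the flow.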
|