
MQSeries.net Forum Index :: General IBM MQ Support
Question about long running UOW's
GheorgheDragos
Posted: Thu Aug 29, 2019 12:14 am

Newbie

Joined: 28 Jun 2018
Posts: 9

Dears,

We have encountered the following situation in which, to my regret, I made a mistake by claiming in front of the customer that "we cannot help from the mainframe side". I am not proud of myself, and it is not like me at all to dismiss a possibility like that.
For several days one of our queue managers has not taken a checkpoint. We identified the culprit queues, channels and servers. We informed the customer to commit their threads, and we were informed that they do not have the knowledge on how to do so, their application using JMS. We had a meeting and they asked me: when we stop/start the application, how come that doesn't solve the issue (since they restart it twice per night, at 1 am and at 5 am)? I replied that when an application is stopped, at restart it will re-use any hanging threads in order to complete them, just like CICS starting up in warm mode, etc. They then asked for confirmation that the problem is coming from their side into the mainframe, and whether there is anything I can do from my side to fix it. The channel being a SVRCONN, it is impossible for us to run a commit. So I replied with a definitive NO (well, more politely than that) and suggested that they commit all their units of work. Later on that day I received an email from them with an IBM article about a possibility to use STOP CHANNEL(FORCE)!!!!!!! to commit the threads, from the RECEIVING side. I had no idea about this, and I feel terrible, having already stopped the channel several times from my side (in normal quiesce mode). I would like to ask the following:

KAINT - One SVRCONN has multiple threads open against it (running instances), coming from different servers, all from the same application. In the case of an unexpected TCP/IP problem, when the connection between the client and the mainframe is broken and the orphaned thread stays hanging on the mainframe (is that even possible? we are running MQ 8), can KAINT help? For example, say 30 threads are active (and, by the way, these threads might not even be doing any I/O, just keeping the channel active), TCP/IP breaks down, and one incoming message is on its way. Can this message somehow remain in a 'loop' on the MQ side, and will it be 'purged' when the thread it was assigned to gets timed out by KAINT? Does KAINT manage the channel itself, or each individual thread? For example, we have 30 individual threads on one SVRCONN; 29 are idle, yet still active, and one is doing active I/O. Will the 29 threads be timed out?

Thank you for your time.

Dragos Gheorghe
hughson
Posted: Thu Aug 29, 2019 1:54 am

Grand Master

Joined: 09 May 2013
Posts: 1220
Location: Bay of Plenty, New Zealand

When an application running over a SVRCONN ends, and the CHINIT detects that ending of the SVRCONN, either because you stopped it, or because you used HBINT or KAINT to help to detect broken TCP/IP connections, the SVRCONN will, as part of its clean-up, issue an MQBACK to roll back any MQ-only UoWs.

If these UoWs you are having trouble with were single phase commit UoWs they would already be tidied up. Therefore I have to assume that these are 2-phase commit transactions?
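
To make the clean-up described above concrete, here is a minimal toy model in Java. It is only a sketch under stated assumptions: `ToyQueue` and `ToyChannelInstance` are hypothetical names invented for illustration, not IBM MQ classes, and the model covers only a single-phase (MQ-only) unit of work. It shows messages that were got under syncpoint being put back when the channel instance ends without a commit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy in-memory queue; a stand-in for illustration only, not an IBM MQ API. */
class ToyQueue {
    private final Deque<String> messages = new ArrayDeque<>();

    void put(String msg) { messages.addLast(msg); }

    String pollFirst() { return messages.pollFirst(); }

    void restoreFirst(String msg) { messages.addFirst(msg); }

    int depth() { return messages.size(); }
}

/** Toy model of one SVRCONN channel instance holding a single-phase unit of work. */
class ToyChannelInstance {
    private final ToyQueue queue;
    // Messages removed under syncpoint but not yet committed.
    private final Deque<String> inFlight = new ArrayDeque<>();

    ToyChannelInstance(ToyQueue queue) { this.queue = queue; }

    /** Destructive get under syncpoint: the message stays part of the open UoW. */
    String getUnderSyncpoint() {
        String msg = queue.pollFirst();
        if (msg != null) inFlight.addLast(msg);
        return msg;
    }

    /** Ends the UoW and makes the gets permanent. */
    void commit() { inFlight.clear(); }

    /** Ends the UoW and puts uncommitted messages back in their original order. */
    void backout() {
        while (!inFlight.isEmpty()) queue.restoreFirst(inFlight.pollLast());
    }

    boolean hasOpenUow() { return !inFlight.isEmpty(); }

    /** Channel instance ends (stopped, or a broken connection was detected): implicit backout. */
    void end() { backout(); }
}

class UowCleanupDemo {
    public static void main(String[] args) {
        ToyQueue q = new ToyQueue();
        q.put("m1");
        q.put("m2");
        q.put("m3");

        ToyChannelInstance ch = new ToyChannelInstance(q);
        ch.getUnderSyncpoint();
        ch.getUnderSyncpoint();
        System.out.println("depth with open UoW: " + q.depth());

        ch.end(); // no commit was issued, so both messages are backed out
        System.out.println("depth after channel end: " + q.depth());
    }
}
```

In this model the two gets take the queue depth from 3 to 1, and `end()` restores it to 3 in the original order. A 2-phase commit transaction does not fit this picture, because its outcome is decided by the external transaction coordinator rather than by the channel.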

I am not aware of an article from IBM that suggests doing a STOP MODE(FORCE) on a channel to Commit any transaction. Is it possible that you could provide us with the link to the article that they sent you?

P.S. KAINT is generally unnecessary these days, since from MQ V7 onwards client channels have much better heart-beating. Check that your SVRCONNs are using HBINT by looking at the HBINT value on DISPLAY CHSTATUS, which will show you the negotiated value.
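
As a rough illustration of how per-connection liveness detection behaves, the following Java sketch models a monitor that ends only those channel instances whose partner has gone silent. `LivenessMonitor` and its methods are hypothetical names, and the single fixed timeout is an assumption; this is not how the CHINIT is implemented, and the real timing rules for HBINT and TCP keepalive are more involved. The point it illustrates is that an idle but reachable client keeps answering probes and so is not ended, while a connection that has broken stops answering and is detected.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy liveness monitor with one entry per channel instance. Not an IBM MQ API. */
class LivenessMonitor {
    private final long timeoutSeconds;
    // Last time anything (data or a probe reply) arrived from the partner, per instance.
    private final Map<String, Long> lastHeardFrom = new LinkedHashMap<>();

    LivenessMonitor(long timeoutSeconds) { this.timeoutSeconds = timeoutSeconds; }

    void register(String instanceId, long now) { lastHeardFrom.put(instanceId, now); }

    /** Record data or a probe reply from the partner of a registered instance. */
    void heardFrom(String instanceId, long now) {
        if (lastHeardFrom.containsKey(instanceId)) lastHeardFrom.put(instanceId, now);
    }

    /** Remove and return the instances whose partner has been silent for too long. */
    List<String> sweep(long now) {
        List<String> ended = new ArrayList<>();
        lastHeardFrom.entrySet().removeIf(e -> {
            boolean silentTooLong = now - e.getValue() > timeoutSeconds;
            if (silentTooLong) ended.add(e.getKey());
            return silentTooLong;
        });
        return ended;
    }

    int running() { return lastHeardFrom.size(); }
}

class LivenessDemo {
    public static void main(String[] args) {
        LivenessMonitor monitor = new LivenessMonitor(90);
        for (int i = 1; i <= 30; i++) monitor.register("instance-" + i, 0);

        // Instances 1..29 are idle or busy but reachable: they answer at t=60.
        for (int i = 1; i <= 29; i++) monitor.heardFrom("instance-" + i, 60);
        // instance-30 sits behind a broken connection and never answers.

        System.out.println("ended: " + monitor.sweep(100));
        System.out.println("still running: " + monitor.running());
    }
}
```

Each instance is tracked separately here, so the silence of one broken connection has no effect on the other 29.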

Cheers,
Morag
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
bruce2359
Posted: Thu Aug 29, 2019 2:26 am

Poobah

Joined: 05 Jan 2008
Posts: 8464
Location: US: west coast, almost. Otherwise, enroute.

Some more general questions:

Has this application worked properly in the past?

You mentioned CICS. Is CICS involved in this? Are any other resource managers (databases, for example) involved in the UofW?

When did this problem begin? Has the app been modified recently?

What errors did you see in the error logs?
_________________
There are two types of people in this world:
1) Those that can extrapolate from incomplete data
HubertKleinmanns
Posted: Thu Aug 29, 2019 4:30 am

Yatiri

Joined: 24 Feb 2004
Posts: 630
Location: Germany

GheorgheDragos wrote:
Later on that day I received an email from them with an IBM article about a possibility to use STOP CHANNEL(FORCE)!!!!!!! to commit the threads, from the RECEIVING side. I had no idea about this, and I feel terrible, having already stopped the channel several times from my side (in normal quiesce mode).


I would never stop a running channel with mode force - you may lose messages. The only situation where I had to stop a channel with mode force was a sender channel in state BINDING. This happens, e.g., when a firewall blocks the connection. In this case no messages have passed through the channel, so a stop channel with mode force is harmless.

In addition, only the application that uses the channel is able to decide whether a COMMIT has to be done or not. You as an administrator do not (normally) know the application's logic.

- When the application stops in a "normal" manner, the QMgr will perform a COMMIT.

- When the application crashes, the QMgr will perform a BACKOUT.

So there is no need for MQ administrators to force a COMMIT (except in some special cases).

GheorgheDragos wrote:
For example, we have 30 individual threads on one SVRCONN; 29 are idle, yet still active, and one is doing active I/O. Will the 29 threads be timed out?


I have often seen such situations at my customers. Normally

1. either the client app does not properly close a connection (and "forgets" the connection handle)

2. or a firewall cuts the connection (e.g. due to low traffic).

The second case sometimes occurs in test environments, whereas in production the channel, and so the IP connection, is busier.
_________________
Regards
Hubert
tczielke
Posted: Thu Aug 29, 2019 5:25 am

Sentinel

Joined: 08 Jul 2010
Posts: 830
Location: Illinois, USA

You can definitely perform rollback and commits with JMS. If the JMS developer is not aware of this, they need to familiarize themselves with the JMS 2.0 specification.
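
In JMS, a transacted session ends its unit of work with `Session.commit()` or `Session.rollback()` (the JMS 2.0 `JMSContext` has the same two calls). The sketch below shows the usual shape of a consumer loop that never leaves a unit of work open. So that it runs without a queue manager, it uses a hypothetical `TransactedSession` interface and an in-memory `InMemorySession`; both are assumptions made for illustration and are not the javax.jms API. Only the method names `commit` and `rollback` mirror real JMS calls.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a transacted JMS session; not the javax.jms API. */
interface TransactedSession {
    String receiveNoWait(); // returns null when no message is available
    void commit();          // ends the unit of work and makes the gets permanent
    void rollback();        // ends the unit of work and returns messages to the queue
}

/** In-memory implementation so the example runs without a queue manager. */
class InMemorySession implements TransactedSession {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Deque<String> uncommitted = new ArrayDeque<>();

    InMemorySession(List<String> initial) { queue.addAll(initial); }

    public String receiveNoWait() {
        String msg = queue.pollFirst();
        if (msg != null) uncommitted.addLast(msg);
        return msg;
    }

    public void commit() { uncommitted.clear(); }

    public void rollback() {
        while (!uncommitted.isEmpty()) queue.addFirst(uncommitted.pollLast());
    }

    int depth() { return queue.size(); }

    boolean hasOpenUow() { return !uncommitted.isEmpty(); }
}

class CommitLoop {
    /**
     * Processes up to maxMessages, ending the unit of work after every message:
     * commit on success, rollback on failure. Returns the number committed.
     */
    static int drain(TransactedSession session, Consumer<String> handler, int maxMessages) {
        int committed = 0;
        for (int i = 0; i < maxMessages; i++) {
            String msg = session.receiveNoWait();
            if (msg == null) break;
            try {
                handler.accept(msg);
                session.commit();
                committed++;
            } catch (RuntimeException e) {
                session.rollback(); // never leave the UoW open after a failure
                break;              // stop so a failing message is not retried in a tight loop
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        InMemorySession session = new InMemorySession(Arrays.asList("a", "b", "bad", "c"));
        int done = drain(session, m -> {
            if (m.equals("bad")) throw new IllegalStateException("cannot process " + m);
        }, 10);
        System.out.println("committed=" + done + " remaining=" + session.depth()
                + " openUow=" + session.hasOpenUow());
    }
}
```

Committing after every message is the simplest choice; batching several messages per commit is also common, as long as every path out of the loop ends in either a commit or a rollback.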
_________________
Working with MQ since 2010.
Vitor
Posted: Thu Aug 29, 2019 5:26 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25815
Location: Ohio, USA

Taking a step back, I would express the opinion that it's a no-brainer for the processing application to control UOW. What would happen if, instead of MQ it was a database and the database was determining when and if to commit or rollback any work?

Moving specifically to this:

GheorgheDragos wrote:
We informed the customer to commit their threads, and we were informed that they do not have the knowledge on how to do so, their application using JMS


There's another comment here about no brains.

I would have, with minimal politeness, advised them to maybe learn how to use the programming framework they've elected to employ.

I also throw into the mix that my lack of knowledge about Java is legend on this forum. I typed "controlling unit of work with JMS" into Google and not only got this, this and this as the first 3 hits but also this instructional video.

GheorgheDragos wrote:
Later on that day I received an email from them with an IBM article about a possibility to use STOP CHANNEL(FORCE)!!!!!!! to commit the threads, from the RECEIVING side.


Post the link. I would rebut it with this, which is not an "article" but from the product documentation and has the following to say about the FORCE option:

Quote:
The channel does not complete processing the current batch of messages, and can, therefore, leave the channel in doubt. In general, consider using the quiesce stop option.


So would your customer prefer to recode their application to handle commits correctly, or recode their application (and the containing server in all likelihood) to handle in-doubt client channels?

(Hint: the correct answer is "handle commits" unless their admin is a masochist who enjoys spending time at the office)
_________________
Honesty is the best policy.
Insanity is the best defence.
GheorgheDragos
Posted: Mon Sep 09, 2019 9:17 am

Newbie

Joined: 28 Jun 2018
Posts: 9

The situation was handled as follows: the official response was that it is, of course, not possible to commit anything from the SVRCONN side; however, stop mode(force) should work, according to the manual. Unfortunately our shop is a little bit unorganized, so we don't have a JMS specialist to work on their application and fix the way its error handling works. The stop mode(force) was done while the application was stopped. After that we were able to take the checkpoint, and this particular administrator looked like he didn't know what he was talking about when he said that the action (commit) has to be taken from the sender side and that we cannot help from the mainframe side.
Vitor
Posted: Mon Sep 09, 2019 10:38 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 25815
Location: Ohio, USA

GheorgheDragos wrote:
The stop mode(force) was done while the application was stopped.


I hope it was disconnected as well, not just stopped but using a JMS connection pool that was still active.

I'm still waiting for a link to this IBM article they sent.

How's the channel now? Running properly, running properly but some weirdness, in-doubt or just plain broken?

Is the customer happy that the resolution to this problem is to stop the application periodically so you can force the channel (risking integrity) and take a checkpoint (which given the use of a FORCE may not be worth the electrons you're using to store it)? Do they still think this is better and has less business risk than learning how to use JMS properly?
_________________
Honesty is the best policy.
Insanity is the best defence.