IIB's database connection management philosophy
McueMart
Posted: Thu May 14, 2015 3:28 am    Post subject: IIB's database connection management philosophy

Chevalier

Joined: 29 Nov 2011
Posts: 490
Location: UK...somewhere

Hi all,

This post is an account of my experience, and is an opportunity for the knowledgeable people of this forum (hopefully including the devs...) to give their take on it.

When broker is managing ODBC/JDBC connections, the following set of steps can occur:

1) A flow processes message 1 and successfully connects to ODBC DSN 'TEST'
2) The 'TEST' database is now restarted
3) The flow processes message 2, and this time the flow throws an exception along the lines of: "[unixODBC]Communication link failure , [unixODBC]Disconnect error.".
4) The flow processes message 3, and this time the broker will successfully reconnect to the datasource and function normally.

I have confirmed via PMR that this is the expected behavior. What I'm having a hard time understanding is why it has been implemented like this, and what we can do about it.

Ideally I would like it to work like this:

1) A flow processes message 1 and successfully connects to ODBC DSN 'TEST'
2) The 'TEST' database is now restarted
3) A flow processes message 2. The broker detects that the ODBC connection is broken and attempts to reconnect it. The reconnected connection is then used to process the work - which succeeds.


So hopefully it's clear from this example that I don't understand why broker has to function like:

"Ohhh yea, I see this connection is broken - let me throw an exception to let you know this! After that I'll mark the connection as broken and clear this connection out of my pool so that future messages will attempt to reconnect".


From a user's perspective, I don't want broker throwing an exception when the connection is broken UNLESS it really cannot re-connect (i.e. the DB is down).
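To make the difference between the two sequences above concrete, here is a minimal Python sketch. It is not how broker is implemented internally; every name in it (FakeDatabase, FakeConnection, EvictOnFailure, ReconnectBeforeUse) is hypothetical and only simulates a database restart in memory.

```python
class BrokenConnection(Exception):
    """Raised when a statement runs on a connection the database has dropped."""


class FakeDatabase:
    """Hypothetical stand-in for the 'TEST' DSN; restart() invalidates old connections."""

    def __init__(self):
        self.generation = 0

    def connect(self):
        return FakeConnection(self, self.generation)

    def restart(self):
        self.generation += 1


class FakeConnection:
    def __init__(self, db, generation):
        self.db = db
        self.generation = generation

    def is_alive(self):
        # A connection opened before the last restart is considered dead.
        return self.generation == self.db.generation

    def execute(self, statement):
        if not self.is_alive():
            raise BrokenConnection("Communication link failure")
        return f"ok: {statement}"


class EvictOnFailure:
    """Observed behaviour: the failing message gets an exception, the next one reconnects."""

    def __init__(self, db):
        self.db = db
        self.conn = None

    def process(self, statement):
        if self.conn is None:
            self.conn = self.db.connect()
        try:
            return self.conn.execute(statement)
        except BrokenConnection:
            self.conn = None  # drop the dead connection so the next message reconnects
            raise


class ReconnectBeforeUse:
    """Wished-for behaviour: check the connection, reconnect if needed, then do the work."""

    def __init__(self, db):
        self.db = db
        self.conn = None

    def process(self, statement):
        if self.conn is None or not self.conn.is_alive():
            self.conn = self.db.connect()
        return self.conn.execute(statement)
```

With EvictOnFailure, message 2 after a restart raises BrokenConnection and message 3 succeeds; with ReconnectBeforeUse, message 2 succeeds directly. In a real driver the liveness check costs a round trip to the database, and it does not help if the connection dies in the middle of a unit of work.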

My thoughts on how we can work around this kind of error:

1) Implement logic down the 'catch' leg of a flow which detects this particular type of exception. *Non-transactionally*, send the original input message back to the input queue. The flow will then have to roll back (we can't commit any other work which might have been done via another DataSource).

2) Put procedures in place that any time a database is restarted, any execution groups which had connections to it need to also be restarted. This wouldn't fix the issue of the odd network blip causing a DB connection to be broken though.

3) Raise an RFE to ask IBM to re-visit their implementation, or at least provide a 'Clear connection pool' facility, so we can immediately clear the 'broken' connection pool after we know a database has been restarted. IBM has stated that this behavior has been in place for a long time, so I fear my chances of success if I go this route...


I couldn't find any previous discussion of this (there is lots of discussion about broker's (lack of) connection pooling, but nothing on this precise issue), so any input is appreciated.
mqjeff
Posted: Thu May 14, 2015 4:14 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

There is probably a good reason this was done. It may be lost in design meetings from very long ago.

Your own flow should be able to detect the database connection error and retry the operation, which should result in a reconnect. This would then ensure that message 2 was not skipped.
Vitor
Posted: Thu May 14, 2015 4:33 am    Post subject: Re: IIB's database connection management philosophy

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

McueMart wrote:
1) Implement logic down the 'catch' leg of a flow which detects this particular type of exception. *Non-transactionally*, send the original input message back to the input queue. The flow will then have to roll back (we can't commit any other work which might have been done via another DataSource).


I'd sooner catch the database exception inside the ESQL (if appropriate) or put a Try Catch node immediately before the database operation. That way you can retry without the flow rolling back except in the case of a persistent error - the database may not have simply been restarted, it might have crashed and/or the network blip might be a long drawn out tone......
_________________
Honesty is the best policy.
Insanity is the best defence.
McueMart
Posted: Thu May 14, 2015 4:52 am

Chevalier

Joined: 29 Nov 2011
Posts: 490
Location: UK...somewhere

mqjeff wrote:
There is probably a good reason this was done. It may be lost in design meetings from very long ago.


I can believe there was a reason, but for the life of me I can't figure out what it would be.


Vitor wrote:
I'd sooner catch the database exception inside the ESQL


Good shout - let me see if I can catch this exception in an ESQL Handler, and immediately retry the same query (I guess we can implement our own 'retry counter' kind of implementation).

I am not keen on the try/catch node approach as we have hundreds of flows which could suffer this issue.
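As a rough illustration of the 'retry counter' idea, here is a language-neutral sketch in Python rather than ESQL. The names ConnectionBroken and run_with_retry are made up for the example, and it assumes the common DB procedure can tell a broken-connection error apart from other database errors.

```python
import time


class ConnectionBroken(Exception):
    """Hypothetical marker for a 'Communication link failure' style error."""


def run_with_retry(operation, max_retries=1, delay_seconds=0.0):
    """Call operation(); on ConnectionBroken, retry up to max_retries times.

    Any other exception, or a ConnectionBroken that outlives the retries,
    is re-raised so the flow's normal error handling still fires.
    """
    attempts = 0
    while True:
        try:
            return operation()
        except ConnectionBroken:
            if attempts >= max_retries:
                raise  # persistent failure, e.g. the database really is down
            attempts += 1
            if delay_seconds:
                time.sleep(delay_seconds)


def make_flaky(failures):
    """Return an operation that fails `failures` times, then reports its call count."""
    state = {"left": failures, "calls": 0}

    def operation():
        state["calls"] += 1
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionBroken("Communication link failure")
        return state["calls"]

    return operation
```

Calling run_with_retry(make_flaky(1)) succeeds on the second call, which is the 'database was restarted' case. With make_flaky(2) and the default max_retries=1, the exception still propagates. The retry count and delay are left as parameters because the right values depend on the application.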
Vitor
Posted: Thu May 14, 2015 5:02 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

McueMart wrote:
I am not keen on the try/catch node approach as we have hundreds of flows which could suffer this issue.


This is why you have subflows, User Defined Nodes, patterns and all the other good stuff for adding common capabilities to flows.

If you choose to raise an RFE to fix this at the software level, post the link here and chase votes. Speaking personally, I prefer to handle this as I've described, as recovery actions (how long between retries, number of retries, etc.) vary between applications and use cases. If the only time this situation occurred was when the database was deliberately restarted and the connection could be reestablished, we'd be fine.

For the record, if we do a database restart here we tend to do it in a maintenance window and restart all the attached applications (including EGs) as a matter of course.
_________________
Honesty is the best policy.
Insanity is the best defence.
inMo
Posted: Thu May 14, 2015 5:07 am

Master

Joined: 27 Jun 2009
Posts: 216
Location: NY

2 cents ... The DB restart is only one way to cause a broken DB connection. Your higher level question should be focused on how you deal with broken DB connections within your framework. Some form of retry seems appropriate. How to retry, when to retry, and what to retry are all up to you and in your control. You as the designer get to tailor the experience to your particular use case(s). Seems like that is a pretty good design from "very long ago".
mqjeff
Posted: Thu May 14, 2015 5:12 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

inMo wrote:
Seems like that is a pretty good design from "very long ago".


What I mean by "very long ago" is "either in the 2.1 timeframe or in the v5 timeframe".
McueMart
Posted: Thu May 14, 2015 5:13 am

Chevalier

Joined: 29 Nov 2011
Posts: 490
Location: UK...somewhere

Vitor wrote:
This is why you have subflows, User Defined Nodes, patterns and all the other good stuff for adding common capabilities to flows.


Fortunately we do have a common procedure which we use for all DB interactions! So (hopefully) fixing this issue isn't the nightmare that it could be.

Vitor wrote:

Speaking personally, I prefer to handle this as I've described, as recovery actions (how long between retries, number of retries, etc.) vary between applications and use cases. If the only time this situation occurred was when the database was deliberately restarted and the connection could be reestablished, we'd be fine.


Generally, if something has gone *really* bad, such as a DB crashing, we do want the flow to throw an exception, which will result in alerts being raised and an investigation. It is just for this simple case of 'db connection broken - let's retry it once' that I think an in-code solution is appropriate.


Quote:
For the record, if we do a database restart here we tend to do it in a maintenance window and restart all the attached applications (including EGs) as a matter of course.


I am sure we could implement something similar (and we probably will...) - I just prefer to solve problems at the software level if feasible.
inMo
Posted: Thu May 14, 2015 5:16 am

Master

Joined: 27 Jun 2009
Posts: 216
Location: NY

Understood. Just confirming the reliable & performant DB connection model has been in place for some time. Hopefully it didn't come across as anything other.
akil
Posted: Thu May 14, 2015 9:12 pm

Partisan

Joined: 27 May 2014
Posts: 338
Location: Mumbai

I think this behaviour is only on *NIX; dead connection detection is supposed to be implemented by the ODBC drivers (not the application/framework).

On Windows, this should work as expected, with dead connections being detected by the driver rather than being thrown up to the application.
_________________
Regards
fjb_saper
Posted: Fri May 15, 2015 4:46 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

akil wrote:
I think this behaviour is only on *NIX; dead connection detection is supposed to be implemented by the ODBC drivers (not the application/framework).

On Windows, this should work as expected, with dead connections being detected by the driver rather than being thrown up to the application.


Depending on your driver you may have to throw it back to the application. If the driver is not throwing it back to the application, it means that it can restore the UOW context when re-establishing a new connection after detecting the broken one...
_________________
MQ & Broker admin
akil
Posted: Fri May 15, 2015 9:55 am

Partisan

Joined: 27 May 2014
Posts: 338
Location: Mumbai

Correct. Windows fixed it with version 3; see questions 12/13 here: https://support.microsoft.com/en-us/kb/169470

On *NIX, I see similar information on the DataDirect site, but haven't tried it. The default ODBC ini file that is shipped does not seem to have the retry properties set. Could that be a reason?
_________________
Regards