MQSeries.net :: View topic - IIB's database connection management philosophy

McueMart · Posted: Thu May 14, 2015 4:14 am

Hi all,

This post is an account of my experience, and is an opportunity for the knowledgeable people of this forum (hopefully including the devs...) to give their take on it.

When broker is managing ODBC/JDBC connections, the following set of steps can occur:

1) A flow processes message 1 and successfully connects to ODBC DSN 'TEST'
2) The 'TEST' database is now restarted
3) The flow processes message 2, and this time the flow throws an exception along the lines of: "[unixODBC]Communication link failure , [unixODBC]Disconnect error.".
4) The flow processes message 3, and this time the broker will successfully reconnect to the datasource and function normally.

I have confirmed via PMR that this is the expected behavior. What I'm having a hard time understanding is why it has been implemented like this, and what we can do about it.

Ideally I would like it to work like this:

1) A flow processes message 1 and successfully connects to ODBC DSN 'TEST'
2) The 'TEST' database is now restarted
3) A flow processes message 2. The broker detects that the ODBC connection is broken and attempts to reconnect it. The reconnected connection is then used to process the work - which succeeds.
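To make the behavior I'm asking for concrete, here is a minimal sketch in plain Java. It is not broker code and says nothing about how IIB is implemented internally. `DbWork`, `runWithOneRetry` and the `reconnect` callback are names I made up for illustration. The one assumption is that a broken connection shows up as an `SQLException` whose SQLSTATE is in class 08 (connection exception); 08S01 is the ODBC "Communication link failure" state.

```java
import java.sql.SQLException;

final class RetryOnceSketch {

    /** Hypothetical stand-in for one unit of database work done by a flow. */
    interface DbWork<T> {
        T run() throws SQLException;
    }

    /** Assumption: SQLSTATE class 08 means the connection itself is broken. */
    static boolean isConnectionFailure(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("08");
    }

    /**
     * Runs the work once. If it fails because the connection is broken,
     * reconnects and runs it exactly one more time. Any other failure,
     * or a failure on the second attempt, is thrown to the caller.
     */
    static <T> T runWithOneRetry(DbWork<T> work, Runnable reconnect) throws SQLException {
        try {
            return work.run();
        } catch (SQLException first) {
            if (!isConnectionFailure(first)) {
                throw first; // a genuine SQL error, not a stale connection
            }
            reconnect.run(); // hypothetical: discard the stale handle and open a new one
            return work.run(); // still failing here means the DB really is down
        }
    }
}
```

With this shape the caller only sees an exception when the second attempt also fails. Note that a blind retry is only safe if the first attempt left no uncommitted work behind on the broken connection.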

So hopefully it's clear from this example that I don't understand why broker has to function like:

"Ohhh yea, I see this connection is broken - let me throw an exception to let you know this! After that I'll mark the connection as broken and clear this connection out of my pool so that future messages will attempt to reconnect".

From a user's perspective, I don't want broker throwing an exception when the connection is broken UNLESS it really cannot re-connect (i.e. the DB is down).

My thoughts on how we can work around this kind of error:

1) Implement logic down the 'catch' leg of a flow which detects this particular type of exception. *Non-transactionally*, send the original input message back to the input queue. The flow will then have to roll back (we can't commit any other work which might have been done via another DataSource).

2) Put procedures in place so that whenever a database is restarted, any execution groups which had connections to it are also restarted. This wouldn't fix the issue of the odd network blip causing a DB connection to be broken though.

3) Raise an RFE to ask IBM to revisit their implementation, or at least provide a 'Clear connection pool' facility, so we can immediately clear the 'broken' connection pool after we know a database has been restarted. IBM has stated that this behavior has been in place for a long time, so I fear for my chances of success if I go this route...
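For option 1, the decision the catch leg has to make could look roughly like the sketch below. This is plain Java rather than anything broker-specific: `CatchLegDecision`, `Action` and `decide` are hypothetical names, and how you get the SQLSTATE out of the exception list is left out. It assumes SQLSTATE class 08 identifies a lost connection (08S01 being the ODBC "Communication link failure" state). It also assumes the flow carries its own attempt counter with the message, since a message put back as a new message would not keep the original's backout count, and without a cap a database that is genuinely down would make the message loop forever.

```java
final class CatchLegDecision {

    /** What the catch leg should do with the original input message. */
    enum Action { REQUEUE, FAIL }

    /**
     * Requeue only when the failure looks like a lost connection and the
     * message has not already used up its retry budget.
     *
     * @param sqlState      SQLSTATE taken from the database exception, may be null
     * @param attemptsSoFar how many times this message has already been requeued
     * @param maxAttempts   hypothetical cap on requeues for one message
     */
    static Action decide(String sqlState, int attemptsSoFar, int maxAttempts) {
        boolean connectionLost = sqlState != null && sqlState.startsWith("08");
        if (connectionLost && attemptsSoFar < maxAttempts) {
            return Action.REQUEUE; // put the input back non-transactionally, then roll back
        }
        return Action.FAIL; // treat as a normal error and let the usual handling take over
    }
}
```

A cap of 1 matches the scenario above, where the very next attempt reconnects successfully.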

I couldn't find any previous discussion of this (there is lots of discussion about broker's (lack of) connection pooling, but nothing on this precise issue), so any input is appreciated.