Author |
Message
|
nathanw |
Posted: Thu Oct 18, 2012 5:56 am Post subject: WMB V8 Processing More Efficiently causing Data issues |
|
|
 Knight
Joined: 14 Jul 2004 Posts: 550
|
Ok cokey my mucky mucks
AIX 6.1
MQ 7.0.1.8
MB 8.0.0.0
In previous versions of Broker we have a flow that references cached data. This is part of an overall interface.
So we upgrade to V8
The interface now fails a message at random. Looking into the ESQL, we create a reference to an Environment variable that holds a cached value.
So we process data multiple times with no errors, and then we fail because the reference has nothing to refer to, so the XML cannot be built properly and no message is produced.
The interface also updates the cached data so at times it will update and refresh the cache.
It looks like every so often the interface causes its own failure by carrying out an update and refresh while a read is in progress, or just after the update.
My questions are
a) has anyone else ever seen something like this
b) what was the remedy
I am fairly certain what the responses will be
As always any thoughts gratefully received. _________________ Who is General Failure and why is he reading my hard drive?
Artificial Intelligence stands no chance against Natural Stupidity.
Only the User Trace Speaks The Truth  |
|
|
 |
lancelotlinc |
Posted: Thu Oct 18, 2012 5:58 am Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
Move your cache to the new V8 cache, a Singleton, or solidDB.
If the new V8 cache is the culprit, open a PMR. _________________ http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |
|
|
 |
mqjeff |
Posted: Thu Oct 18, 2012 6:10 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You need to be at FixPack 1 of v8 to use the new cache.
Regardless, it sounds like you didn't wrap your cache accesses in BEGIN ATOMIC blocks. |
|
|
 |
Esa |
Posted: Thu Oct 18, 2012 6:13 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
lancelotlinc wrote: |
Move your cache to the new V8 cache, a Singleton, or solidDB.
If the new V8 cache is the culprit, open a PMR. |
I think the inbuilt cache comes in version 8.0.0.1?
I'm sure the cache is already using a Singleton but does not lock the object properly. Other threads should not be allowed to access the cache while it is being updated. Depending on the language used to implement the cache, you must use either ESQL ATOMIC blocks or Java synchronized blocks...
The V8 cache should take care of this for you. Has anybody tested it yet?
Or disable additional instances of the flow. |
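A minimal Java sketch of the locking described above, assuming the cache is a homegrown Java singleton (the thread does not say how the cache is actually implemented). The class name `LookupCache` and its methods are hypothetical and are not part of any Broker API. Readers and the refresh share one lock, and the refresh swaps in a fully built map, so a reader never sees a half-updated cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: names are assumptions, not Broker APIs.
final class LookupCache {
    private static final LookupCache INSTANCE = new LookupCache();

    private final Object lock = new Object();
    private Map<String, String> entries = new HashMap<>();

    private LookupCache() {
    }

    static LookupCache getInstance() {
        return INSTANCE;
    }

    // Readers take the same lock as the refresh, so a lookup can never
    // run while the cache contents are being replaced.
    String get(String key) {
        synchronized (lock) {
            return entries.get(key);
        }
    }

    // Build the replacement outside the lock so readers are blocked only
    // for the reference swap, not for the whole reload.
    void refresh(Map<String, String> freshData) {
        Map<String, String> replacement = new HashMap<>(freshData);
        synchronized (lock) {
            entries = replacement;
        }
    }
}
```

The ESQL counterpart would be to put both the read and the refresh of the shared variable inside ATOMIC blocks, as mentioned earlier in the thread.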
|
|
 |
NealM |
Posted: Thu Oct 18, 2012 6:21 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Interesting. We are preparing to move from v6.1 to v8.0.0.1 and use cache heavily; in fact, my task this week is to work out moving from the SupportPac to the new global cache. This is all the more reason to do so.
But let me ask you, on your current flow (is it just one flow? Is it just one EG? Is it just one Broker? We have hundreds, across EGs and across Brokers), does your cache config node have Period before recache = -1, and does your cache put node (or your API) have Life of data = -1? I'm wondering if there is maybe a refresh issue on/with MB8.
Incidentally, one other reason to move to the new global cache is that broker development is dedicated to supporting it. The SupportPac supporter is also very helpful (witness how quickly he got a version updated for v8 out there; the new global cache is new with FP 1), but he can only look into issues when he has free time from his main job. |
|
|
 |
mqjeff |
Posted: Thu Oct 18, 2012 6:25 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
It is interesting to see at least two posters jump to conclusions about how the cache is implemented, when no information on that is given by the OP.
Again, the only obvious explanation for the errors mentioned is a failure to synchronize the access to the cache, making no assumptions about how the cache is implemented. |
|
|
 |
Esa |
Posted: Thu Oct 18, 2012 6:53 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
mqjeff wrote: |
Again, the only obvious explanation for the errors mentioned is a failure to synchronize the access to the cache, making no assumptions about how the cache is implemented. |
mqjeff, are you sure you are not jumping to conclusions? |
|
|
 |
NealM |
Posted: Thu Oct 18, 2012 6:54 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Quote: |
...jump to conclusion |
Not jumping to conclusions, trying to understand what Nathan's parameters are to compare to ours. |
|
|
 |
Esa |
Posted: Thu Oct 18, 2012 7:00 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
NealM wrote: |
Quote: |
...jump to conclusion |
Not jumping to conclusions, trying to understand what Nathan's parameters are to compare to ours. |
Yes you are, NealM, when you assume that the OP is using the support pack and not a homegrown cache implementation... |
|
|
 |
NealM |
Posted: Thu Oct 18, 2012 7:57 am Post subject: |
|
|
 Master
Joined: 22 Feb 2011 Posts: 230 Location: NC or Utah (depends)
|
Quote: |
Not jumping to conclusions, trying to understand what Nathan's parameters are to compare to ours. |
Quote: |
Yes you are, NealM, when you assume that the OP is using the support pack and not a homegrown cache implementation... |
Well, that would certainly be a parameter.....
But, one has to wonder why, if it was a homegrown solution, nathan would be asking...
Quote: |
a) has anyone else ever seen something like this |
I guess we just need to sit back and await clarification if it is forthcoming. |
|
|
 |
smdavies99 |
Posted: Thu Oct 18, 2012 9:58 pm Post subject: |
|
|
 Jedi Council
Joined: 10 Feb 2003 Posts: 6076 Location: Somewhere over the Rainbow this side of Never-never land.
|
If Nathan is using the code (or something based upon it) that I think he is...
- it is wrapped in an Atomic structure
I have occasionally seen a problem on 7.0.0.3/7.0.0.4 (on Windows) where the data held in a shared row is suddenly gone. Forcing a reload solves the problem. Yes, I could open a PMR, but
- it happens 1 message in say 500,000 or more
- The fix to the problem is simple
Code: |
-- -100 is a sentinel: COALESCE returns it when the cached field is NULL (missing)
IF COALESCE(<path to copied variables>.<variable name>, -100) = -100 THEN
  CALL Reload_Cache_Data(); -- forced reload of data from DB
END IF;
|
This code has been working ever since the introduction of shared ESQL variables in V6.0.0.x (since 2006) _________________ WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions. |
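For a cache written in Java rather than ESQL, the same reload-on-miss guard could be sketched as below. This is an illustration under assumptions, not the code from the post: `ReloadingCache` and its methods are hypothetical names, the `Supplier` stands in for `Reload_Cache_Data()`, and a `null` lookup plays the role of the `-100` sentinel. The `clear()` and `reloadCount()` methods exist only to demonstrate the behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only: names are assumptions, not Broker APIs.
final class ReloadingCache {
    // Stands in for the database reload in the ESQL version.
    private final Supplier<Map<String, Integer>> loader;
    private Map<String, Integer> entries = new HashMap<>();
    private int reloadCount = 0;

    ReloadingCache(Supplier<Map<String, Integer>> loader) {
        this.loader = loader;
    }

    // The check and the reload happen under one lock, so two threads
    // cannot both see the miss and reload at the same time.
    synchronized Integer get(String key) {
        Integer value = entries.get(key);
        if (value == null) {
            entries = new HashMap<>(loader.get()); // forced reload
            reloadCount++;
            value = entries.get(key);
        }
        return value;
    }

    // Simulates the cached data vanishing.
    synchronized void clear() {
        entries = new HashMap<>();
    }

    synchronized int reloadCount() {
        return reloadCount;
    }
}
```

Note that a key the loader never returns would trigger a reload on every lookup, so this guard suits caches where every requested key is expected to exist.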
|
|
 |
nathanw |
Posted: Thu Oct 18, 2012 11:43 pm Post subject: |
|
|
 Knight
Joined: 14 Jul 2004 Posts: 550
|
As an update: I did not write this solution; I am only trying to resolve an issue.
The process is working, but there is a random failure from time to time, and I have been asked to find the cause and the solution.
It will work fine 5 times out of 5 and then fail on the 6th, work fine for 2 and then fail for 3, and so on. It does seem to be random.
I have asked for more information on the incoming data to prove it is not a data issue, as I am leaning down that path, although I may be wrong. |
|
|
 |
Esa |
Posted: Fri Oct 19, 2012 12:20 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
nathanw wrote: |
It will work fine 5 times out of 5 and then fail on the 6th, work fine for 2 and then fail for 3, and so on. It does seem to be random.
|
nathan, are you using the support pack or is the cache implemented by yourself?
This behavior could be caused by a failure to initialize the cache, or to detect an uninitialized cache, after an EG crash (which could be caused by another flow in the same EG). Do you see anything in the system logs?
Excuse me for jumping to these conclusions |
|
|
 |
nathanw |
Posted: Fri Oct 19, 2012 1:31 am Post subject: |
|
|
 Knight
Joined: 14 Jul 2004 Posts: 550
|
This is a homegrown implementation as far as I can see. |
|
|
 |
nathanw |
Posted: Fri Oct 19, 2012 1:46 am Post subject: |
|
|
 Knight
Joined: 14 Jul 2004 Posts: 550
|
@stephen
yes it is in Atomic blocks |
|
|
 |
|