MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » 5MB single message (multiple MRM recs) performance problem

5MB single message (multiple MRM recs) performance problem
Marek
Posted: Mon Jul 10, 2006 3:14 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

I have a 5MB single message containing 60,000 records in MRM format which is being consumed from an input Q.

I need to loop through the message translating a code on each record from one format into another and then send the same single message onto an output Q (size unchanged).

I'm able to do this without issues for smaller sized messages (under 2MB). When I process the 5MB message the process either fails or runs for a very long time. I have tried 2 solutions:

    1. Copying InputRoot.MRM to OutputRoot.MRM, looping through the OutputRoot MRM and outputting the transformed message (result: takes 15 mins before the Execution Group abends on reaching our limit of 390MB).

    2. Looping through the InputRoot MRM, building a string and CASTing it as a BLOB message outside of the loop (result: takes approx. 3 hours to process, high CPU usage, moderate memory usage).


The code uses LAST MOVE to traverse the MRM records. There is no CARDINALITY statement.

Outputting multiple messages is not an option (although this does work 'nicely'), nor is changing to XML format and processing as a LAST MOVE with DELETE of last processed record technique.

Does anyone have any advice?

Thanks in advance.

jefflowrey
Posted: Mon Jul 10, 2006 2:26 pm

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Is all transformation in one compute node or many?
_________________
I am *not* the model of the modern major general.

vk
Posted: Mon Jul 10, 2006 5:07 pm

Partisan

Joined: 20 Sep 2005
Posts: 302
Location: Houston

You can split the initial message into smaller messages, transform the smaller messages separately and then merge the transformed messages before sending the result to the destination queue.

Try using Aggregate Request Reply for the split regroup. I have used this for handling large messages and there has been remarkable improvement in performance.

Regards,
VK.

kimbert
Posted: Tue Jul 11, 2006 12:08 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
nor is changing to XML format and processing as a LAST MOVE with DELETE of last processed record technique
I'm a bit puzzled by this statement. I assume you are referring to the pattern described in this doc http://www-128.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
...but this pattern works with any domain, not just XML.

zpat
Posted: Tue Jul 11, 2006 12:45 am

Jedi Council

Joined: 19 May 2001
Posts: 5866
Location: UK

Here's some advice I gave a developer with a similar problem - some of it may be relevant:

Quote:
1 - Structure the logic to reduce the function calls inside loops. For example, rather than nesting COALESCE and CAST, perform the COALESCE first, test the result for being blank and issue CAST only if non-blank.

What's important to keep in mind here is that compact coding (e.g. nested functions) does not equate to machine efficiency. Adding a conditional test (generally low overhead) that eliminates a large proportion of the calls to another function may well have a beneficial effect overall, even though it "looks" like more code. So try to issue "expensive" function calls only when needed, using IF THEN tests to see whether they can be bypassed.

2 - Minimise the number of times the large message tree is referenced, especially with subscripts.

Assigning values to another variable or structure and then performing operations on this will avoid the need to look up elements in large message trees. Locating an element in a large tree gets slower the further into the tree the element sits, so the number of times a statement referencing a large message tree is executed should be minimised.

So values can be assigned to other variables or structures (non subscripted). Bear in mind the internal formats are not XML or MRM specific but only converted to such by the parser when a message is input or output.

Temporary variables can be created for this purpose. There is no need to make them part of the output message tree since the values don't need to persist across nodes. But if they are part of the message tree, declare them at the start of the tree to reduce the traversing required to locate them. Avoiding as many references as possible to the large message tree should help to conserve CPU resources.

3 - Make use of * or [ ] to perform operations on all elements of an array without using a loop.

In some cases a loop can be replaced with a statement using .* or [ ] or similar ESQL syntax. This makes the loop "internal" to MQSI and should reduce the overhead associated with the individual iteration of statements. It seems that finding the position in the tree structure is what takes the time. Every time the need to perform this in ESQL is removed, a CPU saving should be seen.

I believe that this may be an exception to the "compact code is not more efficient" rule above.

4 - Use reference variables to index into arrays or trees (WMQI 2.1 and above).

5 - Use several smaller working arrays rather than a large one. Construct the final output at the end. If this reduces the number of times elements a long way down a tree (e.g. over 1000 deep) are referenced, it will improve performance. In fact I suspect that if the intermediate arrays are even smaller it will be even better; it should be possible to try various sizes to see which one gives the best outcome.
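
Points 2 and 4 can be sketched roughly as follows. This is an illustration only, not code from the original advice: the module name RefVsIndex_Compute and the literal 'XX' are hypothetical, and the field names STATEMENT_LINE and MOVEMENT_TYPE are borrowed from the code posted later in this thread.

```esql
CREATE COMPUTE MODULE RefVsIndex_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    SET OutputRoot = InputRoot;

    -- Indexed pattern, shown commented out for contrast: every [i] path
    -- is resolved again from the parent element on each iteration.
    -- DECLARE i INTEGER 1;
    -- DECLARE n INTEGER CARDINALITY(OutputRoot.MRM.STATEMENT_LINE[]);
    -- WHILE i <= n DO
    --   SET OutputRoot.MRM.STATEMENT_LINE[i].MOVEMENT_TYPE = 'XX';
    --   SET i = i + 1;
    -- END WHILE;

    -- Reference pattern: the reference variable keeps its position, so
    -- stepping to the next record does not search the tree again.
    DECLARE rec REFERENCE TO OutputRoot.MRM.STATEMENT_LINE[1];
    WHILE LASTMOVE(rec) DO
      SET rec.MOVEMENT_TYPE = 'XX';            -- placeholder update
      MOVE rec NEXTSIBLING REPEAT TYPE NAME;   -- next STATEMENT_LINE
    END WHILE;

    RETURN TRUE;
  END;
END MODULE;
```

How much this helps depends on the number of repeating elements, so it is worth timing both forms against a realistic message.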

jefflowrey
Posted: Tue Jul 11, 2006 2:47 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

The other thing to think about is, if you are not radically transforming the structure of the message - but only changing a single field or a small set of recognizable fields, it may be faster to create a simplified MRM structure that has large filler fields instead of individual fields and groups.

Or even just process the message as a BLOB and do POSITION, SUBSTRING stuff.

But really, the processing differences between 2MB and 5MB shouldn't be this huge.

Can we see some of your code?

zpat
Posted: Tue Jul 11, 2006 3:41 am

Jedi Council

Joined: 19 May 2001
Posts: 5866
Location: UK

It's the number of elements in the array that makes the difference. It seems that the ESQL internal logic steps through the array sequentially, so each indexed lookup gets slower as the index grows and a loop over the whole array scales far worse than linearly.

Using reference variables can make a big difference.

jefflowrey
Posted: Tue Jul 11, 2006 3:53 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Zpat -
I'd taken
Marek wrote:
The code uses LAST MOVE to traverse the MRM records. There is no CARDINALITY statement.


to mean that Marek is using References.

So in this case, the size of the array shouldn't make *this much* difference.

If not, yes - using references instead of indexes is always the first thing to change when tuning ESQL for performance.

Usually the second thing is to switch to manipulating data in Environment, rather than in OutputRoot. But this only really makes a huge difference if you have spread your transformation out over several nodes, because it saves on tree copies.

Marek
Posted: Tue Jul 11, 2006 7:51 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

Ref.

jefflowrey wrote:
Is all transformation in one compute node or many?


Yes, all the logic is in 1 compute node.

Thanks.

Marek
Posted: Tue Jul 11, 2006 7:56 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

Ref.

vk wrote:
You can split the initial message into smaller messages, transform the smaller messages separately and then merge the transformed messages before sending the result to the destination queue.

Try using Aggregate Request Reply for the split regroup. I have used this for handling large messages and there has been remarkable improvement in performance.

Regards,
VK.


Many thanks for this suggestion. Looks promising. I'm working on it at the moment although I'm needing to use the manual quite a bit so progress is a little slow. I note that I will still need to re-assemble the messages after the AggregateReply node ... so I hope that this will not be an issue.

Marek
Posted: Tue Jul 11, 2006 8:02 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

Ref.

kimbert wrote:
Quote:
nor is changing to XML format and processing as a LAST MOVE with DELETE of last processed record technique
I'm a bit puzzled by this statement. I assume you are referring to the pattern described in this doc http://www-128.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
...but this pattern works with any domain, not just XML.


Yes, I'm referring to the very same pattern. You are correct that this also works with MRMs and I altered my solution to take this into account although there is still no improvement. Thanks for pointing this out.

Marek
Posted: Tue Jul 11, 2006 8:28 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

Guys, sincere thanks for the various tips and suggestions. It's much appreciated.

As requested I've appended the code from the Compute Node (please note that the one-off access to Oracle has no impact as when all the relevant lines are commented out the issue still remains).


Code:
-- Enter SQL below this line.  SQL above this line might be regenerated, causing any modifications to be lost.
DECLARE   scsvChar, movementType, recordDelimiter CHAR;
DECLARE index INT;

SET recordDelimiter = trim(CAST(x'0D0A' as CHAR CCSID 819)); -- carriage return + line feed
SET index = 1;

-- Create XML Tree result set from Oracle DB
SET Environment.Variables.SQL.Translation[] =
      (SELECT T.TT_ID, T.FROM_CODE, T.TO_CODE
       FROM Database.SDS_TRANSLATION AS T
       WHERE T.TR_ID = 9
       AND   T.TT_ID IN(1,2));

CREATE LASTCHILD OF Environment.Variables DOMAIN 'MRM' NAME 'InputMessage';
SET Environment.Variables.InputMessage = InputBody;
DECLARE MyPointer REFERENCE TO Environment.Variables.InputMessage.STATEMENT_LINE[1];

WHILE LASTMOVE(MyPointer) = TRUE DO   
   -- Scan XML Tree
        SET movementType = THE (SELECT ITEM T.TO_CODE
                                FROM Environment.Variables.SQL.Translation[] AS T
                                WHERE T.FROM_CODE = MyPointer.MOVEMENT_TYPE
                                AND T.TT_ID = 2);
                               
   IF index = 1 THEN
      -- first record   
      set scsvChar =    movementType || MyPointer.UNIT_TYPE || MyPointer.LTBF_ID || MyPointer.SEC_BLOCK || MyPointer.PRODUCT_GROUP ||
            MyPointer.POLICY_NO || MyPointer.PROCESSED_DATE || MyPointer.SOURCE_SYSTEM || MyPointer.NO_OF_UNITS ||
            MyPointer.AMOUNT;
   ELSE      
      -- all other records
      set scsvChar =    scsvChar || recordDelimiter || movementType || MyPointer.UNIT_TYPE || MyPointer.LTBF_ID || MyPointer.SEC_BLOCK || MyPointer.PRODUCT_GROUP ||
            MyPointer.POLICY_NO || MyPointer.PROCESSED_DATE || MyPointer.SOURCE_SYSTEM || MyPointer.NO_OF_UNITS ||
            MyPointer.AMOUNT;
   END IF;
   SET index = index + 1;
       
        DECLARE LastProcessedRow REFERENCE TO MyPointer;
        MOVE MyPointer TO Environment.Variables.InputMessage.STATEMENT_LINE[2];
        DELETE FIELD LastProcessedRow;
END WHILE;

SET OutputRoot.BLOB.BLOB = CAST(scsvChar AS BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.MQMD.Encoding);

kimbert
Posted: Tue Jul 11, 2006 8:44 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I need to loop through the message translating a code on each record from one format into another and then send the same single message onto an output Q (size unchanged).

Are there any sequences of fixed-length fields in your message definition? If so, maybe these could be concatenated into a single fixed-length string which will then parse a little faster. It's a bit of a hack, but if you really want the speed, it might gain you something. If your message consisted entirely of fixed-length strings, your message definition would reduce to three fields (prefix, the one you're interested in, and trailer).

jefflowrey
Posted: Tue Jul 11, 2006 2:28 pm

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Can you adjust your Environment.Variables.SQL.Translation[] table so that you don't have to do the selects inside the loop?

That is, can you create a real index table so you can say something like
Code:
SET movementType = Environment.Variables.SQL.NewTranslation[i];

Or even use a reference pointer into that table.
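
One possible shape for such a table, as a hedged sketch rather than a tested fix: build a lookup once before the record loop, with one child element named after each FROM_CODE and holding the matching TO_CODE. The folder name Lookup is hypothetical, and the sketch assumes FROM_CODE and MOVEMENT_TYPE are both CHARACTER values that compare equal without trimming or casting. The other names come from the code posted above.

```esql
-- Build once, before the WHILE loop over the records.
DECLARE t REFERENCE TO Environment.Variables.SQL.Translation[1];
WHILE LASTMOVE(t) DO
  IF t.TT_ID = 2 THEN
    -- {expr} uses the value of the expression as the field name
    SET Environment.Variables.Lookup.{t.FROM_CODE} = t.TO_CODE;
  END IF;
  MOVE t NEXTSIBLING REPEAT TYPE NAME;
END WHILE;

-- Inside the record loop, in place of THE (SELECT ITEM ...):
-- an unmatched code yields NULL, as the original SELECT would.
SET movementType = Environment.Variables.Lookup.{MyPointer.MOVEMENT_TYPE};
```

A lookup by name is still a search among the children of Lookup, so this only pays off when the number of distinct codes is small compared with the number of records.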

Marek
Posted: Wed Jul 12, 2006 6:54 am

Apprentice

Joined: 30 Jun 2004
Posts: 32
Location: Edinburgh

Ref.

kimbert wrote:
Quote:
I need to loop through the message translating a code on each record from one format into another and then send the same single message onto an output Q (size unchanged).

Are there any sequences of fixed-length fields in your message definition? If so, maybe these could be concatenated into a single fixed-length string which will then parse a little faster. It's a bit of a hack, but if you really want the speed, it might gain you something. If your message consisted entirely of fixed-length strings, your message definition would reduce to three fields (prefix, the one you're interested in, and trailer).


Thanks for your suggestion but it makes no difference on this occasion. Perhaps that is because I'm always working with the first array record (due to using the LAST MOVE/DELETE technique).