Large File Handling
kirank
Posted: Sun Feb 24, 2013 4:54 pm    Post subject: Large File Handling

Centurion

Joined: 10 Oct 2002
Posts: 136
Location: California

Hi,

We have a message flow that reads a file, adds some lookup values to the file data, and then writes a large file of more than 5 MB. To help performance, we read record by record, write each record to the file, and finish the file after the last record. But we are seeing write performance degrade as the file grows: it takes 4 hours for a 5 MB file. That is too slow for the business requirements. Is this a bug, or are we not setting something properly? We are on Message Broker V7.0.0.0.

Regards

Kiran
fjb_saper
Posted: Sun Feb 24, 2013 6:10 pm    Post subject: Re: Large File Handling

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

kirank wrote:
Hi,

We have a message flow that reads a file, adds some lookup values to the file data, and then writes a large file of more than 5 MB. To help performance, we read record by record, write each record to the file, and finish the file after the last record. But we are seeing write performance degrade as the file grows: it takes 4 hours for a 5 MB file. That is too slow for the business requirements. Is this a bug, or are we not setting something properly? We are on Message Broker V7.0.0.0.

Regards

Kiran

v7.0.0.0 is a little bit dated now. You will want to first upgrade to 7.0.0.5
_________________
MQ & Broker admin
kimbert
Posted: Mon Feb 25, 2013 6:04 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
To help performance, we read record by record, write each record to the file, and finish the file after the last record.
Good.
Quote:
But we are seeing write performance degrade as the file grows: it takes 4 hours for a 5 MB file. That is too slow for the business requirements. Is this a bug, or are we not setting something properly?
How do you know that it is the FileOutput node that is taking the time? Could it be some other part of the flow?
kirank
Posted: Mon Feb 25, 2013 9:14 am

Centurion

Joined: 10 Oct 2002
Posts: 136
Location: California

We are writing a trace file in which we capture the timestamp for each record. We can see that it writes the first 10,000 records in 5 minutes; however, the next 10,000 records take 18 minutes, the subsequent 10,000 records take 34 minutes, and so forth. In total, about 50,000 records take 3 hours and 40 minutes.

We will apply patches or upgrade to V8 at some point, but that is not an option right now. Is there any other way to improve the performance?

Regards

Kiran
kirank
Posted: Mon Feb 25, 2013 4:34 pm

Centurion

Joined: 10 Oct 2002
Posts: 136
Location: California

After some additional digging I found that it is not really the file node, as kimbert pointed out. It is a Compute node that is taking most of the time; I enabled Accounting and Statistics to confirm this.
The Compute node has a loop that does lookups against a database table. There are 4 different SELECT statements for 4 different lookup values, so for 50,000 records these SELECT statements were executed 200,000 times. I thought we could move these SELECTs outside the loop, run each just once, and store the results in the Environment. Then, inside the loop, we read from the Environment tree by doing a SELECT against it. However, that approach takes even longer.

Is there any other better approach?

Regards

Kiran
Vitor
Posted: Mon Feb 25, 2013 5:36 pm

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

kirank wrote:
I thought we could move these SELECTs outside the loop, run each just once, and store the results in the Environment. Then, inside the loop, we read from the Environment tree by doing a SELECT against it. However, that approach takes even longer.

Is there any other better approach?


You're sure that for the 50,000 records the Environment tree remains set and you're not making the 4 SELECT calls for each of the 50,000 records and adding them to the Environment tree where previously you were just making 4 SELECT calls for each of the 50,000 records?

You might want to consider a shared variable.
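
A minimal sketch of the shared-variable idea, assuming a single lookup table. The names `LookupCache`, `AddLookups_Compute`, `LOOKUP_TABLE`, `CODE` and `DESCRIPTION` are hypothetical and not taken from the original flow. The table is read once and kept in a SHARED ROW, so later records reuse it instead of going back to the database:

```esql
-- Declared at schema or module level so it outlives a single message.
-- All table, column and module names here are assumptions.
DECLARE LookupCache SHARED ROW;

CREATE COMPUTE MODULE AddLookups_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- ATOMIC keeps two flow instances from loading the cache at once.
    LoadCache : BEGIN ATOMIC
      IF LookupCache.Loaded IS NULL THEN
        SET LookupCache.Row[] =
          (SELECT T.CODE, T.DESCRIPTION FROM Database.LOOKUP_TABLE AS T);
        SET LookupCache.Loaded = TRUE;
      END IF;
    END LoadCache;

    -- ... process the current record using LookupCache.Row[] ...
    RETURN TRUE;
  END;
END MODULE;
```

The cached rows stay in memory until the flow is redeployed or the execution group restarts, so a table that changes often would need some refresh logic on top of this.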
_________________
Honesty is the best policy.
Insanity is the best defence.
mqjeff
Posted: Tue Feb 26, 2013 4:53 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

You can also look at doing one select for four variables instead of four selects for one variable.

You should also engage your DBA to look at optimizing the selects on the database side.
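
To illustrate the first suggestion, here is a sketch that assumes all four lookup values can be read from one table row. `LOOKUP_TABLE`, `KEY_COL`, `VAL_A` to `VAL_D`, `recRef` and `outRef` are made-up names; the real flow may need a join instead. One SELECT returns four columns in a single round trip:

```esql
-- recRef and outRef are assumed to be references to the current input
-- record and the current output record, declared elsewhere.
DECLARE lookupRow ROW;

-- THE(...) returns the first matching row rather than a list.
SET lookupRow.Result = THE(SELECT T.VAL_A, T.VAL_B, T.VAL_C, T.VAL_D
                           FROM Database.LOOKUP_TABLE AS T
                           WHERE T.KEY_COL = recRef.Key);

SET outRef.ValA = lookupRow.Result.VAL_A;
SET outRef.ValB = lookupRow.Result.VAL_B;
SET outRef.ValC = lookupRow.Result.VAL_C;
SET outRef.ValD = lookupRow.Result.VAL_D;
```

For 50,000 records this would mean 50,000 database calls instead of 200,000, before any caching is considered.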
kimbert
Posted: Tue Feb 26, 2013 5:33 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I thought we could move these SELECTs outside the loop, run each just once, and store the results in the Environment. Then, inside the loop, we read from the Environment tree by doing a SELECT against it. However, that approach takes even longer.
Why would you use an ESQL SELECT to extract the data from the Environment tree? The SELECT logic should have been done by the database.
If you have designed the system optimally then the Environment tree will contain a set of data values that are organised perfectly for the message flow. Then the message flow can just iterate over the results ( using a REFERENCE variable ) and generate the output very fast.
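
A sketch of that kind of iteration, assuming the database results were stored once under a hypothetical `Environment.Variables.Lookup.Row[]` with columns `CODE` and `DESCRIPTION`, and that `outRef` already references the output parent folder. None of these names come from the original flow:

```esql
-- Point a reference at the first cached row.
DECLARE rowRef REFERENCE TO Environment.Variables.Lookup.Row[1];
-- newRec is repositioned onto each output record as it is created.
DECLARE newRec REFERENCE TO outRef;

WHILE LASTMOVE(rowRef) DO
  CREATE LASTCHILD OF outRef AS newRec NAME 'Record';
  SET newRec.Code        = rowRef.CODE;
  SET newRec.Description = rowRef.DESCRIPTION;
  -- Step to the next row with the same name; LASTMOVE turns FALSE at the end.
  MOVE rowRef NEXTSIBLING REPEAT TYPE NAME;
END WHILE;
```

Each step moves directly from one row to the next, so no ESQL SELECT over the Environment tree is needed inside the loop.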
souciance
Posted: Tue Feb 26, 2013 7:01 am

Disciple

Joined: 29 Jun 2010
Posts: 169

Wouldn't it be better to write a stored procedure/view and call that instead of including the sql code inside your compute node? I have noticed performance gain when doing this.
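
For illustration, a sketch of calling a database stored procedure from ESQL. The procedure `MYSCHEMA.GET_LOOKUPS`, its parameters and `recRef` are hypothetical; the procedure itself would have to be written on the database side:

```esql
-- ESQL declaration that maps onto the (assumed) database procedure.
CREATE PROCEDURE getLookups(IN recKey CHARACTER,
                            OUT valA CHARACTER, OUT valB CHARACTER,
                            OUT valC CHARACTER, OUT valD CHARACTER)
LANGUAGE DATABASE
EXTERNAL NAME "MYSCHEMA.GET_LOOKUPS";

-- Inside the Compute node's loop; recRef references the current record.
DECLARE valA, valB, valC, valD CHARACTER;
CALL getLookups(recRef.Key, valA, valB, valC, valD);
```

This still makes one database call per record, so it would more likely help alongside the other suggestions in this thread than replace them.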
mayheminMQ
Posted: Tue Feb 26, 2013 9:00 am

Voyager

Joined: 04 Sep 2012
Posts: 77
Location: UK beyond the meadows of RocknRoll

Shared ROW variables are better than the Environment, in my personal experience of handling large amounts of cached data.

Do the selects over the row variables; optimising the select query itself might also save you precious time.

Did you try reading the whole file into memory and running through it? Try a POC and check the timings against your current flow (assuming you have enough memory set in your EG to handle this).
_________________
A Colorblind man may appear disadvantaged but he always sees more than just colors...
longng
Posted: Tue Feb 26, 2013 10:10 am

Apprentice

Joined: 22 Feb 2013
Posts: 42

kirank wrote:
After some additional digging I found that it is not really the file node, as kimbert pointed out. It is a Compute node that is taking most of the time; I enabled Accounting and Statistics to confirm this.
The Compute node has a loop that does lookups against a database table. There are 4 different SELECT statements for 4 different lookup values, so for 50,000 records these SELECT statements were executed 200,000 times. I thought we could move these SELECTs outside the loop, run each just once, and store the results in the Environment. Then, inside the loop, we read from the Environment tree by doing a SELECT against it. However, that approach takes even longer.

Is there any other better approach?

Regards

Kiran


A common cause of these symptoms is excessive message tree traversal inside the loop(s). It would be a good idea to replace full field paths inside loops with reference variables.

Consider the following scenario:

Code:

LOOP for 5000 times
SET InputRoot.MRM.layer1.layer2.layer3.layer4.Name = 'Jon'
SET InputRoot.MRM.layer1.layer2.layer3.layer4.Address = 'Oak street'
SET InputRoot.MRM.layer1.layer2.layer3.layer4.Phone = '123-456-7890'
...


The above ESQL fragment forces multiple tree traversals (InputRoot, MRM, layer1, layer2, layer3 and layer4) before arriving at each field, on every loop iteration.

Code:

DECLARE layer4Ref REFERENCE TO  InputRoot.MRM.layer1.layer2.layer3.layer4;

LOOP for 5000 times
SET layer4Ref.Name = 'Jon'
SET layer4Ref.Address = 'Oak street'
SET layer4Ref.Phone = '123-456-7890'



With a reference variable, the starting point is always layer4, so there is no need to traverse from the root of the tree every time, and it performs much better.

As a matter of fact, I used this technique and was able to reduce the runtime of a flow from over two days down to 45 minutes!
ah.khalafallah
Posted: Sun Mar 03, 2013 3:58 am

Newbie

Joined: 03 Mar 2013
Posts: 5

I think you need to describe how the flow is implemented, and whether you are splitting the logic across two flows over MQ or it is a single flow.

Also consider longng's reference solution, but you need to create the field first, before declaring a reference to it.

P.S. longng, you can't "SET InputRoot".

But anyway, the example could be changed to the following:

Quote:

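-- Full path on every statement: the tree is walked from OutputRoot each time.
DECLARE i INTEGER 1;
WHILE i <= 5000 DO
  SET OutputRoot.MRM.layer1.layer2.layer3.layer4.Name = 'Jon';
  SET OutputRoot.MRM.layer1.layer2.layer3.layer4.Address = 'Oak street';
  SET OutputRoot.MRM.layer1.layer2.layer3.layer4.Phone = '123-456-7890';
  -- ...
  SET i = i + 1;
END WHILE;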


The above ESQL fragment forces multiple tree traversals (OutputRoot, MRM, layer1, layer2, layer3 and layer4) before arriving at each field, on every loop iteration.

Quote:

-- The field must exist before a reference can point at it.
CREATE FIELD OutputRoot.MRM.layer1.layer2.layer3.layer4;
DECLARE layer4Ref REFERENCE TO OutputRoot.MRM.layer1.layer2.layer3.layer4;
DECLARE i INTEGER 1;

WHILE i <= 5000 DO
  SET layer4Ref.Name = 'Jon';
  SET layer4Ref.Address = 'Oak street';
  SET layer4Ref.Phone = '123-456-7890';
  SET i = i + 1;
END WHILE;

_________________
Middleware Developer
Certified Websphere Message Broker v.7.0
Certified Websphere Transformation Extender v.8.2
longng
Posted: Sun Mar 03, 2013 3:21 pm

Apprentice

Joined: 22 Feb 2013
Posts: 42

ah.khalafallah wrote:

P.S Longng, you can't "SET InputRoot"


You're spot on; while focusing on the technique, I used the wrong type of structure for the example!
ah.khalafallah
Posted: Sun Mar 03, 2013 9:40 pm

Newbie

Joined: 03 Mar 2013
Posts: 5

No problem, longng. You made a good point, and I have seen for myself the difference it makes.
So let's hope it makes a difference here too.