RAJESHRAMAKRISHNAN (Voyager; Joined: 01 May 2004; Posts: 96)
Posted: Tue Oct 04, 2005 11:54 pm
Post subject: Processing a large repeating XML Structure
I have an XML document with a very large number of repeating elements. I want to select only a few elements from those repeating elements and concatenate them into a comma-delimited string. If I use CARDINALITY and a WHILE loop, it takes more than 1 hour to process an 8 MB file.
I guess I have to use SELECT statements instead of the WHILE loop to overcome this performance problem, but I don't know how to implement this. Any help is greatly appreciated.
For example:
<Root>
<A>
<A1/>
<A2/>
</A>
<B>
<B1/>
<B2/>
</B>
</Root>
Here, assume that A and B can each repeat 1000 times. I want to pick only A1 and B1. This is just an example.

elvis_gn (Padawan; Joined: 08 Oct 2004; Posts: 1905; Location: Dubai)
Posted: Wed Oct 05, 2005 12:46 am
Hi Rajesh,
You can use SELECT: make a reference to the first A and the first B, then do a SELECT A1 FROM A[] ...
and similarly a SELECT B1 FROM B[] ...
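For readers who want to see the rough shape of such a SELECT, here is a minimal sketch. It assumes the message is parsed in the XML domain, arrives through an MQ input node, and has the Root/A/A1 and Root/B/B1 layout from the first post. The module name and the Result, A1 and B1 output fields are made up for illustration. SELECT ITEM yields a list of values rather than a single string, so joining them with commas would still need a loop over the result.

```esql
CREATE COMPUTE MODULE SelectA1B1_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Header copy assumes an MQ input node.
    SET OutputRoot.Properties = InputRoot.Properties;
    SET OutputRoot.MQMD = InputRoot.MQMD;

    -- One Result.A1 element is created per A in the input,
    -- holding that A's A1 value.
    SET OutputRoot.XML.Result.A1[] =
      (SELECT ITEM T.A1 FROM InputRoot.XML.Root.A[] AS T);

    -- Same again for B1.
    SET OutputRoot.XML.Result.B1[] =
      (SELECT ITEM T.B1 FROM InputRoot.XML.Root.B[] AS T);

    RETURN TRUE;
  END;
END MODULE;
```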
I would also suggest using a Mapping node: first create a message set with, say, two variables, "StoresAs" and "StoresBs".
After that, map the input source A1[All] to the string StoresAs, with a comma as the delimiter,
and similarly map B1[All] to StoresBs.
I think this should also work.

lillo (Master; Joined: 11 Sep 2001; Posts: 224)
Posted: Wed Oct 05, 2005 12:50 am
You could also try using REFERENCEs instead of CARDINALITY. There is a performance measure about this in the redbook called "Developing solutions in WebSphere MQ Integrator".
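As an illustration of the REFERENCE approach (a sketch, not taken from the redbook): a loop that calls CARDINALITY and then reads A[i] by index walks the siblings from the start on every access, whereas a reference variable only steps forward one element at a time. The sketch assumes the XML domain, an MQ input node, and the Root/A/A1 and Root/B/B1 layout from the first post. The module name and the Result, A1List and B1List output fields are hypothetical.

```esql
CREATE COMPUTE MODULE PickA1B1_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Header copy assumes an MQ input node.
    SET OutputRoot.Properties = InputRoot.Properties;
    SET OutputRoot.MQMD = InputRoot.MQMD;

    DECLARE csvA CHARACTER '';
    DECLARE csvB CHARACTER '';

    -- Point a reference at the first A. If there is no A,
    -- LASTMOVE is FALSE and the loop body never runs.
    DECLARE refA REFERENCE TO InputRoot.XML.Root.A[1];
    WHILE LASTMOVE(refA) DO
      -- COALESCE keeps a missing A1 from turning the whole string into NULL.
      SET csvA = csvA || COALESCE(refA.A1, '') || ',';
      -- Step to the next sibling with the same name, that is, the next A.
      MOVE refA NEXTSIBLING REPEAT TYPE NAME;
    END WHILE;

    -- Same pattern for the B elements.
    DECLARE refB REFERENCE TO InputRoot.XML.Root.B[1];
    WHILE LASTMOVE(refB) DO
      SET csvB = csvB || COALESCE(refB.B1, '') || ',';
      MOVE refB NEXTSIBLING REPEAT TYPE NAME;
    END WHILE;

    -- Drop the trailing comma and write the two strings out.
    SET OutputRoot.XML.Result.A1List = TRIM(TRAILING ',' FROM csvA);
    SET OutputRoot.XML.Result.B1List = TRIM(TRAILING ',' FROM csvB);

    RETURN TRUE;
  END;
END MODULE;
```

This still parses the whole message and leaves the full tree in memory. It only removes the repeated counting and indexed lookups.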
Best regards,
_________________
Lillo
IBM Certified Specialist - WebSphere MQ

kimbert (Jedi Council; Joined: 29 Jul 2003; Posts: 5542; Location: Southampton)
Posted: Wed Oct 05, 2005 12:55 am
Quote:
If I use Cardinality and a while loop it takes more than 1 hour to process an 8 MB file.
That's not too surprising. Using CARDINALITY on large messages is a bad idea. It forces the entire message to be parsed so that the cardinality can be calculated. In your case, that uses a lot of syntax elements, a lot of processing, and a lot of memory.
There is a standard way of dealing with large XML messages, and it is published on DeveloperWorks: http://www-128.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
The general idea is that you parse one repeat at a time, generate the output, then delete all the intermediate nodes from the tree and process the next repeat.
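The loop at the heart of that approach might look something like the sketch below. It is not copied from the article. It assumes the caller has already copied the message body into a modifiable tree (InputRoot itself is read-only) and passes in a reference to the first child of Root in that copy. The procedure name and its parameters are hypothetical, and the element names come from the example in the first post.

```esql
-- Hypothetical helper. 'firstChild' must point at the first child of Root
-- in a modifiable copy of the message, not at InputRoot.
CREATE PROCEDURE CollectAndPrune(IN firstChild REFERENCE,
                                 INOUT csvA CHARACTER,
                                 INOUT csvB CHARACTER)
BEGIN
  DECLARE cur REFERENCE TO firstChild;
  WHILE LASTMOVE(cur) DO
    -- Take what is needed from the current repeat. Siblings with any
    -- other name (for example whitespace between elements) are ignored.
    IF FIELDNAME(cur) = 'A' THEN
      SET csvA = csvA || COALESCE(cur.A1, '') || ',';
    ELSEIF FIELDNAME(cur) = 'B' THEN
      SET csvB = csvB || COALESCE(cur.B1, '') || ',';
    END IF;

    -- Step forward, then discard the repeat just processed so that its
    -- syntax elements do not stay in the tree.
    MOVE cur NEXTSIBLING;
    IF LASTMOVE(cur) THEN
      DELETE PREVIOUSSIBLING OF cur;
    END IF;
  END WHILE;
END;
```

The caller would declare csvA and csvB as empty CHARACTER variables, CALL the procedure, and trim the trailing commas afterwards.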
I presume that you are building up your output string using ESQL. You might want to consider defining a message set and using the MRM TDS physical format to do the transformation from XML to string.

elvis_gn (Padawan; Joined: 08 Oct 2004; Posts: 1905; Location: Dubai)
Posted: Wed Oct 05, 2005 1:12 am
Hi kimbert,
You are definitely correct about the parsing and CARDINALITY.
But if the messages have 1000 repeating segments, don't you think it will still take a lot of time, even if we use REFERENCEs only for the segments that we require?

kimbert (Jedi Council; Joined: 29 Jul 2003; Posts: 5542; Location: Southampton)
Posted: Wed Oct 05, 2005 2:19 am
It might still take a lot of time, yes. I suppose it depends on your exact requirements, and what you consider acceptable. You can certainly improve on 1 hour though!
In your example you said that you wanted to process only A1 and B1. Since B1 occurs about half-way through your message, you cannot avoid parsing the first half of the message, but you can avoid the memory overhead of n * 1000 syntax elements being created and left in the message tree.