MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Urgent! Very big XML processing

yaakovd
Posted: Thu Feb 11, 2010 9:16 am    Post subject: Urgent! Very big XML processing

Partisan

Joined: 20 Jan 2003
Posts: 319
Location: Israel

Hi

I have a few scenarios that require processing or generating huge XML files (300-700 MB). In my experience, even a 4 MB XML message requires a huge memory allocation in Message Broker.

I would appreciate best practices and patterns for handling:

1. Reading huge XML (in portions?)
2. Generation of big XML (e.g. from a flat file)
3. Sorting within XML or generated output

One additional fact: the client is Windows-oriented and would prefer to use the Starter Edition (limited to 2 CPUs and a single execution group).
_________________
Best regards.
Yaakov
SWG, IBM Commerce, Israel
Gaya3
Posted: Thu Feb 11, 2010 9:28 am    Post subject: Re: Urgent! Very big XML processing

Jedi

Joined: 12 Sep 2006
Posts: 2493
Location: Boston, US

yaakovd wrote:
Hi

I have a few scenarios that require processing or generating huge XML files (300-700 MB). In my experience, even a 4 MB XML message requires a huge memory allocation in Message Broker.

I would appreciate best practices and patterns for handling:

1. Reading huge XML (in portions?)
2. Generation of big XML (e.g. from a flat file)
3. Sorting within XML or generated output

One additional fact: the client is Windows-oriented and would prefer to use the Starter Edition (limited to 2 CPUs and a single execution group).


You could read the XML in portions, or split the same XML into a number of portions, but first we have to understand the business data the XML carries.

Say, if you are getting a number of records in a single XML, we could think about dividing those.
_________________
Regards
Gayathri
-----------------------------------------------
Do Something Before you Die
Vitor
Posted: Thu Feb 11, 2010 10:28 am    Post subject: Re: Urgent! Very big XML processing

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

yaakovd wrote:
I have a few scenarios that require processing or generating huge XML files (300-700 MB). In my experience, even a 4 MB XML message requires a huge memory allocation in Message Broker.


This has been discussed a few times in here (The Search Facility Is Your Friend) and there's a developerWorks article somewhere that talks about this.

In summary, make sure you have the parsing set to on demand, don't use [index] to access the XML (which you shouldn't really be doing anyway) and prune the tree once you've processed a given section.
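
For anyone who wants to see the parse-then-prune idea in isolation, here is a minimal sketch in Python using only the standard library. It is not broker code and does not use the XMLNSC parser; it only illustrates walking the input incrementally and discarding each section once it has been handled. The names `process_records`, `record_tag` and `handle` are made up for the example, and it assumes the records are direct children of the root element.

```python
import io
import xml.etree.ElementTree as ET


def process_records(source, record_tag, handle):
    """Stream-parse source, pass each finished record to handle, then prune it.

    process_records, record_tag and handle are illustrative names only.
    Assumes records are direct children of the root element.
    """
    count = 0
    # iterparse reads the input incrementally instead of building the whole tree.
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)  # the first event is the start of the root element
    for event, elem in events:
        if event == "end" and elem.tag == record_tag:
            handle(elem)   # the record and its children are complete here
            count += 1
            root.clear()   # prune processed content so the tree stays small
    return count


# Tiny stand-in for a huge file; a real run would pass an open file or a path.
sample = io.BytesIO(b"<orders><order id='1'/><order id='2'/></orders>")
ids = []
total = process_records(sample, "order", lambda rec: ids.append(rec.get("id")))
```

Because each record is dropped as soon as it has been handled, memory use should depend on the size of one record rather than the size of the whole document.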

Have fun.
_________________
Honesty is the best policy.
Insanity is the best defence.
Vitor
Posted: Thu Feb 11, 2010 10:30 am    Post subject: Re: Urgent! Very big XML processing

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Gaya3 wrote:
Say, if you are getting a number of records in a single XML, we could think about dividing those.


You'd still need to bring the entire message in so that you could PROPAGATE the individual records. But yes, this is a good way of handling the situation if there's no affinity between XML stanzas, and it doesn't contradict what I said above (in this example you'd remove the given record once it was propagated).
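
As a rough illustration of the split-and-remove pattern, the sketch below (plain Python, standard library only, not ESQL) hands out each record as its own small XML document and removes it from the tree once the caller has taken it. `split_records` and `record_tag` are hypothetical names, and the sketch assumes the records are direct children of the root and have no affinity with one another.

```python
import io
import xml.etree.ElementTree as ET


def split_records(source, record_tag):
    """Yield each record as a standalone XML string, pruning it afterwards.

    split_records and record_tag are illustrative names only.
    Assumes independent records that are direct children of the root.
    """
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)  # keep a handle on the root so it can be pruned
    for event, elem in events:
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # drop whitespace that separated the records
            # Serialise just this record; the caller can forward it downstream.
            yield ET.tostring(elem, encoding="unicode")
            root.clear()      # remove the record once it has been handed over


# Small stand-in for a very large batch document.
batch = io.BytesIO(b"<batch><rec><k>a</k></rec>\n<rec><k>b</k></rec></batch>")
messages = list(split_records(batch, "rec"))
```

Each string in `messages` is a complete document for one record, so a downstream step only ever has to parse one record at a time.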
_________________
Honesty is the best policy.
Insanity is the best defence.
kimbert
Posted: Thu Feb 11, 2010 1:45 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

http://www-128.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
Vitor
Posted: Thu Feb 11, 2010 1:54 pm

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

kimbert wrote:
http://www-128.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html


This time I must remember to bookmark this!
_________________
Honesty is the best policy.
Insanity is the best defence.
yaakovd
Posted: Thu Feb 11, 2010 3:50 pm

Partisan

Joined: 20 Jan 2003
Posts: 319
Location: Israel

Hi ALL

Thanks for the replies and the basics of working with the message tree.
It really helps with 5 MB messages.

Of course, I tried searching for something helpful first.

My question is whether anybody has experience working with 500 MB messages.

Any idea how long it might take on a 2 CPU / 8 GB Windows machine, if it works at all?
I could also consider a SAX-based input plug-in...
_________________
Best regards.
Yaakov
SWG, IBM Commerce, Israel
fjb_saper
Posted: Thu Feb 11, 2010 7:16 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

yaakovd wrote:
Hi ALL

Thanks for the replies and the basics of working with the message tree.
It really helps with 5 MB messages.

Of course, I tried searching for something helpful first.

My question is whether anybody has experience working with 500 MB messages.

Any idea how long it might take on a 2 CPU / 8 GB Windows machine, if it works at all?
I could also consider a SAX-based input plug-in...

In my experience, a 500 MB message seldom contains a single atomic transaction. Cut your message down to the size of a single atomic transaction and put those pieces into the input queue of the real flow...

If you cannot use a file input node, do as Jeff & Vitor said (see their link): parse on demand only, use references, and prune each parsed node from the tree after propagation.

Have fun
_________________
MQ & Broker admin
jzhang2009
Posted: Fri Feb 12, 2010 5:05 pm    Post subject: re: large XML

Newbie

Joined: 12 Feb 2010
Posts: 1

Have you looked at VTD-XML? It sounds like you definitely want to check it out.
Amitha
Posted: Sat Feb 13, 2010 6:01 am

Voyager

Joined: 20 Nov 2009
Posts: 80
Location: Newyork

VTD-XML seems to improve XML parsing performance and memory usage compared to DOM or SAX. I think the WMB XMLNSC parser performs very well, and it is a C++ engine. In my view, the VTD-XML parser is something that WESB could make use of, not WMB.
newtobroker
Posted: Sat Feb 13, 2010 10:23 am

Novice

Joined: 04 Feb 2010
Posts: 23

One option that we are trying is to dynamically delete the tags of huge XML messages as we finish processing them... not sure if it applies to your business requirement.

Thanks,
c*