Processing Large XML with Containers
prasunad
Posted: Sun Aug 17, 2014 8:27 am    Post subject: Processing Large XML with Containers

Novice

Joined: 10 Jul 2014
Posts: 22

Hi Gurus,

I need some information on processing XML files with containers, because if I read the full file there is a limitation of 2GB.
http://www-01.ibm.com/support/knowledgecenter/SSKM8N_8.0.0/com.ibm.etools.mft.doc/ac55150_.htm

Suppose I have XML files in this format:
<data>
  <hdr>
    <x/>
    <y/>
  </hdr>
  <pl>
    <a/>
    <b/>
    <pl1>
      <p/>
      <q/>
      <r/>
    </pl1>
    <pl2>
      <p1/>
      <q2/>
      <r3/>
    </pl2>
  </pl>
  <pl>
    <a/>
    <b/>
    <pl1>
      <p/>
      <q/>
      <r/>
    </pl1>
    <pl2>
      <p1/>
      <q2/>
      <r3/>
    </pl2>
  </pl>
  <pl>
    <a/>
    <b/>
    <pl1>
      <p/>
      <q/>
      <r/>
    </pl1>
    <pl2>
      <p1/>
      <q2/>
      <r3/>
    </pl2>
  </pl>
</data>

Please advise whether I could still use the FileInput node to slice the data at each <pl>..</pl>. My primary doubt is that each <pl> contains multiple complex items and the file also has a container header.

Thanks
Vitor
Posted: Mon Aug 18, 2014 4:55 am    Post subject: Re: Processing Large XML with Containers

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

prasunad wrote:
I need some information on processing XML files with containers, because if I read the full file there is a limitation of 2GB.


That limitation is at an OS level, not a WMB level. If you've got an XML document larger than 2GB you can't store it in most OS file systems.

Also you need to think seriously about your design. 2GB in a single XML document? Really? Especially when (by your own admission) it contains a repeating structure that can be processed in isolation. What do you intend to do if you have a failure halfway through processing this monster? Shrug and restart from the beginning?

prasunad wrote:
Please advise whether I could still use the FileInput node to slice the data at each <pl>..</pl>. My primary doubt is that each <pl> contains multiple complex items and the file also has a container header.


No, the node won't slice it like that. You might be able to use a data pattern to extract the "containers" into separate records, but that will push the already slow processing and memory requirements through the roof and I would urge you not to risk it unless you have no other options.

Another option would be to buy WebSphere Transformation Extender which can slice files as you describe. You'd probably also need some consultancy to help you with the map development. This will be expensive.

You can use the advice here to post-process it, but again I say you really need to rethink the design. If the file is already 2GB, what's going to happen in a year when it's 3GB? 4GB? 40GB?
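
For anyone wanting to picture what slicing per <pl> while keeping the header looks like, here is a minimal sketch in Python. It runs outside WMB and is not a FileInput node feature; the function name `iter_pl_records` is made up for illustration, and it assumes the <hdr> element appears before the first <pl>, as in the sample posted above. It streams the document with the standard library's `iterparse`, so only one <pl> is held in memory at a time.

```python
import io
import xml.etree.ElementTree as ET


def iter_pl_records(source):
    """Yield (header_xml, pl_xml) string pairs, one per <pl> element.

    `source` is a filename or a binary file object. The document is
    parsed incrementally, so the whole file is never held in memory.
    """
    header_xml = None
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root <data>
    for event, elem in context:
        if event != "end":
            continue
        if elem.tag == "hdr":
            elem.tail = None  # drop whitespace that follows the element
            header_xml = ET.tostring(elem, encoding="unicode")
        elif elem.tag == "pl":
            elem.tail = None
            yield header_xml, ET.tostring(elem, encoding="unicode")
            root.clear()  # release the children parsed so far


if __name__ == "__main__":
    sample = b"<data><hdr><x/><y/></hdr><pl><a/><b/></pl><pl><a/></pl></data>"
    for header, record in iter_pl_records(io.BytesIO(sample)):
        print(header, record)
```

Each yielded pair could then be written out as its own small file or message, which is the kind of record the broker handles comfortably.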
_________________
Honesty is the best policy.
Insanity is the best defence.
fjb_saper
Posted: Mon Aug 18, 2014 5:03 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20697
Location: LI,NY

Have you tried parsed record processing the pl tag? What problems did you hit?

@Vitor It appears that the 2GB is an internal limit of the broker for the Whole File setup. See the RFEs, or previous posts under my name, for the RFE to remove that limit for streaming parsers (especially DFDL).
http://www.mqseries.net/phpBB2/viewtopic.php?t=67961&highlight=rfe
_________________
MQ & Broker admin
Vitor
Posted: Mon Aug 18, 2014 5:45 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

fjb_saper wrote:
Have you tried parsed record processing the pl tag? What problems did you hit?


As I said above, parsed record processing is a method of last resort, and may not help the OP if he needs header information and the contents of the pl tags.

fjb_saper wrote:
@Vitor It appears that the 2GB is an internal limit of the broker for the Whole File setup. See the RFEs, or previous posts under my name, for the RFE to remove that limit for streaming parsers (especially DFDL).
http://www.mqseries.net/phpBB2/viewtopic.php?t=67961&highlight=rfe

Well you learn something new every day.

I stand by my comments that a 2GB XML file is a poor design choice. I can see the use case for a streaming file passing that limit (and you may expect my vote for your RFE momentarily) and I'm enthused that an OS can handle files of that size. But in this use case (non-streamed XML) and in this specific case (where the XML is that size purely due to the number of loosely coupled "container" objects) I continue to question the wisdom.
_________________
Honesty is the best policy.
Insanity is the best defence.
fjb_saper
Posted: Mon Aug 18, 2014 6:24 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20697
Location: LI,NY

100% agreed about the (poor) design choice...
Have the architects go back to the drawing board...
Switch it from a file / batch-oriented integration to an online / message-oriented integration?
_________________
MQ & Broker admin
prasunad
Posted: Mon Aug 18, 2014 7:26 am

Novice

Joined: 10 Jul 2014
Posts: 22

Thanks guys. I have already raised the file size concern and have been shown the large-processing example; I am trying hard to explain the single-document and multiple-document structures as well.
fjb_saper
Posted: Mon Aug 18, 2014 7:38 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20697
Location: LI,NY

Ditch the root tags and you could process it as parsed record for each pl? Hope you don't have a single pl over 2 GB!
_________________
MQ & Broker admin
Vitor
Posted: Mon Aug 18, 2014 7:43 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

I also repeat my question (which should be referred to whoever designed this) regarding what you're supposed to do if there's a processing failure in the middle of the file.
_________________
Honesty is the best policy.
Insanity is the best defence.
prasunad
Posted: Mon Aug 18, 2014 7:50 am

Novice

Joined: 10 Jul 2014
Posts: 22

In an ideal scenario, where record-by-record splitting is possible, the error data would be propagated to error flows; but in this case there would be a transaction rollback and a rerun of the file.
smdavies99
Posted: Mon Aug 18, 2014 7:53 am

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

A spec dropped on my desk a few weeks ago for a requirement to process a load of data that was sent from a system via FTP. The data was XML and the originating system even had the capability to send data via MQ (Client) yet the admins refused to enable it saying we know FTP and... well you can guess the rest.
Almost all of the communication into and out of the Broker system is via files and the users are already complaining about the timeliness of the data on the system. Pah!
I've come to realize that in at least one member of the BRIC countries and despite the illusion they present to the outside world a good section of the IT business is deeply wedded to files and FTP in particular. Telling the Architects that there is a better way is just like talking to a BRIC wall.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
prasunad
Posted: Mon Aug 18, 2014 8:04 am

Novice

Joined: 10 Jul 2014
Posts: 22

I stand by you..
fjb_saper
Posted: Mon Aug 18, 2014 8:04 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20697
Location: LI,NY

(humor: upgradation to bric a brac is not possible)
_________________
MQ & Broker admin
mqjeff
Posted: Mon Aug 18, 2014 8:05 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

smdavies99 wrote:
A spec dropped on my desk a few weeks ago for a requirement to process a load of data that was sent from a system via FTP. The data was XML and the originating system even had the capability to send data via MQ (Client) yet the admins refused to enable it saying we know FTP and... well you can guess the rest.
Almost all of the communication into and out of the Broker system is via files and the users are already complaining about the timeliness of the data on the system. Pah!
I've come to realize that in at least one member of the BRIC countries and despite the illusion they present to the outside world a good section of the IT business is deeply wedded to files and FTP in particular. Telling the Architects that there is a better way is just like talking to a BRIC wall.


time to start jiggling keys on the junction boxes.....
Vitor
Posted: Mon Aug 18, 2014 8:07 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

prasunad wrote:
In case of an ideal scenario, where record by record splitting is possible


It's very, very possible. Just at the sending end, not at your end. There's no good reason why this all has to be sent as a single file, and of course once we start talking about this we start talking WMQ messages. Though of course most businesses are too firmly wedded to file technology to take that step.

prasunad wrote:
in this case there would be a transaction roll back, and rerun of file.


So what's that going to do to your SLA? You've got 2GB of data in a single unit of work; how long is that going to take to roll back? How long is that going to take to commit?
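
To make the unit-of-work point concrete, here is a rough sketch in Python of the alternative: handle the records in small batches and remember how many have been committed, so a failure costs one batch rather than the whole file. This is an illustration only, not how WMB transactions are configured; `process_with_checkpoint`, `handle_batch` and the JSON checkpoint file are all hypothetical names, and it assumes the records arrive in the same order on a rerun.

```python
import json
import os
import tempfile


def _save_checkpoint(path, committed):
    """Write the committed count atomically (temp file, then rename)."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"committed": committed}, f)
    os.replace(tmp, path)


def process_with_checkpoint(records, handle_batch, checkpoint_path, batch_size=1000):
    """Process records in batches; a rerun skips batches already committed."""
    committed = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            committed = json.load(f)["committed"]

    batch = []
    for index, record in enumerate(records):
        if index < committed:
            continue  # handled and committed by an earlier run
        batch.append(record)
        if len(batch) == batch_size:
            handle_batch(batch)  # may raise; checkpoint is then left unchanged
            committed += len(batch)
            _save_checkpoint(checkpoint_path, committed)
            batch = []

    if batch:  # final partial batch
        handle_batch(batch)
        committed += len(batch)
        _save_checkpoint(checkpoint_path, committed)
    return committed


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    total = process_with_checkpoint(range(10), print, path, batch_size=4)
    print("committed", total)
```

With a single 2GB unit of work the equivalent of `batch_size` is the entire file, which is why both the rollback and the commit become so expensive.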

Be clear I'm not having a go at you; this has been designed to be sent as a single file of excessive size. The designer, when he decided to send it as a single lump, should have factored into his design that using a single lump has consequences. He cannot legitimately design it as a single lump and then expect you to automagically split it up for him because a single lump has consequences.
_________________
Honesty is the best policy.
Insanity is the best defence.
Vitor
Posted: Mon Aug 18, 2014 8:09 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

prasunad wrote:
In case of an ideal scenario, where record by record splitting is possible


Of course, record splitting within WMB becomes much more possible if the design abandons the use of XML & sends the data as a conventional tagged format. Then you feed it through a streaming WMB parser and job's done.
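
As a rough picture of why a tagged format streams so easily, here is a small Python sketch. The layout is purely an assumption for illustration (one record per line, a leading tag such as HDR or PL, fields separated by "|"); nothing in this thread specifies the real format, and `iter_tagged_records` is a made-up name. Every record ends at a line break, so a reader never needs to look past the current line to find a record boundary.

```python
import io


def iter_tagged_records(lines, delimiter="|"):
    """Yield (tag, fields) for each non-empty line of a tagged file.

    `lines` is any iterable of text lines, for example an open file,
    so records are produced one at a time as the file is read.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines between records
        tag, *fields = line.split(delimiter)
        yield tag, fields


if __name__ == "__main__":
    data = io.StringIO("HDR|x|y\nPL|a|b\nPL|c|d\n")
    for tag, fields in iter_tagged_records(data):
        print(tag, fields)
```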
_________________
Honesty is the best policy.
Insanity is the best defence.