PEPERO |
Posted: Sun May 21, 2017 7:56 am Post subject: A Huge XML Document parsing |
Disciple
Joined: 30 May 2011 Posts: 177
Hi,
Suppose an input file contains a single XML document. It has one complex element that serves as the file header (it holds some local elements describing the whole file) and a hundred thousand occurrences of another complex element that form the body records. Since each iteration of the message flow starting at the FileInput node must consume one record, parsing the whole file requires a hundred thousand and one iterations. Which parser should I choose so that it recognises the header element and each body record, and also validates them? Note that I do not want to treat the whole file as a single message.
I tried XMLNSC, but because the document has a root element, I cannot use the 'Parsed Record Sequence' option for record detection.
timber |
Posted: Sun May 21, 2017 2:03 pm Post subject: |
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
Good problem description. You are not the first to ask about this. The answer is to change your message model from
Code:
Message
  sequence
    Header (1/1)
    Record*
    Footer (1/1)
to
Code:
Message
  Record*
    choice
      Header (1/1)
      InnerRecord (1/1)
      Footer (1/1)
That way, the header is also a 'record' and will be reported first. Then, each inner record is reported as an individual 'record'. Finally, the footer is reported.
PEPERO |
Posted: Sun May 21, 2017 8:08 pm Post subject: |
Disciple
Joined: 30 May 2011 Posts: 177
Thanks for the solution.
PEPERO |
Posted: Mon May 22, 2017 12:33 am Post subject: |
Disciple
Joined: 30 May 2011 Posts: 177
Since the originator builds and sends the message in XML format, I have to parse and validate each record separately. On the other hand, the input message is contained within a single Document element. How can I use the Parsed Record Sequence option for record detection with the following message format?
Code:
Message
  Record*
    choice
      Header (1/1)
      InnerRecord (1/1)
Note that the Message part is enclosed in a Document element.
timber |
Posted: Mon May 22, 2017 11:55 pm Post subject: |
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
You are correct - my advice does not work for an XML document where the header and body are enclosed within a root tag. However, if you use the search button you should find lots of advice on how to deal with large messages. You will probably end up using the techniques described in this document:
https://www.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
but using XMLNSC instead of MRM (of course). There may even be a sample that demonstrates how to do it. |
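For anyone who wants to see the general idea outside the broker, here is a minimal sketch in Python (not ESQL, and not the code from the linked article). It assumes the header and the records are direct children of the enclosing root element, reads them one at a time, and discards each one after it has been handled so the whole tree never sits in memory. The function name iter_records and the element names in the usage note are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source):
    """Yield (tag, fields) for each direct child of the root, one at a time.

    `source` is a file name or a binary file-like object. `fields` maps the
    tag of each child element to its text (a simplification for flat records).
    """
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)      # first event: start of the enclosing root
    depth = 0                   # nesting depth below the root
    for event, elem in events:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 0:
            # A direct child of the root (header or record) is now complete.
            yield elem.tag, {child.tag: child.text for child in elem}
            # Drop processed children so memory use stays roughly constant.
            root.clear()
```

Usage would look something like `for tag, fields in iter_records("input.xml"): ...`, where the first item yielded is the header and the rest are the body records. Validation of each record would go inside the loop. The same "process one record, then delete it from the tree" pattern is what the large-message techniques aim for inside a flow.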