nmaddisetti
Posted: Wed Sep 16, 2009 4:30 am    Post subject: Doubt on using Parsed Record Sequence
Hi All,
I used Parsed Record Sequence for a file with repeating record structures, and I am able to read it record by record.
eg:
<RootTag><Child1></Child1><Child2></Child2></RootTag>
<RootTag><Child1></Child1><Child2></Child2></RootTag>
<RootTag><Child1></Child1><Child2></Child2></RootTag>
<RootTag><Child1></Child1><Child2></Child2></RootTag>
I have gone through some documents and learned that Parsed Record Sequence can also be used for a tagged/delimited structure.
eg:
Header:ele1,ele2
Record:ele3,ele4
Record:ele5,ele6
Record:ele7,ele8
Trailer:ele9,ele10
Now my requirement is an XML file in which the repeating structures sit five levels below the root tag. Can I read these repeating structures record by record? If so, can someone throw some light on the message modelling, or point me to any documentation links?
<ROOTTAG>
    <Child1></Child1>
    <Child2></Child2>
    <Child3></Child3>
    <CHILD4>
        <CHILD5>
            <CHILD6>
                <CHILD7>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                    <CHILD8>
                        <Child9></Child9><Child10></Child10>
                    </CHILD8>
                </CHILD7>
            </CHILD6>
        </CHILD5>
    </CHILD4>
    <Child2></Child2>
</ROOTTAG>
Thanks in Advance,
Venkat.
Vitor
Posted: Wed Sep 16, 2009 4:41 am    Post subject: Re: Doubt on using Parsed Record Sequence
nmaddisetti wrote:
Can I read these repeating structures record by record? If so, can someone throw some light on the message modelling, or point me to any documentation links?
If you did that you'd run the risk that the record fragment wouldn't be a well-formed XML document. Even when it worked, you'd waste a lot of time/resources parsing each line as if it were a new XML document.
What's wrong with parsing the entire document in the conventional way, then reading off the repeating structures, as you describe, from the message tree, i.e. the traditional way to process XML?
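To illustrate that conventional approach, here is a minimal ESQL sketch (inside a Compute node's Main() function). It assumes the message has been parsed in the XMLNSC domain, borrows the tag names from the sample XML above, and the output structure (Result/Record) is made up for the example:
Code:
    -- Let the XMLNSC parser build the whole tree, then read the repeating
    -- CHILD8 records off the message tree by index.
    DECLARE recordCount INTEGER;
    SET recordCount = CARDINALITY(InputRoot.XMLNSC.ROOTTAG.CHILD4.CHILD5.CHILD6.CHILD7.CHILD8[]);

    DECLARE i INTEGER 1;
    WHILE i <= recordCount DO
        -- Copy the fields of the current record into the output message.
        SET OutputRoot.XMLNSC.Result.Record[i].Child9  = InputRoot.XMLNSC.ROOTTAG.CHILD4.CHILD5.CHILD6.CHILD7.CHILD8[i].Child9;
        SET OutputRoot.XMLNSC.Result.Record[i].Child10 = InputRoot.XMLNSC.ROOTTAG.CHILD4.CHILD5.CHILD6.CHILD7.CHILD8[i].Child10;
        SET i = i + 1;
    END WHILE;
(For a large number of records, reference variables are cheaper than indexed access, but that is a separate concern.)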
nmaddisetti
Posted: Wed Sep 16, 2009 4:57 am    Post subject:
Vitor,
Thanks for the quick reply.
As of now we are able to process a 70 MB file with conventional parsing and read the data from it.
Now we have a 150 MB file and we received an OutOfMemory exception. In the test environment we increased the JVM heap size to one and a half GB (previously, for the 70 MB file, it was one GB), and with a test flow I am able to read the 150 MB file, parse it into XML and write it to an output folder using a FileOutput node. (By the way, we read this file using a JCN and then process it, because we receive the file name and location in another message on a queue.)
Today it is a 150 MB file; tomorrow it may be a 250 MB file.
So the client doesn't want to proceed further with this approach, as continually increasing the JVM heap size is not good.
Can you please suggest a better way of handling big files like this?
Thanks,
Venkat.
Vitor
Posted: Wed Sep 16, 2009 5:10 am    Post subject:
nmaddisetti wrote:
Can you please suggest a better way of handling big files like this?
I suggest you search this forum; there have been a number of posts on handling large messages, and somewhere there's an IBM white paper on it. Simplistically, it's about exploiting the way WMB parses XML on demand and pruning the message tree as you read it. Be aware that I've no idea whether you can do this in a JCN; the examples I've seen and used are in ESQL.
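As an illustration of that technique, here is a rough ESQL sketch, again borrowing the tag names from the first post and assuming a hypothetical 'out' terminal wired onwards. It shows the general shape only: propagate one small message per record, and prune each record once it has been processed. The IBM Large Messaging sample and the 'Reducing memory usage' article spell out the exact pattern, so treat this as a sketch rather than the sample itself:
Code:
    -- Inside a Compute node's Main(). InputRoot is read-only, so take a
    -- modifiable copy of the body in the Environment tree; processed records
    -- can then be deleted from that copy.
    CREATE LASTCHILD OF Environment.Variables DOMAIN('XMLNSC') NAME 'SourceMsg';
    SET Environment.Variables.SourceMsg = InputRoot.XMLNSC;

    DECLARE recRef  REFERENCE TO Environment.Variables.SourceMsg.ROOTTAG.CHILD4.CHILD5.CHILD6.CHILD7.CHILD8;
    DECLARE doneRef REFERENCE TO recRef;

    WHILE LASTMOVE(recRef) DO
        -- Build a small output message holding just this record and send it on.
        SET OutputRoot.Properties = InputRoot.Properties;  -- or CALL CopyMessageHeaders();
        SET OutputRoot.XMLNSC.Record.Child9  = recRef.Child9;
        SET OutputRoot.XMLNSC.Record.Child10 = recRef.Child10;
        PROPAGATE TO TERMINAL 'out' DELETE NONE;

        -- Step to the next CHILD8, then delete the record just processed so
        -- the working tree does not keep growing.
        MOVE doneRef TO recRef;
        MOVE recRef NEXTSIBLING REPEAT TYPE NAME;
        DELETE FIELD doneRef;
    END WHILE;

    RETURN FALSE;  -- everything has already been propagated record by record
Note that taking a full copy of the body has its own memory cost; the sample and the article describe how to combine this kind of loop with on-demand parsing so that only about a record's worth of data is held at any one time.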
nmaddisetti
Posted: Wed Sep 16, 2009 8:14 am    Post subject:
Hi,
I have gone through many documents, such as the file handling documentation for WMB 6.1, and I have searched this forum regarding large file handling and how to write code as described in 'Reducing memory usage in WebSphere Business Integration Message Broker'. We have taken care of all these aspects in our code.
We are using the JCN just for reading the file; the processing is done in ESQL only.
My actual question from the first post is: can we read the repeating structures at an inner level of the XML using the Parsed Record Sequence option? (From my first post's example, I want to access Child1, Child2 and Child3, then under CHILD4, 5, 6 and 7 we have some tags that need to be accessed, and then we have to access the repeating records.)
Reading and parsing the complete XML, then accessing records and deleting processed records, has been ruled out by the client, and the existing flow is already like that.
Can someone throw some light on whether I can use Parsed Record Sequence for my requirement or not?
Thanks in advance,
Venkat.
kimbert
Posted: Wed Sep 16, 2009 2:15 pm    Post subject:
Quote:
My actual question from the first post is: can we read the repeating structures at an inner level of the XML using the Parsed Record Sequence option?

I don't think so. 'Parsed Record Sequence' treats the input as a sequence of 'records', and each record must be a complete message. In your first example each <RootTag>...</RootTag> line is a complete document, so it can be handed to the flow as a record; the nested <CHILD8> elements in your second example are only fragments of one large document, so they cannot be split off that way.
nmaddisetti
Posted: Thu Sep 17, 2009 5:47 am    Post subject:
Hi Kimbert,
Thank you.
I have one more question.
As I understood from the documentation and from this forum, the FileInput node reads the file by opening a stream, so it won't take many resources.
Suppose I read the file using a FileInput node with the parse timing option 'On Demand', and in the ESQL I traverse with MOVE NEXTSIBLING inside a WHILE LASTMOVE loop, deleting the processed siblings. Can I compare this implementation with a SAX implementation in Java (I mean writing Java code using a SAX parser, which, if I am not wrong, does not hold the complete file in memory)? Which one might be better?
Can you please share your thoughts?
We are planning to change the design if my understanding about the FileInput node and on-demand parsing is correct.
Thanks in advance.
Venkat.
kimbert
Posted: Thu Sep 17, 2009 6:19 am    Post subject:
Quote:
If I traverse with MOVE NEXTSIBLING inside a WHILE LASTMOVE loop, deleting the processed siblings, can I compare this implementation with a SAX implementation in Java?

SAX: the input might or might not be streamed (it depends on the parser). The parser reports one event at a time. The receiver can choose to discard events after processing them.
Your solution: the input *is* streamed (the FileInput node supports streaming), so the input bit stream will be processed in small chunks. Your ESQL is ensuring that only one record's worth of data is parsed and held in memory at any one time. Processed records are discarded.
So they're pretty similar. I'm not sure what you need to change, if you're already doing what the 'Reducing memory usage' document says.
nmaddisetti
Posted: Thu Sep 17, 2009 6:26 am    Post subject:
Hi Kimbert,
Thank you very much for your inputs.
The design change I mean is this: currently, in the middle of the flow, we read the file using a JCN and then process it in ESQL, and because the JCN reads it, the complete file is in memory.
Now I am planning to use the same JCN to move the file to another folder, and then start reading it with a FileInput node.
Thanks,
Venkat.