
DFDL Parsing

kash3338

Posted: Sat Dec 06, 2014 10:29 pm Post subject: DFDL Parsing

Shaman

Joined: 08 Feb 2009
Posts: 709
Location: Chennai, India

Hi,

I have a requirement to parse a CSV message in the format shown below.

I use a FileInput node to read the file, with the "Record detection" property set to "Parsed Record Sequence". The fields appear in this order: Unique-ID, Code, Value, Company.

Code:

10001,XYZ,20,ABC
10001,LMN,20,ABC
10002,EAR,20,ABC
10002,QUG,20,ABC
10003,OPI,20,ABC

Each row is a record, and a file can contain up to 500 records. I need to parse all records with the same Unique-ID as a single parsed record sequence. For example, all the 10001 records should be parsed first, and I create an XML message from them and propagate it. Then all the 10002 records should be parsed, converted to XML and propagated, and so on.

How do I achieve this using the DFDL parser? I tried creating a DFDL model as follows:

Code:

RootRecord
  Sequence
    Record
      Field1
      Field2
      Field3
      Field4

I then set the Initiator of the sequence to the Field1 XPath. I am sure this is wrong, but I would like to know how to proceed. Any help is appreciated.

kimbert

Posted: Sun Dec 07, 2014 12:32 pm Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

DFDL is certainly capable of expressing those rules. But before I launch into a complex DFDL explanation...have you considered this solution:
- Define a 'record' as 1 line. So your DFDL model describes exactly one line. Then the FileInput node will return one line at a time from the file.
- In the message flow, add each line to a list of lines. Detect when the value of the first field changes and propagate.

This solution seems simpler than creating a DFDL model. And it could be safer from a memory usage point of view as well; what happens if the input file contains 1 million records with the same Unique-ID value?
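The two steps above can be sketched outside the broker as a small Python generator. This is only an illustration of the idea, not ESQL or a FileInput configuration: the function name `propagate_groups` is hypothetical, each yielded batch stands in for one propagated message, and the sample rows are the ones from the original post.

```python
import csv
from typing import Iterable, Iterator, List


def propagate_groups(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield one batch of records each time the Unique-ID (first field) changes."""
    batch: List[List[str]] = []
    for fields in csv.reader(lines):
        if not fields:  # csv.reader returns an empty list for a blank line
            continue
        # A change in the first field closes the current batch.
        if batch and fields[0] != batch[0][0]:
            yield batch
            batch = []
        batch.append(fields)
    if batch:  # flush the last batch when the input ends
        yield batch


# Sample data taken from the original post.
sample = [
    "10001,XYZ,20,ABC",
    "10001,LMN,20,ABC",
    "10002,EAR,20,ABC",
    "10002,QUG,20,ABC",
    "10003,OPI,20,ABC",
]

groups = list(propagate_groups(sample))
for group in groups:
    # Print the Unique-ID followed by the Code values collected for it.
    print(group[0][0], [record[1] for record in group])
```

Only one batch is held in memory at a time, so the memory cost depends on the largest run of records sharing a Unique-ID rather than on the size of the file.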
_________________
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.

kash3338

Posted: Sun Dec 07, 2014 11:41 pm Post subject:

Shaman

Joined: 08 Feb 2009
Posts: 709
Location: Chennai, India

kimbert wrote:

But before I launch into a complex DFDL explanation...have you considered this solution:
- Define a 'record' as 1 line. So your DFDL model describes exactly one line. Then the FileInput node will return one line at a time from the file.
- In the message flow, add each line to a list of lines. Detect when the value of the first field changes and propagate.

Thanks Kimbert!

The approach described above is exactly my current solution. However, I thought that if I could achieve this with the DFDL parser, it would be better in terms of performance and exception handling.

kimbert wrote:

This solution seems simpler than creating a DFDL model. And it could be safer from a memory usage point of view as well; what happens if the input file contains 1 million records with the same Unique-ID value?

In my case I know for sure that a single Unique-ID never has more than 10 records, which is why I felt the DFDL parser approach would be better.

Is the DFDL for this very complex? I am fairly new to DFDL.

kimbert

Posted: Mon Dec 08, 2014 2:54 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

No, not very complex. Conceptually, you need to implement a do...while loop in your DFDL model. You can do it like this:

Code:

message
  sequence
    groupOfRecords maxOccurs='unbounded' dfdl:occursCountKind='implicit'
      sequence
        firstRecord minOccurs=1 maxOccurs=1
        recordsWithSameUniqueID maxOccurs='unbounded' dfdl:occursCountKind='implicit' dfdl:discriminator='{./uniqueId = ../firstRecord/uniqueId}'

Please note: The model shown above is unlikely to work without some debugging. I have not tested it. However, the general approach is correct. You should be able to make this work by using the DFDL Test perspective and reading the DFDL Trace when things don't work.
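To make the do...while shape of that model easier to follow, here is a rough Python analogue of what the parser would be asked to do. It is a sketch under assumptions, not DFDL and not tested against the broker: the field names in `FIELD_NAMES` and the helper functions are hypothetical, and the `if` test merely plays the role of the discriminator.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
# Field names are assumed for illustration; the thread only names the columns.
FIELD_NAMES = ("uniqueId", "code", "value", "company")


def parse_record(line: str) -> Record:
    """Split one CSV line into a record keyed by the assumed field names."""
    return dict(zip(FIELD_NAMES, line.split(",")))


def parse_group(lines: List[str], pos: int) -> Tuple[dict, int]:
    """Parse one groupOfRecords starting at lines[pos]; return it and the next position."""
    first = parse_record(lines[pos])  # firstRecord: exactly one occurrence
    pos += 1
    same: List[Record] = []
    while pos < len(lines):
        candidate = parse_record(lines[pos])
        # Stands in for the discriminator: ./uniqueId = ../firstRecord/uniqueId
        if candidate["uniqueId"] != first["uniqueId"]:
            break  # this line belongs to the next group, so do not consume it
        same.append(candidate)
        pos += 1
    return {"firstRecord": first, "recordsWithSameUniqueID": same}, pos


def parse_message(lines: List[str]) -> List[dict]:
    """Repeat parse_group until the input is exhausted (the unbounded groupOfRecords)."""
    groups: List[dict] = []
    pos = 0
    while pos < len(lines):
        group, pos = parse_group(lines, pos)
        groups.append(group)
    return groups


message = parse_message([
    "10001,XYZ,20,ABC",
    "10001,LMN,20,ABC",
    "10002,EAR,20,ABC",
    "10002,QUG,20,ABC",
    "10003,OPI,20,ABC",
])
for group in message:
    print(group["firstRecord"]["uniqueId"], 1 + len(group["recordsWithSameUniqueID"]))
```

The first record of each group is always accepted, and the comparison only decides whether later records join it, which is why the model needs a separate firstRecord element.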

kash3338

Posted: Mon Dec 08, 2014 3:41 am Post subject:

Shaman

Joined: 08 Feb 2009
Posts: 709
Location: Chennai, India

Thanks for the suggestion, Kimbert. I will certainly try this method and let you know how it goes.

But is this a better approach than doing it in ESQL? Given that the file size can sometimes reach 1K or 2K, which approach is going to be better?

kimbert

Posted: Mon Dec 08, 2014 5:00 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

The files are tiny. No problem there.
It's up to you whether you use ESQL or DFDL. But you did ask

