	| DFDL Parsing |  
  	| 
		
		
		  | Author | Message |  
		  | kash3338 | 
			  
				|  Posted: Sat Dec 06, 2014 10:29 pm    Post subject: DFDL Parsing |   |  |  
		  | Shaman
 
 
 Joined: 08 Feb 2009
 Posts: 709
 Location: Chennai, India
 
 | 
			  
				| Hi, 
 I need to parse a CSV message in the format shown below.
 
 I use a FileInput node to read the file, with the "Record detection" property set to "Parsed Record Sequence". The fields appear in this order: Unique-ID, Code, Value, Company.
 
 
 
   
	| Code: |  
	| 10001,XYZ,20,ABC
 10001,LMN,20,ABC
 10002,EAR,20,ABC
 10002,QUG,20,ABC
 10003,OPI,20,ABC
 
 |  
 Each row is a record, and a file can contain up to 500 records. I need to parse all records with the same Unique-ID as one parsed record sequence. For example, all the 10001 records should be parsed first, so that I can create an XML message and propagate it; then all the 10002 records should be parsed, converted to XML and propagated, and so on.
 
 How do I achieve this using the DFDL parser? I tried creating a DFDL model structured as follows:
 
 
   
	| Code: |  
	| RootRecord Sequence
   Record
     Field1
     Field2
     Field3
     Field4
 
 |  
 I then set the Initiator of the sequence to the XPath of Field1. I am sure this is wrong, but I would like to know how to proceed. Any help is appreciated.
 |  |  
		  | kimbert | 
			  
				|  Posted: Sun Dec 07, 2014 12:32 pm    Post subject: |   |  |  
		  |  Jedi Council
 
 
 Joined: 29 Jul 2003
 Posts: 5543
 Location: Southampton
 
 | 
			  
				| DFDL is certainly capable of expressing those rules. But before I launch into a complex DFDL explanation...have you considered this solution:
 - Define a 'record' as 1 line. So your DFDL model describes exactly one line. Then the FileInput node will return one line at a time from the file.
 - In the message flow, add each line to a list of lines. Detect when the value of the first field changes and propagate.
 
 This solution seems simpler than creating a DFDL model. And it could be safer from a memory usage point of view as well; what happens if the input file contains 1 million records with the same Unique-ID value?
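
 The two steps above can be sketched outside the broker. This is only an illustration in Python, not ESQL or message flow code: the function name `group_by_first_field` is hypothetical, and the `yield` stands in for propagating a batch downstream. It assumes records with the same Unique-ID are always adjacent in the file, as in the sample data.

```python
from typing import Iterable, Iterator, List


def group_by_first_field(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield batches of consecutive records that share the same first field."""
    current_key = None
    batch: List[List[str]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split(",")
        key = fields[0]
        # The first field changed: hand over the finished batch ("propagate").
        if batch and key != current_key:
            yield batch
            batch = []
        current_key = key
        batch.append(fields)
    # Hand over the final batch once the input is exhausted.
    if batch:
        yield batch


sample = [
    "10001,XYZ,20,ABC",
    "10001,LMN,20,ABC",
    "10002,EAR,20,ABC",
    "10002,QUG,20,ABC",
    "10003,OPI,20,ABC",
]
groups = list(group_by_first_field(sample))
```

 Only one batch is held in memory at a time, so memory use is bounded by the largest run of records with the same Unique-ID rather than by the size of the file.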
 _________________
 Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.
 |  |  
		  | kash3338 | 
			  
				|  Posted: Sun Dec 07, 2014 11:41 pm    Post subject: |   |  |  
		  | Shaman
 
 
 Joined: 08 Feb 2009
 Posts: 709
 Location: Chennai, India
 
 | 
			  
				| 
   
	| kimbert wrote: |  
	| But before I launch into a complex DFDL explanation...have you considered this solution:
 - Define a 'record' as 1 line. So your DFDL model describes exactly one line. Then the FileInput node will return one line at a time from the file.
 - In the message flow, add each line to a list of lines. Detect when the value of the first field changes and propagate.
 |  
 Thanks Kimbert!
 
 The approach described above is exactly my current solution. But I thought that achieving this with the DFDL parser would be better in terms of performance and exception handling.
 
 
 
   
	| kimbert wrote: |  
	| This solution seems simpler than creating a DFDL model. And it could be safer from a memory usage point of view as well; what happens if the input file contains 1 million records with the same Unique-ID value? |  
 In my case, I know for certain that a single Unique-ID never has more than 10 records. Hence I felt the DFDL parser approach would be better.
 
 Is the DFDL for this very complex? I am fairly new to DFDL.
  |  |  
		  | kimbert | 
			  
				|  Posted: Mon Dec 08, 2014 2:54 am    Post subject: |   |  |  
		  |  Jedi Council
 
 
 Joined: 29 Jul 2003
 Posts: 5543
 Location: Southampton
 
 | 
			  
				| No, not very complex. Conceptually, you need to implement a do...while loop in your DFDL model. You can do it like this: 
 
 
   
	| Code: |  
	| message
   sequence
     groupOfRecords maxOccurs='unbounded' dfdl:occursCountKind='implicit'
       sequence
         firstRecord minOccurs='1' maxOccurs='1'
         recordsWithSameUniqueID maxOccurs='unbounded' dfdl:occursCountKind='implicit' dfdl:discriminator='{./uniqueID = ../firstRecord/uniqueID}'
 
 |  
 Please note:  The model shown above is unlikely to work without some debugging. I have not tested it. However, the general approach is correct. You should be able to make this work by using the DFDL Test perspective and reading the DFDL Trace when things don't work.
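
 For readers new to DFDL, the do...while idea behind that model can be mimicked in ordinary code. The following is a rough Python sketch, not DFDL and not how the DFDL parser is implemented: the field names in `FIELD_NAMES` and the helper names are assumptions made for illustration. Each group consumes exactly one first record, then keeps consuming records while the discriminator condition (same uniqueID as the first record) holds.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
# Hypothetical field names; the thread only names the order Unique-ID, Code, Value, Company.
FIELD_NAMES = ("uniqueID", "code", "value", "company")


def parse_record(line: str) -> Record:
    """Split one CSV line into a record keyed by FIELD_NAMES."""
    values = line.strip().split(",")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields: {line!r}")
    return dict(zip(FIELD_NAMES, values))


def parse_group(lines: List[str], pos: int) -> Tuple[dict, int]:
    """Parse one groupOfRecords starting at lines[pos]; return (group, next position)."""
    first = parse_record(lines[pos])  # firstRecord: exactly one occurrence
    pos += 1
    same: List[Record] = []
    while pos < len(lines):
        candidate = parse_record(lines[pos])  # speculative parse of the next record
        if candidate["uniqueID"] != first["uniqueID"]:
            break  # condition is false: do not consume the line, end this group
        same.append(candidate)
        pos += 1
    return {"firstRecord": first, "recordsWithSameUniqueID": same}, pos


def parse_message(lines: List[str]) -> List[dict]:
    """Parse the whole input as an unbounded sequence of groups."""
    groups = []
    pos = 0
    while pos < len(lines):
        group, pos = parse_group(lines, pos)
        groups.append(group)
    return groups


message = parse_message([
    "10001,XYZ,20,ABC",
    "10001,LMN,20,ABC",
    "10002,EAR,20,ABC",
    "10002,QUG,20,ABC",
    "10003,OPI,20,ABC",
])
```

 The `break` without advancing `pos` plays the role of backtracking: the record that failed the check is left for the next group, where it becomes that group's first record.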
 |  |  
		  | kash3338 | 
			  
				|  Posted: Mon Dec 08, 2014 3:41 am    Post subject: |   |  |  
		  | Shaman
 
 
 Joined: 08 Feb 2009
 Posts: 709
 Location: Chennai, India
 
 | 
			  
				| Thanks for the suggestion, Kimbert. I will certainly try this method and report back. 
 But is this a better approach than doing it in ESQL? Given that the file size can sometimes reach 1K or 2K, which approach is better?
 |  |  
		  | kimbert | 
			  
				|  Posted: Mon Dec 08, 2014 5:00 am    Post subject: |   |  |  
		  |  Jedi Council
 
 
 Joined: 29 Jul 2003
 Posts: 5543
 Location: Southampton
 
 | 
			  
				| The files are tiny. No problem there. It's up to you whether you use ESQL or DFDL. But you did ask
 |  |  