Author |
Message
|
deepak_paul |
Posted: Wed Dec 05, 2012 2:12 pm Post subject: Dynamic parser in Broker |
|
|
Centurion
Joined: 04 Oct 2008 Posts: 147 Location: US
|
All,
We are trying to find a solution for parsing the content of a file dynamically.
Here is the scenario:
We have a few healthcare members who each send data with their own metadata (record delimiter, field delimiter, header lines, field count, end-of-file marker). Please see the variations below. We want to be able to parse each message dynamically using the member-specific rules. Please let me know how we can do this in Broker.
For healthcare member 1:
HeaderLines: 2
RecordDelimiter: <ENTER>
FieldDelimiter: <TAB>
FieldCount: 5
FieldOrder: MemberName PatientMRN encounterNumber Location Phone
EndOfFile: EOF
Sample message:
------------------
From member1
MemberName PatientMRN encounterNumber Location Phone
MEM1<TAB>9877787<TAB>868822<TAB>GA<TAB>404-765-4567<ENTER>
MEM1<TAB>9877319<TAB>868123<TAB>NY<TAB>214-098-1346<ENTER>
…
EOF
------------------
For healthcare member 2:
HeaderLines: 3
RecordDelimiter: @
FieldDelimiter: |
FieldCount: 5
FieldOrder: PatientMRN MemberName encounterNumber Phone Location
EndOfFile: EndOfFile
Sample message:
------------------
From member2
Date sent at 12-05-2012 13:12:56
Total records 3459
9871234|MEM2|09897722|678-712-1232|TX@98722332|MEM2|9876431|861219|214-098-1346|NY@...EndOfFile
------------------
For healthcare member 3:
HeaderLines: 1
RecordDelimiter: |
FieldDelimiter: ,
FieldCount: 5
FieldOrder: PatientMRN EncounterNumber MemberName Location Phone
EndOfFile: >>>End
Sample message:
------------------
From member3
9871234,09897722,MEM2,TX,678-712-1232|98722332,9876431,MEM2,NY,214-098-1346|...>>>End
------------------
_________________
Regards
Paul |
lancelotlinc |
Posted: Wed Dec 05, 2012 2:13 pm Post subject: |
|
|
 Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
|
deepak_paul |
Posted: Wed Dec 05, 2012 2:15 pm Post subject: |
|
|
Centurion
Joined: 04 Oct 2008 Posts: 147 Location: US
|
7.0.0.4.
_________________
Regards
Paul |
kash3338 |
Posted: Wed Dec 05, 2012 6:23 pm Post subject: |
|
|
Shaman
Joined: 08 Feb 2009 Posts: 709 Location: Chennai, India
|
One way to do this: read the message as a BLOB on your input node, first determine which message type it is, and then parse the message against the appropriate message set in ESQL.
What is your input node? File or MQ? |
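The routing step described above can be sketched as follows. This is Python rather than ESQL so that it can be run outside the broker, and it rests on an assumption: that the first header line always has the form `From <member>`, as it does in the three samples. The `RULES` table and the `select_rules` function are hypothetical names for illustration, not Broker APIs.

```python
# Hypothetical per-member rule table; in the thread these values live in ILOG/WODM.
RULES = {
    "member1": {"header_lines": 2, "record_delim": "\n", "field_delim": "\t"},
    "member2": {"header_lines": 3, "record_delim": "@", "field_delim": "|"},
    "member3": {"header_lines": 1, "record_delim": "|", "field_delim": ","},
}


def select_rules(raw: bytes) -> dict:
    """Peek at the first line of the unparsed payload and pick the sender's rules."""
    # Assumption: the sender is named on the first line as "From <member>".
    first_line = raw.split(b"\n", 1)[0].decode("ascii").strip()
    prefix = "From "
    if not first_line.startswith(prefix):
        raise ValueError("cannot identify sender from first line: %r" % first_line)
    member = first_line[len(prefix):]
    if member not in RULES:
        raise KeyError("no parsing rules registered for %s" % member)
    return RULES[member]
```

Only the first line is decoded here; the rest of the payload stays as bytes until the right rules are known, which mirrors reading the input as a BLOB.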
kimbert |
Posted: Thu Dec 06, 2012 1:00 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
This would be easy if you were using DFDL. But that requires v8.
It is possible to model this using MRM with the TDS physical format.
1. Set the HL7-specific delimiters from the field values in the header using the 'Interpret value as' property on the TDS physical format. This property is only available when the object is a simple element of type xs:string.
2. Specify the dynamic ( variable ) delimiters using the 'mnemonics' listed here: http://publib.boulder.ibm.com/infocenter/wmbhelp/v8r0m0/topic/com.ibm.etools.mft.doc/ad09270_.htm
As always, it's best to start with a small example using one delimiter and increase the complexity gradually. |
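For readers unfamiliar with how HL7 2.x declares its own delimiters, the idea can be sketched outside the broker. This is plain Python, not MRM/TDS configuration. In HL7 2.x the character immediately after `MSH` is the field separator, and the field that follows lists the component, repetition, escape and subcomponent characters. The function name and the sample message are made up for illustration.

```python
def read_hl7_delimiters(message: str) -> dict:
    """Read the delimiters that an HL7 2.x message declares in its MSH segment."""
    if not message.startswith("MSH") or len(message) < 8:
        raise ValueError("not an HL7 2.x message: missing MSH header")
    field_sep = message[3]  # MSH-1: the field separator itself
    # MSH-2: the encoding characters, running up to the next field separator.
    encoding = message[4:message.index(field_sep, 4)]
    names = ["component", "repetition", "escape", "subcomponent"]
    delims = {"field": field_sep}
    delims.update(zip(names, encoding))
    return delims


# Illustrative message only; a non-default field separator ('#') is used on purpose.
sample = "MSH#^~\\&#SENDER#RECEIVER\rPID#1##9877787"
delims = read_hl7_delimiters(sample)

# The rest of the message is split with whatever separator the header declared.
segments = [seg.split(delims["field"]) for seg in sample.split("\r")]
```

The point of the sketch is the ordering: the delimiters are read from the data first and then used to parse the remainder, which is the behaviour the 'Interpret value as' property is being recommended for.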
rekarm01 |
Posted: Thu Dec 06, 2012 3:16 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
kimbert wrote: |
1. Set the HL7-specific delimiters ... |
The sample messages are delimited, but they aren't HL7 messages. |
kimbert |
Posted: Thu Dec 06, 2012 3:32 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
I realize that the OP's message format is not the usual HL7 2.x format. However, it does look as if it uses a similar number of varying delimiters ( not too surprising, really ). The TDS facilities that make dynamic delimiters possible for HL7 2.x will also work for the OP's format. |
deepak_paul |
Posted: Thu Dec 06, 2012 5:22 am Post subject: |
|
|
Centurion
Joined: 04 Oct 2008 Posts: 147 Location: US
|
kash3338 wrote: |
One way to do this: read the message as a BLOB on your input node, first determine which message type it is, and then parse the message against the appropriate message set in ESQL.
What is your input node? File or MQ? |
I can dynamically set the message domain/set/type up front and use a FileRead node mid-flow to read and parse the file content. But the problems are:
1. These variations are numerous, not just three. There are 3000 members who each send data in their own fashion.
2. Most importantly, these metadata rules will change in future. If member n starts sending data with ':' as the delimiter instead of '|', we will change the rules in ILOG/WODM for that member, and our code should be able to retrieve the new delimiter and set it dynamically while parsing.
_________________
Regards
Paul |
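The requirement in this post, that a rule change should need no code change, can be illustrated with a parser that takes the rules as data. This is a minimal Python sketch, not a Broker feature: `MemberRules` is a hypothetical shape for what a rules engine lookup might return, and it assumes header lines are always newline-terminated, as they are in the three samples.

```python
from dataclasses import dataclass, replace


@dataclass
class MemberRules:
    """Hypothetical shape of one member's metadata as returned by the rules engine."""
    header_lines: int
    record_delim: str
    field_delim: str
    field_count: int
    eof_marker: str


def parse_body(text: str, rules: MemberRules) -> list:
    """Split a file into records and fields using only the supplied rules."""
    # Assumption: header lines end with a newline whatever the record delimiter is.
    body = text.split("\n", rules.header_lines)[rules.header_lines]
    body = body.rstrip("\r\n")
    if not body.endswith(rules.eof_marker):
        raise ValueError("end-of-file marker %r not found" % rules.eof_marker)
    body = body[: len(body) - len(rules.eof_marker)]
    records = []
    for raw in body.split(rules.record_delim):
        raw = raw.strip("\r\n")
        if not raw:
            continue  # skip the empty piece after a trailing record delimiter
        fields = raw.split(rules.field_delim)
        if len(fields) != rules.field_count:
            raise ValueError("expected %d fields, got %d: %r"
                             % (rules.field_count, len(fields), raw))
        records.append(fields)
    return records


# Member 3 today, and the same member after a hypothetical switch from ',' to ':'.
old_rules = MemberRules(1, "|", ",", 5, ">>>End")
new_rules = replace(old_rules, field_delim=":")

old_file = ("From member3\n"
            "9871234,09897722,MEM2,TX,678-712-1232|"
            "98722332,9876431,MEM2,NY,214-098-1346|>>>End")
new_file = old_file.replace(",", ":")

old_records = parse_body(old_file, old_rules)
new_records = parse_body(new_file, new_rules)
```

Both calls go through the same function; only the `MemberRules` value differs, which is the property being asked for.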
mqjeff |
Posted: Thu Dec 06, 2012 5:27 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Then you have excellent business justification to upgrade to Broker v8 and use DFDL. |
deepak_paul |
Posted: Thu Dec 06, 2012 6:11 am Post subject: |
|
|
Centurion
Joined: 04 Oct 2008 Posts: 147 Location: US
|
kimbert wrote: |
This would be easy if you were using DFDL. But that requires v8.
It is possible to model this using MRM with the TDS physical format.
1. Set the HL7-specific delimiters from the field values in the header using the 'Interpret value as' property on the TDS physical format. This property is only available when the object is a simple element of type xs:string.
2. Specify the dynamic ( variable ) delimiters using the 'mnemonics' listed here: http://publib.boulder.ibm.com/infocenter/wmbhelp/v8r0m0/topic/com.ibm.etools.mft.doc/ad09270_.htm
As always, it's best to start with a small example using one delimiter and increase the complexity gradually. |
It is not just about field delimiters. Please look at the other variations like record delimiter, end of file and especially Field Order. Not sure if we can accommodate all of these in DFDL.
_________________
Regards
Paul |
Vitor |
Posted: Thu Dec 06, 2012 6:47 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
deepak_paul wrote: |
Not sure if we can accommodate all of these in DFDL. |
Ooo....fighting talk!
My bet's on kimbert......
_________________
Honesty is the best policy.
Insanity is the best defence. |
kimbert |
Posted: Thu Dec 06, 2012 7:34 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
deepak_paul said:
Quote: |
It is not just about field delimiters. Please look at the other variations like record delimiter, end of file and especially Field Order. Not sure if we can accommodate all of these in DFDL. |
If we forget the 'Field Order' thing for a moment, then these statements are true:
- MRM / TDS can model this data format, even if there are several different delimiters/separators/terminators that can all vary independently from one message to another.
- DFDL can do anything that MRM can do. And quite a lot more besides.
The 'Field Order' field is a bit of a hacky way to design a data format. A tag would have been a lot easier to work with. A full implementation of DFDL can parse this format, but I cannot think of a solution that uses the current version of IBM DFDL. So what to do? Well, the message flow can help out here.
1. DFDL parses the 'FieldOrder' into an array of 5 strings
2. DFDL parses the body into an array of 5 strings
3. Message flow iterates over the 'FieldOrder' array and assigns the value to the appropriate field in OutputRoot ( or LocalEnvironment, or wherever you want your message tree ).
So, to summarise:
- both MRM and DFDL can deal with the varying delimiters
- In v8.0.0.1, neither MRM nor DFDL can make use of the 'FieldOrder' field
- ...future versions of IBM DFDL will almost certainly implement the missing features of DFDL, and allow FieldOrder to be handled as well. |
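Step 3 above, the part left to the message flow, can be sketched as follows. In a real flow this would be written in ESQL or a JavaCompute node; Python is used here only so the logic can be run on its own, and `apply_field_order` is a hypothetical helper name. The values come from the member 2 sample in the first post.

```python
def apply_field_order(field_order: list, values: list) -> dict:
    """Pair each positional value with the name at the same position in FieldOrder."""
    if len(field_order) != len(values):
        raise ValueError("FieldOrder has %d names but the record has %d values"
                         % (len(field_order), len(values)))
    return dict(zip(field_order, values))


# Steps 1 and 2 are assumed to have produced these two arrays of 5 strings.
field_order = "PatientMRN MemberName encounterNumber Phone Location".split()
record = "9871234|MEM2|09897722|678-712-1232|TX".split("|")

# Step 3: assign each value to the field that FieldOrder names for that position.
named = apply_field_order(field_order, record)
```

After this step a consumer can read `named["Location"]` without knowing which position the member chose for it.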
deepak_paul |
Posted: Sun Dec 09, 2012 7:02 pm Post subject: |
|
|
Centurion
Joined: 04 Oct 2008 Posts: 147 Location: US
|
kimbert wrote: |
If we forget the 'Field Order' thing for a moment, then these statements are true:
- MRM / TDS can model this data format, even if there are several different delimiters/separators/terminators that can all vary independently from one message to another... |
Thanks for the explanation, Kimbert. Can you please help me understand the statement above: how can we achieve this without creating every possible message definition in the message set, given that there are a lot more than 3000 combinations?
_________________
Regards
Paul |
kimbert |
Posted: Mon Dec 10, 2012 1:35 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
As stated in my earlier post, the dynamic delimiters can be handled as follows:
1. Set the HL7-specific delimiters from the field values in the header using the 'Interpret value as' property on the TDS physical format. This property is only available when the object is a simple element of type xs:string.
2. Specify the dynamic ( variable ) delimiters using the 'mnemonics' listed here: http://publib.boulder.ibm.com/infocenter/wmbhelp/v8r0m0/topic/com.ibm.etools.mft.doc/ad09270_.htm
Not sure what else I can say, without further input/questions from you! |