General parsing of edifact interchanges
jonasb
Posted: Thu Jun 11, 2009 1:10 am Post subject: General parsing of edifact interchanges
Apprentice
Joined: 20 Dec 2006 Posts: 49 Location: Sweden
Hi
When dealing with EDIFACT messages, one file (an interchange) can contain several messages. Our idea is to have general MessageSets that can parse any EDIFACT interchange; we can then split the messages and process them one by one.
A typical interchange can look like this (simplified; this one contains two messages):
UNA:+.? '
UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'
UNH+MSGREF+1:2+68+1:2'
BGM+MSGREF+1:2+68+1:2'
UNT+3+MSGREF'
UNH+MSGREF2+1:2+68+1:2'
BGM+MSGREF2+1:2+68+1:2'
UNT+3+MSGREF2'
UNZ+1+SOMEREF'
Each line is called a segment. Every segment starts with a three-letter identifier (e.g. UNA) and ends with a "'".
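For readers who want to experiment outside the broker, here is a minimal Python sketch of the segment rule just described. It assumes the default service characters announced by the UNA line in the sample ("'" as segment terminator, "?" as release character); the function name `split_segments` is made up for illustration and is not part of any product.

```python
def split_segments(interchange: str, terminator: str = "'", release: str = "?") -> list[str]:
    """Split raw interchange text into segments, keeping each terminator."""
    segments = []
    current = []
    escaped = False
    for ch in interchange:
        if ch in "\r\n" and not current:
            continue  # ignore line breaks between segments
        current.append(ch)
        if escaped:
            escaped = False          # this character was released; take it literally
        elif ch == release:
            escaped = True           # the next character loses its special meaning
        elif ch == terminator:
            segments.append("".join(current))
            current = []
    return segments


sample = (
    "UNA:+.? '\n"
    "UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'\n"
    "UNH+MSGREF+1:2+68+1:2'\n"
    "BGM+MSGREF+1:2+68+1:2'\n"
    "UNT+3+MSGREF'\n"
    "UNH+MSGREF2+1:2+68+1:2'\n"
    "BGM+MSGREF2+1:2+68+1:2'\n"
    "UNT+3+MSGREF2'\n"
    "UNZ+1+SOMEREF'\n"
)

segments = split_segments(sample)
tags = [segment[:3] for segment in segments]  # the three-letter identifiers
```

With the sample above, `tags` lists the identifiers in file order, starting with UNA and ending with UNZ.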
We would like to be able to parse this type of interchange into a structure like this one:
INTERCHANGE
UNA(0,1)
UNB(1,1)
MESSAGE(0,-1)
UNH(1,1)
"ANY"(0,-1) <ANY SEGMENT EXCEPT UNT/UNZ>(0,-1)
UNT(1,1)
UNZ(1,1)
We have not been able to find anything that would match all types of segments except UNT/UNZ. In our attempts, the "ANY" segment also consumes UNT, UNZ and UNH, resulting in something like this:
INTERCHANGE
UNA = UNA:+.? '
UNB = UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'
MESSAGE
UNH = UNH+MSGREF+1:2+68+1:2'
"ANY" = BGM+MSGREF+1:2+68+1:2'
"ANY" = UNT+3+MSGREF'
"ANY" = UNH+MSGREF2+1:2+68+1:2'
"ANY" = BGM+MSGREF2+1:2+68+1:2'
"ANY" = UNT+3+MSGREF2'
"ANY" = UNZ+1+SOMEREF'
One way of doing it is of course to name all 26^3 possible three-letter tags (excluding UNT and UNZ), but that does not seem like a good solution...
If anyone has any pointers they would be much appreciated.
Kind Regards,

kimbert
Posted: Thu Jun 11, 2009 2:45 am
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
Quote:
We have not been able to find anything that would match all types of segments except UNT/UNZ. In our attempts, the "ANY" segment also consumes UNT, UNZ and UNH, resulting in something like this:
I think data patterns are the only solution here:
- Wrap the 'ANY' in a sequence with Data Element Separation set to 'Use Data Pattern'.
- On the UNA segment, set the data pattern to UNA[^']*'
This means U then N then A, then any number of [any character except '], then '.
- Add a similar data pattern to every other segment.
- On the ANY, set the data pattern to
Code:
([^U]|U[^N]|UN[^T])[^']*'
This means (not U, or U then not N, or UN then not T), then any number of [any character except '], then '.
You could continue to use Tagged Delimited in the rest of the message definition, but it might be easiest to change the entire message definition to use data patterns.
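The two patterns can be tried out quickly in Python. This is only a rough check: Python's `re` module stands in for the broker's pattern matcher, and treating the two as equivalent for these simple patterns is an assumption. The `classify` helper is hypothetical.

```python
import re

# The pattern suggested for "ANY": every segment whose tag is not UNT.
ANY_BUT_UNT = re.compile(r"([^U]|U[^N]|UN[^T])[^']*'")
# A "similar data pattern" for the UNT segment itself.
UNT = re.compile(r"UNT[^']*'")


def classify(segment):
    """Return which of the two patterns matches a whole segment, if any."""
    if UNT.fullmatch(segment):
        return "UNT"
    if ANY_BUT_UNT.fullmatch(segment):
        return "ANY"
    return None


body_segment = classify("BGM+MSGREF+1:2+68+1:2'")
trailer_segment = classify("UNT+3+MSGREF'")
header_segment = classify("UNH+MSGREF+1:2+68+1:2'")
```

Note that UNH and UNZ still match the ANY pattern. That is acceptable when ANY only repeats inside a message, between UNH and UNT: the repetition ends at the first UNT, so the following UNH or UNZ is never offered to ANY.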

jonasb
Posted: Thu Jun 11, 2009 3:14 am
Apprentice
Joined: 20 Dec 2006 Posts: 49 Location: Sweden
Some bad formatting in my last post, this will make it more clear:
INTERCHANGE
....UNA(0,1)
....UNB(1,1)
....MESSAGE(0,-1)
........UNH(1,1)
........"ANY"(0,-1) <ANY SEGMENT EXCEPT UNT/UNZ>(0,-1)
........UNT(1,1)
....UNZ(1,1)

kimbert
Posted: Thu Jun 11, 2009 3:24 am
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
Quote:
Some bad formatting in my last post, this will make it more clear:
You should use [code] tags - easier for you, and even more readable for us.
Your update doesn't really alter the problem. The general approach which I suggested should still apply.

jonasb
Posted: Thu Jun 11, 2009 3:36 am
Apprentice
Joined: 20 Dec 2006 Posts: 49 Location: Sweden
Just to close this thread: Kimbert, thanks for your input, it works just fine now!
And you're absolutely right, I should use code tags. My only excuse is that my title is "novice" :-)
This is what I did:
Code:
INTERCHANGE
    UNA(0,1)
    UNB(1,1)
    MESSAGE(0,-1)
        UNH(1,1)
        MESSAGEBODY(0,1) [Data Element Separation = Use Data Pattern]
            "ANY"(1,-1) [Data Pattern = ([^U]|U[^N]|UN[^T])[^']*']
        UNT(1,1)
    UNZ(1,1)
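To show how this structure groups the sample interchange, here is a rough Python sketch that walks a list of segments the same way. It is not how the broker parses internally; the dictionary layout and the names `parse_interchange` and `at` are made up for illustration, and Python's `re` is again assumed to behave like the data pattern.

```python
import re

# Data pattern from the "ANY" element: any segment whose tag is not UNT.
ANY_BUT_UNT = re.compile(r"([^U]|U[^N]|UN[^T])[^']*'")


def parse_interchange(segments):
    """Group segments as INTERCHANGE > MESSAGE > (UNH, body, UNT)."""

    def at(index):
        # Return "" past the end so that truncated input fails cleanly.
        return segments[index] if index < len(segments) else ""

    pos = 0
    result = {"UNA": None, "UNB": None, "MESSAGES": [], "UNZ": None}

    if at(pos).startswith("UNA"):            # UNA(0,1) is optional
        result["UNA"] = at(pos)
        pos += 1
    if not at(pos).startswith("UNB"):        # UNB(1,1) is mandatory
        raise ValueError("expected UNB")
    result["UNB"] = at(pos)
    pos += 1

    while at(pos).startswith("UNH"):         # MESSAGE(0,-1)
        message = {"UNH": at(pos), "BODY": [], "UNT": None}
        pos += 1
        # "ANY" repeats while the data pattern matches, so it stops at UNT.
        while ANY_BUT_UNT.fullmatch(at(pos)):
            message["BODY"].append(at(pos))
            pos += 1
        if not at(pos).startswith("UNT"):    # UNT(1,1) closes the message
            raise ValueError("expected UNT")
        message["UNT"] = at(pos)
        pos += 1
        result["MESSAGES"].append(message)

    if not at(pos).startswith("UNZ"):        # UNZ(1,1) closes the interchange
        raise ValueError("expected UNZ")
    result["UNZ"] = at(pos)
    return result


sample_segments = [
    "UNA:+.? '",
    "UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'",
    "UNH+MSGREF+1:2+68+1:2'",
    "BGM+MSGREF+1:2+68+1:2'",
    "UNT+3+MSGREF'",
    "UNH+MSGREF2+1:2+68+1:2'",
    "BGM+MSGREF2+1:2+68+1:2'",
    "UNT+3+MSGREF2'",
    "UNZ+1+SOMEREF'",
]

parsed = parse_interchange(sample_segments)
```

For the sample, `parsed["MESSAGES"]` holds two messages, each with a single BGM segment in its body, which is the split the original question was after.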