MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » General parsing of edifact interchanges

General parsing of edifact interchanges
jonasb
Posted: Thu Jun 11, 2009 1:10 am    Post subject: General parsing of edifact interchanges

Apprentice

Joined: 20 Dec 2006
Posts: 49
Location: Sweden

Hi

When dealing with EDIFACT messages, one file (an interchange) can contain
several messages. Our idea is to have general message sets that can parse any EDIFACT interchange; we can then split out the messages and process them one by one.

A typical interchange can look like this (simplified; this one contains two messages):

UNA:+.? '
UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'
UNH+MSGREF+1:2+68+1:2'
BGM+MSGREF+1:2+68+1:2'
UNT+3+MSGREF'
UNH+MSGREF2+1:2+68+1:2'
BGM+MSGREF2+1:2+68+1:2'
UNT+3+MSGREF2'
UNZ+1+SOMEREF'

Each line is called a segment. A segment starts with a three-letter identifier (e.g. UNA) and ends with a "'".
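
For anyone who wants to experiment outside the broker, the segment rule above can be sketched in Python. This is only an illustration and not part of any message set: the names `split_segments` and `tag` are made up here, and the sketch assumes the default service characters shown in the UNA line (`'` as segment terminator, `?` as release character).

```python
import re

# Assumption: default service characters from the UNA line in the example,
# i.e. "'" ends a segment and "?" escapes the character that follows it.
SEGMENT = re.compile(r"(?:[^?']|\?.)*'", re.DOTALL)


def split_segments(interchange: str) -> list[str]:
    """Return the segments of an interchange, each including its terminator."""
    # Line breaks between segments are treated as layout only and dropped.
    flat = interchange.replace("\r", "").replace("\n", "")
    return SEGMENT.findall(flat)


def tag(segment: str) -> str:
    """Return the three-letter identifier at the start of a segment."""
    return segment[:3]


sample = (
    "UNA:+.? '\n"
    "UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'\n"
    "UNH+MSGREF+1:2+68+1:2'\n"
    "BGM+MSGREF+1:2+68+1:2'\n"
    "UNT+3+MSGREF'\n"
    "UNH+MSGREF2+1:2+68+1:2'\n"
    "BGM+MSGREF2+1:2+68+1:2'\n"
    "UNT+3+MSGREF2'\n"
    "UNZ+1+SOMEREF'\n"
)

print([tag(s) for s in split_segments(sample)])
```

Note that UNA is really a fixed-length segment. It splits correctly here only because the "? " inside it is read as an escaped space, so a stricter implementation would probably handle UNA separately.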

We would like to be able to parse this type of interchange into a structure like this one:

INTERCHANGE
UNA(0,1)
UNB(1,1)
MESSAGE(0,-1)
UNH(1,1)
"ANY"(0,-1) <ANY SEGMENT EXCEPT UNT/UNZ>(0,-1)
UNT(1,1)
UNZ(1,1)

We have not been able to find anything that would match all types of
segments except UNT/UNZ. In our attempts, the "ANY" segment also consumes UNT, UNZ and UNH, resulting in something like this:

INTERCHANGE
UNA = UNA:+.? '
UNB = UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'
MESSAGE
UNH = UNH+MSGREF+1:2+68+1:2'
"ANY" = BGM+MSGREF+1:2+68+1:2'
"ANY" = UNT+3+MSGREF'
"ANY" = UNH+MSGREF2+1:2+68+1:2'
"ANY" = BGM+MSGREF2+1:2+68+1:2'
"ANY" = UNT+3+MSGREF2'
"ANY" = UNZ+1+SOMEREF'

One way of doing it is of course to name all 26^3 possible three-letter tags (excluding UNT and UNZ), but that does not seem like a good solution.

If anyone has any pointers they would be much appreciated.

Kind Regards,
kimbert
Posted: Thu Jun 11, 2009 2:45 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
We have not been able to find anything that would match all types of
segments except UNT/UNZ. In our attempts, the "ANY" segment also consumes UNT, UNZ and UNH, resulting in something like this:

I think data patterns are the only solution here:
- Wrap the 'ANY' in a sequence with Data Element Separation set to 'Use Data Pattern'.
- On the UNA segment, set the data pattern to
Code:
UNA[^']*'

This means U, then N, then A, then any number of [any character except '], then '.
- Add a similar data pattern to every other segment.
- On the ANY, set the data pattern to
Code:
([^U]|U[^N]|UN[^T])[^']*'

This means (not U, or U then not N, or UN then not T), then any number of [any character except '], then '. In other words, it matches any segment that does not start with UNT.

You could continue to use Tagged Delimited in the rest of the message definition, but it might be easiest to change the entire message definition to use data patterns.
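
The behaviour of the ANY pattern can be checked quickly outside the broker. The sketch below uses Python's `re` module, on the assumption that its syntax agrees with the broker's data pattern syntax for this particular expression. The helper `is_body_segment` and the second pattern are hypothetical additions for illustration only. The second pattern steps over an apostrophe escaped with the release character (`?'`), which the simple `[^']*'` form does not handle.

```python
import re

# The data pattern suggested above, unchanged.
ANY_BUT_UNT = re.compile(r"([^U]|U[^N]|UN[^T])[^']*'")

# Hypothetical variant (an assumption, not from the thread): treat "?" as the
# release character so that an escaped "?'" does not end the segment.
ANY_BUT_UNT_ESCAPED = re.compile(r"([^U]|U[^N]|UN[^T])([^?']|\?.)*'")


def is_body_segment(segment: str, pattern=ANY_BUT_UNT) -> bool:
    """True if the whole segment matches the given pattern."""
    return pattern.fullmatch(segment) is not None


for seg in [
    "BGM+MSGREF+1:2+68+1:2'",
    "UNT+3+MSGREF'",
    "UNH+MSGREF2+1:2+68+1:2'",
    "UNZ+1+SOMEREF'",
]:
    print(seg[:3], is_body_segment(seg))
```

Only UNT is rejected; UNH and UNZ still match. That appears harmless in the structure discussed here, because a UNT always closes the message before the next UNH or the final UNZ can be reached.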
jonasb
Posted: Thu Jun 11, 2009 3:14 am

Apprentice

Joined: 20 Dec 2006
Posts: 49
Location: Sweden

There was some bad formatting in my last post; this should make it clearer:

INTERCHANGE
....UNA(0,1)
....UNB(1,1)
....MESSAGE(0,-1)
........UNH(1,1)
........"ANY"(0,-1) <ANY SEGMENT EXCEPT UNT/UNZ>(0,-1)
........UNT(1,1)
....UNZ(1,1)
kimbert
Posted: Thu Jun 11, 2009 3:24 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
There was some bad formatting in my last post; this should make it clearer:

You should use [code] tags - easier for you, and even more readable for us.

Your update doesn't really alter the problem. The general approach which I suggested should still apply.
jonasb
Posted: Thu Jun 11, 2009 3:36 am

Apprentice

Joined: 20 Dec 2006
Posts: 49
Location: Sweden

Just to close this thread. Kimbert, thanks for your input, it works just fine now!!

And you're absolutely right, I should use code tags. My only excuse is that my title is "novice" :-)

This is what I did:

Code:

INTERCHANGE
    UNA(0,1)
    UNB(1,1)
    MESSAGE(0,-1)
        UNH(1,1)
        MESSAGEBODY(0,1) [Data Element Separation = Use Data Pattern]
            "ANY"(1,-1)          [Data Pattern = ([^U]|U[^N]|UN[^T])[^']*']
        UNT(1,1)
    UNZ(1,1)
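
To see what this structure produces for the sample interchange, here is a rough Python emulation. It is not how the broker's parser works internally. The function `parse_interchange` and the dictionary layout are invented for illustration, and the `?` release character is ignored to keep the sketch short.

```python
import re

SEGMENT = re.compile(r"[^']*'")  # simplified: ignores the "?" release character
BODY = re.compile(r"([^U]|U[^N]|UN[^T])[^']*'")  # the "ANY" data pattern


def parse_interchange(text: str) -> dict:
    """Group the segments of one interchange into the structure shown above."""
    segs = SEGMENT.findall(text.replace("\r", "").replace("\n", ""))
    pos = 0

    def take(tag_name, required=True):
        """Consume the next segment if it starts with tag_name."""
        nonlocal pos
        if pos < len(segs) and segs[pos].startswith(tag_name):
            pos += 1
            return segs[pos - 1]
        if required:
            found = segs[pos][:3] if pos < len(segs) else "end of input"
            raise ValueError(f"expected {tag_name}, found {found}")
        return None

    result = {}
    result["UNA"] = take("UNA", required=False)  # UNA(0,1)
    result["UNB"] = take("UNB")                  # UNB(1,1)
    result["MESSAGE"] = []                       # MESSAGE(0,-1)
    while pos < len(segs) and segs[pos].startswith("UNH"):
        message = {"UNH": take("UNH"), "MESSAGEBODY": []}
        # "ANY" repeats until a segment fails the data pattern, i.e. at UNT.
        while pos < len(segs) and BODY.fullmatch(segs[pos]):
            message["MESSAGEBODY"].append(segs[pos])
            pos += 1
        message["UNT"] = take("UNT")             # UNT(1,1)
        result["MESSAGE"].append(message)
    result["UNZ"] = take("UNZ")                  # UNZ(1,1)
    return result


sample = (
    "UNA:+.? '\n"
    "UNB+UNOA:1+FROM+TO+090610:0920+SOMEREF'\n"
    "UNH+MSGREF+1:2+68+1:2'\n"
    "BGM+MSGREF+1:2+68+1:2'\n"
    "UNT+3+MSGREF'\n"
    "UNH+MSGREF2+1:2+68+1:2'\n"
    "BGM+MSGREF2+1:2+68+1:2'\n"
    "UNT+3+MSGREF2'\n"
    "UNZ+1+SOMEREF'\n"
)

for message in parse_interchange(sample)["MESSAGE"]:
    print(message["UNH"], message["MESSAGEBODY"], message["UNT"])
```

Each message now carries its own UNH, body segments and UNT, so the messages can be split out and processed one by one.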

