MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Data pattern problem in message set

Data pattern problem in message set
er_pankajgupta84
Posted: Tue Nov 17, 2009 2:12 pm    Post subject: Data pattern problem in message set

Master

Joined: 14 Nov 2008
Posts: 203
Location: charlotte,NC, USA

What would be the equivalent of the following pattern for a message set data pattern?

0001.*?<>

If the input string is 0001jjkfjw<>0001nhwfeff<>0001kfnkwenf<>, then after parsing the records should be:

0001jjkfjw<>
0001nhwfeff<>
0001kfnkwenf<>

I tried entering the regular expression above as a data pattern in the message set, but it is rejected as an invalid pattern. If I try 0001.*<> instead, parsing yields just one record: the entire string.
The parser matches "*" greedily, so it takes the entire string as one record.

Any pointers are appreciated.
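
For readers who want to reproduce the greedy versus non-greedy behaviour described above, here is a minimal sketch. It uses Python's `re` module purely for illustration, which is an assumption: the message set parser has its own regex engine, and the `*?` form is exactly what it rejects.

```python
import re

data = "0001jjkfjw<>0001nhwfeff<>0001kfnkwenf<>"

# Greedy: ".*" runs to the end of the input and backs off only as far
# as the last "<>", so the whole string comes back as one record.
greedy = re.findall(r"0001.*<>", data)

# Non-greedy: ".*?" stops at the first "<>" after each "0001".
lazy = re.findall(r"0001.*?<>", data)

print(greedy)
print(lazy)
```

The greedy form returns a single match covering the whole input, while the non-greedy form returns the three records listed in the post.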

kimbert
Posted: Tue Nov 17, 2009 2:26 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Two important points here:
1. It's not just Message Broker that interprets regular expressions in this way. Any regex parser will do the same. Message Broker's is actually based on the Xerces regex engine.
2. I think you might be asking the wrong question anyway. You are asking us to make your solution work. You might get a better result if you describe the message format and how you want it to be parsed. It is quite possible that 'Use Data Pattern' is a really roundabout way to solve your problem.

er_pankajgupta84
Posted: Tue Nov 17, 2009 3:25 pm

Thanks for your reply.

My actual problem is quite complex, and I doubt I can explain all of it in words.

I agree that how a regular expression is parsed depends on the regex engine, but it is not working as expected in a message set.

I need a regular expression that accepts any character up to a given delimiter. In this case the delimiter is "<>". I was able to create a regular expression for that, and it works in Java but not in a message set.

I will try to explain my problem in short:

I have a message with 2 complex-type records, each having 2 fields. These records can occur any number of times.
1. The first field is fixed length, say 5 characters.
2. The second field is variable length.

So I have to use data patterns to decide the type of record, and "variable length delimited" data element separation at the record level to get the fields.

input is: 00001nlkvkjehnvjhnejvnnnsdk<>00001vbdhcv<>00002bccjkdcj<>00002jkjd<>

It has 2 occurrences of each type of record (00001, 00002).
I have set Data Element Separation to "Use data pattern" at the message level and "variable length delimited" at the record level.

I cannot use tagged delimited because the tag (00001 or 00002) is not fixed; it could be 11111 and 11112. The tag ends with 1 or 2 but can have any character in the first 4 places.

So the regular expression that I mentioned earlier will actually look like:

[0-9]{4}1.*?<>

Unfortunately, it works in Java but not in the message set.

er_pankajgupta84
Posted: Tue Nov 17, 2009 4:13 pm

I think the problem is the "?" that we add to make an expression non-greedy. The message set does not recognize this character and raises an "Invalid pattern" exception.

kimbert
Posted: Tue Nov 17, 2009 6:02 pm

Thanks - that's clear enough now.

The . (period) is too greedy because it matches *any* character, including the ones which are supposed to terminate the match.

You need to say
'not <, or < not followed by >'*

The reg ex that you need is
Code:
[0-9]{4}1([^<]|(<[^>]))*<>

or, if you are certain that '<' cannot occur in the data, you could shorten it to
Code:
[0-9]{4}1[^<]*<>
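
As a rough way to try the first pattern outside the broker, the sketch below applies one pattern per record type at the current offset, which loosely imitates how "Use data pattern" picks a record. It is written in Python as an assumption for illustration; `RECORD_PATTERNS` and `split_records` are hypothetical names, and the type 2 pattern is simply the posted one with the fifth digit changed.

```python
import re

# One pattern per record type: four digits, the type digit, then any run of
# "not <" or "< not followed by >", ending at the "<>" delimiter.
RECORD_PATTERNS = {
    "type1": re.compile(r"[0-9]{4}1([^<]|(<[^>]))*<>"),
    "type2": re.compile(r"[0-9]{4}2([^<]|(<[^>]))*<>"),
}


def split_records(data):
    """Consume records from left to right, trying each pattern at the current offset."""
    pos = 0
    records = []
    while pos < len(data):
        for name, pattern in RECORD_PATTERNS.items():
            match = pattern.match(data, pos)
            if match:
                records.append((name, match.group(0)))
                pos = match.end()
                break
        else:
            raise ValueError(f"no record pattern matches at offset {pos}")
    return records


sample = "00001nlkvkjehnvjhnejvnnnsdk<>00001vbdhcv<>00002bccjkdcj<>00002jkjd<>"
for name, record in split_records(sample):
    print(name, record)
```

Run on the sample input from the earlier post, this yields two type 1 records followed by two type 2 records.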

er_pankajgupta84
Posted: Tue Nov 17, 2009 6:33 pm

Thanks.

You got the problem right. The data may contain the delimiter characters, so I cannot use the second pattern you suggested.

[0-9]{4}1([^<]|(<[^>]))*<> might work well, but what if the delimiter is <||>?

I specified only part of my problem earlier.

The actual delimiter is <||>, and strings like <|, ||, |>, |||, <||, ||> may appear in the data.

I have tried the following expression:


[0-9]{4}1.[^(<\|\|>)]*<\|\|>
but it didn't work.

I would appreciate any pointers.

kimbert
Posted: Wed Nov 18, 2009 12:52 am

Welcome to the wonderful world of regular expressions! You need an even more complex regex which looks like this:
Code:

[0-9]{4}1[^<]|(<[^|])|(<|[^|])|(<||[^>])*<||>


The important bit is this
Code:
[^<]|(<[^|])|(<|[^|])|(<||[^>])
which matches:
- not < OR
- < not followed by | OR
- <| not followed by | OR
- <|| not followed by >

er_pankajgupta84
Posted: Wed Nov 18, 2009 6:18 am

I tried this, but it does not work for all inputs.

For example, if the input contains parts of the delimiter in random order, the records are not identified correctly.

Input:

00001hcjdhfnsd<|| > <<< \> || ||>nhjkh <||>00002hcjdhfnsd<|| > <<< \> || ||>nhjkh <||>

It fails for this kind of input.

kimbert
Posted: Wed Nov 18, 2009 7:40 am

Quote:
its failing for this kind of input.

Not surprising really - I haven't tested this. I was assuming that you would get the idea and work out a solution.
I suspect that you need to escape the literal | (pipe) characters in the regex:
Code:
[0-9]{4}1[^<]|(<[^\|])|(<\|[^\|])|(<\|\|[^>])*<\|\|>
If that doesn't work, you'll have to do some testing / research to work out what's going wrong.

er_pankajgupta84
Posted: Wed Nov 18, 2009 7:47 am

I have already taken care of escaping.
While testing in Java I used \\ to escape |, and in the message set I used \ to escape |.
<0x3E>||<<0x3F> is the delimiter I have used in my message set.

er_pankajgupta84
Posted: Wed Nov 18, 2009 8:34 am

We need to find something like this:

Read until you encounter a <||>, i.e. [^(<||>)], but unfortunately this does not behave as a group. Characters inside () normally behave as a group, but inside a negated class [^...] the parentheses and everything between them are treated as individual characters.

I am not able to form any regular expression that reads up to <||> for a message set.
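
The observation about negated classes can be checked with a small sketch. Python's `re` is used here as an assumption for illustration, and `accepted` is a hypothetical helper; the point is only that `(` and `)` inside `[^...]` are ordinary characters, so the class excludes each listed character individually rather than the sequence <||>.

```python
import re

# Inside [^...] the parentheses are literal characters, not a group, so this
# class means "any single character except ( < | > )".
negated_class = re.compile(r"[^(<\|\|>)]*")


def accepted(text):
    """Return True if every character of text is allowed by the negated class."""
    return negated_class.fullmatch(text) is not None


print(accepted("abc"))   # no excluded characters
print(accepted("a|b"))   # a lone pipe is already excluded
print(accepted("a(b"))   # so is a literal parenthesis
```

A lone `|`, `<`, `>`, `(` or `)` in the data is enough to stop the match, which is why this form cannot express "anything up to <||>".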

er_pankajgupta84
Posted: Wed Nov 18, 2009 2:04 pm    Post subject: Solved

Finally, after some R&D, it got solved.

This expression works for me:

[0-9]{4}1([^<]|(<[^\|])|(<\|[^\|])|(<\|\|[^>]))*<\|\|>

It just needed the () around all the negated expressions.

Thanks, kimbert, for all of your pointers. I really appreciate your help.
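
The working expression can be exercised against the input that failed earlier in the thread. This is a minimal sketch in Python, an assumption for illustration rather than the broker's engine; `record_pattern` is a hypothetical helper that only swaps the record-type digit.

```python
import re


def record_pattern(type_digit):
    """Build the thread's final pattern for a given record-type digit."""
    # The outer () wraps all four alternatives so that * repeats the whole
    # alternation, not just the last branch.
    body = r"([^<]|(<[^\|])|(<\|[^\|])|(<\|\|[^>]))*"
    return re.compile(r"[0-9]{4}" + type_digit + body + r"<\|\|>")


data = (
    r"00001hcjdhfnsd<|| > <<< \> || ||>nhjkh <||>"
    r"00002hcjdhfnsd<|| > <<< \> || ||>nhjkh <||>"
)

first = record_pattern("1").match(data)
second = record_pattern("2").match(data, first.end())

print(first.group(0))
print(second.group(0))
```

In this sketch, each record ends at its own <||>. One case that may be worth testing separately is a `<` immediately before the delimiter, for example `00001a<<||>`: the branch "< not followed by |" consumes both `<` characters, so the delimiter that follows is not recognised and this Python version finds no match.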

er_pankajgupta84
Posted: Wed Mar 03, 2010 3:44 pm

This regular expression sometimes brings down the execution group.

Quote:
[0-9]{4}1([^<]|(<[^\|])|(<\|[^\|])|(<\|\|[^>]))*<\|\|>


This was the regular expression I used to parse records in the message set.

Now,

records like 12341nf;jkhe;jkrgf;rejkg;jekrgj;kerhfvhfbvlhbv<||>
get through, but when I increase the size of the record, the execution group abends. For example: 12341nf;jkhe;jkrgf;rejkg;jekrgj;kerhfvhfbvlhbvhrjherbghbhljer.....2000 more bytes here<||>

No reason is given in the user trace or the service trace.

I replicated the scenario in a Java node as well. The pattern blows up the JVM in Java itself if I try to parse a longer input string.

Can anyone validate this regular expression (data pattern)?
Basically, I am trying to read as many characters as possible until I encounter <||>.
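
To make the stated goal concrete ("read until <||>"), here is a minimal sketch that finds the delimiter with a plain substring search instead of a regular expression. It is Python written for illustration only, not something the message set can run, and `split_on_delimiter` is a hypothetical name. Because it scans forward once and never backtracks, record length does not affect how it behaves.

```python
def split_on_delimiter(data, delimiter="<||>"):
    """Split data into records, each ending with (and including) the delimiter."""
    records = []
    pos = 0
    while pos < len(data):
        end = data.find(delimiter, pos)
        if end == -1:
            raise ValueError(f"no delimiter found after offset {pos}")
        end += len(delimiter)
        records.append(data[pos:end])
        pos = end
    return records


# A record with a few thousand bytes of payload, similar in spirit to the
# long record described in the post.
long_record = "12341" + "x" * 5000 + "<||>"
print(len(split_on_delimiter(long_record + long_record)))
```

This only illustrates the intended splitting; it does not identify record types the way the data patterns do.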

kimbert
Posted: Thu Mar 04, 2010 12:22 am

You have hit a known problem with regular expressions and long messages. It's possible that there is a fix for this available, so you *could* open a PMR and see what IBM says.

However...that does not solve today's problem. I'm fairly sure that you can model this format without using data patterns. The first 4 characters can be modelled as a fixed-length field. The rest of the line up to the <> can be modelled as a tagged delimited group.

If you want to try that, let me know and I'll post the details.

er_pankajgupta84
Posted: Thu Mar 04, 2010 6:49 am

Yes, if some other way is possible then I will definitely try it. Here are some more details that might help you guide me.

The message looks like this:

00083773(||)00000(||)jjkjdks(||)8737384(||)jfufuifuief[||]
00083774773(||)00100(||)jjjfjkfkjdks(||)873847384(||)ncnjfufuifuief[||]
00084574773(||)00200(||)jjjfjkfkjdks(||)873847384(||)ncnjfufuifuief[||]
...and so on.

Here [||] is the delimiter between records and (||) is the delimiter between fields. Every field except the 2nd field of each record is of variable length. The second field of each record identifies the record type, for example "00000", "00100", "00200", etc.

Please note that there is no carriage return within the message. I have added line breaks for readability.

Once we retrieve a record, we can retrieve its fields by using "All elements delimited" as the data element separation at record level. But the problem is in retrieving the records.

So I used "Use data pattern" as the DES (data element separation) at message level.

I used the following data patterns:

First record:
[ 0-9]+\(||\)00000([^\[]|(\[[^\|])|(\[\|[^\|])|(\[\|\|[^\]]))*\[\|\|\]

Second record:
[ 0-9]+\(||\)00100([^\[]|(\[[^\|])|(\[\|[^\|])|(\[\|\|[^\]]))*\[\|\|\]

and so on..

"\" is the escape character used for escaping [ and ( in regular expression.

Let me know if you think that this message can be modeled without using data patterns.
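
The record layout described above can be sketched without data patterns as two nested splits: records on [||], then fields on (||), with the second field taken as the record type. This is a Python illustration under stated assumptions, not a message set definition: `parse_message` is a hypothetical name, and the sketch assumes neither full delimiter ever occurs inside field data.

```python
RECORD_DELIM = "[||]"
FIELD_DELIM = "(||)"


def parse_message(message):
    """Split a message into records and fields; field 2 is the record type."""
    records = []
    for raw in message.split(RECORD_DELIM):
        if not raw:
            continue  # skip the empty piece after the final record delimiter
        fields = raw.split(FIELD_DELIM)
        if len(fields) < 2:
            raise ValueError(f"record has no type field: {raw!r}")
        records.append({"type": fields[1], "fields": fields})
    return records


# The three sample records from the post, joined without line breaks.
message = (
    "00083773(||)00000(||)jjkjdks(||)8737384(||)jfufuifuief[||]"
    "00083774773(||)00100(||)jjjfjkfkjdks(||)873847384(||)ncnjfufuifuief[||]"
    "00084574773(||)00200(||)jjjfjkfkjdks(||)873847384(||)ncnjfufuifuief[||]"
)

for record in parse_message(message):
    print(record["type"], record["fields"])
```

Partial delimiters such as `[|` or `||)` in the data do not affect the split, since only the complete four-character sequences are treated as delimiters.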