MQSeries.net :: View topic - Parse a line containing only <CR><LF> into a mes

NiceGuy

Posted: Tue Aug 24, 2010 8:11 pm Post subject: Parse a line containing only <CR><LF> into a mes

Apprentice

Joined: 11 Jun 2009
Posts: 37

Hi community,

Does anyone have an idea how I can get my message set to parse/accept lines containing only the newline (<CR><LF>) into a member of my message set?

For Example:

Code:

You'll note the newline characters are by themselves and have no other characters in their respective lines. This occurs at various places in my message input.

Currently I have created an element in my message set called NEWLINE
that has its "Data Element Separation" set to "Use Data Pattern"

The Data Pattern for the element I put simply as: [\n]

Of course the element repeats, that is, Min Occurs=1 .. Max Occurs=-1

Unfortunately this does not appear to work .. any ideas?

I'm a junior .. be gentle
Thanks community

fjb_saper

Posted: Tue Aug 24, 2010 8:16 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

Check out the TDS parser in the infocenter.
You might have to use "<CR><LF>" as a pattern instead of "\n"

Have fun
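
For illustration only, here is a minimal Python sketch of why a pattern for a single line feed may not line up with <CR><LF> data. It uses Python's re module, which is an assumption made for the example and is not the broker's TDS regular-expression engine; the helper name consumed is hypothetical.

```python
import re


def consumed(pattern: str, data: str) -> int:
    """Return how many characters the pattern matches at the start of data (0 if none)."""
    match = re.match(pattern, data)
    return match.end() if match else 0


blank_line = "\r\n"  # a line holding only <CR><LF>

# The data starts with <CR>, so a lone line-feed class cannot match at that position.
print(consumed(r"[\n]", blank_line))
# Spelling out both characters matches the whole blank line.
print(consumed(r"\r\n", blank_line))
# A repeated group matches a run of consecutive blank lines in one go.
print(consumed(r"(?:\r\n)+", blank_line * 3))
```

Whether the TDS parser behaves the same way at the point where it applies the pattern is something to confirm against the infocenter.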

_________________
MQ & Broker admin

mqjeff

Posted: Wed Aug 25, 2010 1:42 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

You should only use Data Patterns if you can't use other features of TDS to match the data; they are significantly slower.

Can you explain further what you are trying to see in your logical tree from your model? Are you trying to see empty but existing elements for each of these blank lines? Are you trying to see elements that contain the value "<CR><LF>"? Are you trying to NOT have any elements added to your tree for these blank lines?

kimbert

Posted: Wed Aug 25, 2010 3:48 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

Quote:

Currently I have created an element in my message set called NEWLINE
that has its "Data Element Separation" set to "Use Data Pattern"

The Data Pattern for the element I put simply as: [\n]

The Data Element Separation property applies to the children of the complex type/group. So if you have set the Data Pattern on the NEWLINE element it will be ignored unless Data Element Separation='Use Data Pattern' on its parent group/type.

Other than that, mqjeff is correct - we need to understand what your input looks like, what message tree structure you want to obtain, and why.

NiceGuy

Posted: Wed Aug 25, 2010 6:52 am Post subject:

Apprentice

Joined: 11 Jun 2009
Posts: 37

Thanks everyone this far for helping out,

OK, allow me to clarify further. Let me start by presenting a larger segment of my input message. Hopefully this helps explain further.

Code:

DETAIL_LINE 1(min occurs) -1(max occurs)
CRLF 1(min occurs) -1(max occurs)

Code:

DETAIL_LINE:
  -detailLine1
    -productNumber
    -open
    -shipped
    -order
    -tax
    -price
    -uom
    -extended
  -detailLine2
    -descriptionline1
  -detailLine3
    -descriptionLine2
  -detailLine4
    -emptyline
CRLF

Input Message:

Code:

05710610155 1 1 0 Y 35.74 CS 35.74<CR><LF>
FORK PLASTIC SILVER (600) <CR><LF>
REFLECTIONS <CR><LF>
<CR><LF>
<CR><LF>
<CR><LF>
<CR><LF>

*please note the last three <CR><LF> can vary in frequency, that is, 3,4,5 could appear in theory. So I show only three here for brevity.

First Line (detailLine1):
The first line, starting with 05710610155, represents the first line in the invoice detail. This generally parses fine; I've configured this segment as tagged delimited with a group terminator of <CR><LF>.

Second Line (detailLine2):
The second line also parses fine .. again I've configured this segment as tagged delimited with a group terminator of <CR><LF>.

Third Line (detailLine3):
Same as first two.

Fourth Line (detailLine4):
The fourth line is a long empty line of spaces followed by a <CR><LF>.
This line is set to Data Element Separation: Use Data Pattern. The only element inside EmptyLine has its Data Pattern: [ ]+. The parent detailLine4 has its Group Terminator: <CR><LF>

CRLF:
This is where the majority of my problems reside. In theory it could be the parsing transition from detailLine4 to CRLF, so my problem could reside in either of the two. I've tried two variations of this.

The first variation adds an element to CRLF that essentially tries to swallow the newline. The element (call it "newline") uses the Data Pattern: [\n]+. The parent CRLF has its Group Terminator left blank.

The second variation simply removes the "newline" element inside the CRLF parent and sets the CRLF Group Terminator to <CR><LF>. CRLF was then set to repeat in the message, that is, 1 (min occurs) and -1 (max occurs).

I apologize for the lengthy post .. never realized how difficult it is to explain something I.T related in writing.

Regardless, I hope I did a better job this time explaining my problem.

Thanks again for helping out.

lancelotlinc

Posted: Wed Aug 25, 2010 7:22 am Post subject:

Jedi Knight

Joined: 22 Mar 2010
Posts: 4941
Location: Bloomington, IL USA

If nothing else works well, you could drop into a Java Compute Node and parse the data in the JCN. On the MQInput node, if you choose this method, choose "BLOB" (ie. no parsing from one of Broker's default parsers).
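
To give a feel for how much logic the hand-parsing route implies, here is a rough sketch of the line handling only. It is written in plain Python rather than against the Java Compute Node API, and the function name split_invoice_lines and the ASCII encoding are assumptions made for the example.

```python
from typing import List


def split_invoice_lines(blob: bytes, encoding: str = "ascii") -> List[str]:
    """Split a raw message body on <CR><LF> and keep only lines that carry data."""
    text = blob.decode(encoding)      # assumed encoding; adjust to the real code page
    lines = text.split("\r\n")        # each <CR><LF> ends one line
    # Drop lines that are empty or contain only spaces; trim trailing padding on the rest.
    return [line.rstrip() for line in lines if line.strip()]


sample = (
    b"05710610155 1 1 0 Y 35.74 CS 35.74\r\n"
    b"FORK PLASTIC SILVER (600) \r\n"
    b"REFLECTIONS \r\n"
    b"      \r\n"
    b"\r\n\r\n\r\n"
)
for line in split_invoice_lines(sample):
    print(line)
```

Mapping the surviving lines onto named fields would still have to be written by hand with this approach.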
_________________
http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER

kimbert

Posted: Wed Aug 25, 2010 7:57 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

Thanks - much clearer now. I've got more questions, though.

a) will detailLine4 always consist entirely of spaces, with at least one space being present always?
b) Why do you get these trailing <CR><LF>s - do they represent empty records? If so, are you 100% certain that those records will always be empty?
c) Is 5 the absolute maximum number that you will ever get?

Once I know the answers to those questions, I'll have some suggestions.

kimbert

Posted: Wed Aug 25, 2010 7:58 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

Quote:

If nothing else works well, you could drop into a Java Compute Node and parse the data in the JCN

Sometimes that's the correct approach, but not this time. This is a simple enough format.

NiceGuy

Posted: Wed Aug 25, 2010 8:22 am Post subject:

Apprentice

Joined: 11 Jun 2009
Posts: 37

Thanks once again for your assistance,

Let's see if I can clarify ....

Quote:

1) will detailLine4 always consist entirely of spaces, with at least one space being present always?
2) Why do you get these trailing <CR><LF>s - do they represent empty records? If so, are you 100% certain that those records will always be empty?
3)Is 5 the absolute maximum number that you will ever get?

Answers:

A1) Yes, detailLine4 is a long line consisting entirely of spaces followed by a <CR><LF>. Though the number of spaces is not fixed, in theory, yes, at least one space is assumed/expected before a <CR><LF> is reached.

A2) To be honest .. the lines containing the <CR><LF> alone, like those following detailLine4, are meaningless/garbage. If I had to give my opinion, they are put there to act as a transition into the second part of the invoice message input (I have purposely left out the bottom portion in my posting since it's irrelevant at this point). They do not represent anything meaningful, I suppose, except to separate Invoice Page 1 from Invoice Page 2.

For example:

Code:

Invoice Top Portion
<CR><LF>
<CR><LF>
<CR><LF>
Invoice Bottom Portion

A3) No, in fact there is no definitive value for the number of <CR><LF> that could follow after all the INVOICE_DETAILS have been parsed. Expect the unexpected, so to speak.

I guess my goal is to simply allow my message set to expect/absorb these <CR><LF> before processing the bottom portion of the invoice.

I realize that someone might suggest a JavaComputeNode but, being only a junior and given my relative inexperience, I still feel pretty comfortable saying that that approach seems like more logic than is perhaps necessary.

Hoping that I clarified better, please forward any other questions
you desire .. at this point .. I am pretty much at a standstill.

Thanks again

kimbert

Posted: Thu Aug 26, 2010 1:34 am Post subject:

Jedi Council

Joined: 29 Jul 2003
Posts: 5543
Location: Southampton

Good answers. Here are my suggestions:

1. Remove all data patterns from your model. This is a fairly basic line-oriented format, so All Elements Delimited or Tagged Delimited can do the job, and will be faster.
2. Treat the <CR><LF>s between the top portion and bottom portion as markup, so that they don't appear as elements in the message tree. I've sketched out a possible model below. I haven't tested it, but I think something like this will work:

Code:

element name='Invoice'
  complexType DataElementSeparation='All Elements Delimited' Delimiter='<CR><LF>'
    element name='TopPortion' minOccurs='1' maxOccurs='1'
      complexType DataElementSeparation='All Elements Delimited' Delimiter='<CR><LF>' GroupTerminator='<CR><LF>'
        element name='detailLine1'
        element name='detailLine2'
        element name='detailLine3'
        element name='detailLine4'
    sequence GroupIndicator='<CR><LF>' minOccurs='0' maxOccurs='unbounded'
      <no members for this sequence group>
    element name='BottomPortion' minOccurs='1' maxOccurs='1'
      <content of bottom portion>
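
The tree that the model above is aiming for can be mimicked outside the broker. The following Python sketch is only an emulation under stated assumptions: it is not TDS, the function name parse_invoice is hypothetical, and the bottom portion is kept as one unparsed string because its layout is not given in this thread.

```python
from typing import Dict, Tuple

CRLF = "\r\n"
TOP_LINES = ["detailLine1", "detailLine2", "detailLine3", "detailLine4"]


def parse_invoice(text: str) -> Tuple[Dict[str, object], int]:
    """Return (tree, separators_skipped) for one invoice held as text."""
    top: Dict[str, str] = {}
    pos = 0
    for name in TOP_LINES:
        end = text.index(CRLF, pos)   # raises ValueError if a top line is missing
        top[name] = text[pos:end]
        pos = end + len(CRLF)
    # Treat any number of bare <CR><LF> lines as markup: consume them, add nothing to the tree.
    skipped = 0
    while text.startswith(CRLF, pos):
        pos += len(CRLF)
        skipped += 1
    tree: Dict[str, object] = {"TopPortion": top, "BottomPortion": text[pos:]}
    return tree, skipped


sample = (
    "05710610155 1 1 0 Y 35.74 CS 35.74\r\n"
    "FORK PLASTIC SILVER (600) \r\n"
    "REFLECTIONS \r\n"
    "   \r\n"
    "\r\n\r\n\r\n"
    "Invoice Bottom Portion"
)
tree, skipped = parse_invoice(sample)
print(tree["TopPortion"]["detailLine1"], "|", skipped, "|", tree["BottomPortion"])
```

The point of the emulation is that the blank separator lines are counted and discarded, so however many of them arrive, the resulting tree has the same shape.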
