Author |
Message
|
dkeister |
Posted: Wed Apr 14, 2004 12:13 pm Post subject: Repeating fields from Excel |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
I'm having trouble generating a message set that maps a file created from a spreadsheet.
In the simple case there are two columns (A and B).
Column A contains data; column B sometimes contains data.
When I export the data from Excel, the fields are tab-delimited with CR/LF at the end of each line, so
A B
C
D E
becomes
A<tab>B<CRLF>C<tab><CRLF>D<tab>E<CRLF>
I have a type called ROW, with all elements delimited by <HT>, containing two elements, COL1 and COL2.
I have a second type called ROWS, with all elements delimited by <CR><LF>, containing one element, ROWELEMENT, of type ROW. It is repeating, with repeating element delimiter <CR><LF>.
When I try this with the sample, the rows A B and C get parsed, and the row D E is ignored.
It seems the parser stops after the 'empty' COL2 field.
What am I missing? |
|
Missam |
Posted: Wed Apr 14, 2004 12:40 pm Post subject: |
|
|
Chevalier
Joined: 16 Oct 2003 Posts: 424
|
It looks like you missed something.
Let me tell you what I understood from your data:
A and B will be of a compound type.
A, B and C will be of another compound type.
If this is repeating, you need to create one more compound type saying that the above compound type is repeating.
I guess you need one more compound type. |
|
dkeister |
Posted: Wed Apr 14, 2004 1:02 pm Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
I don't know which 'row' has the second field. Think of it as a spreadsheet with two columns, A and B, either of which might be empty.
A B
R
C D
G
H I
This looks like
A<HT>B<CRLF>
R<HT><CRLF>
C<HT>D<CRLF>
<HT>G<CRLF>
H<HT>I<CRLF>
Each element in a row from the spreadsheet may or may not contain a value. |
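To make the wire format concrete, here is a minimal Python sketch of the export described above. It assumes each spreadsheet row is a pair of strings, with an empty string standing for an empty cell. The name `export_rows` is made up for illustration and is not part of Excel or the broker.

```python
HT = "\t"      # horizontal tab, written <HT> in the thread
CRLF = "\r\n"  # carriage return + line feed, written <CRLF>


def export_rows(rows):
    """Join the cells of each row with <HT> and end every row with <CRLF>."""
    return "".join(HT.join(cells) + CRLF for cells in rows)


# The five-row sample from the post; "" marks an empty cell.
sample = [("A", "B"), ("R", ""), ("C", "D"), ("", "G"), ("H", "I")]
wire = export_rows(sample)
print(repr(wire))
```

Note that a row with an empty second cell still carries its <HT>, so the delimiter is immediately followed by <CRLF>.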
|
kimbert |
Posted: Thu Apr 15, 2004 2:27 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Try this:
ROW is an unbounded repeating element (maxOccurs = 0). Its repeating element delimiter is <CRLF>.
The complex type of ROW (ROWTYPE) contains elements COL1 and COL2. COL1 and COL2 are optional (minOccurs = 0, maxOccurs=1).
ROWTYPE is All Elements Delimited with a delimiter of <HT>.
In the message tree you should get
Code: |
MRM
ROW
- COL1
- COL2
ROW
- COL1
- COL2
ROW
- COL1
ROW
- COL1
- COL2
|
etc |
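The tree above can be reproduced outside the broker with a short Python sketch. It is only a model of the structure described in this post (ROW repeating on <CRLF>, optional COL1 and COL2 separated by <HT>), not the MRM parser itself. The name `parse_rows` and the dict layout are assumptions for illustration, and so is the choice to treat an empty field as an absent element.

```python
HT = "\t"      # delimiter between COL1 and COL2
CRLF = "\r\n"  # repeating element delimiter between ROW occurrences


def parse_rows(data):
    """Return one dict per ROW, leaving out any column whose field is empty."""
    lines = data.split(CRLF)
    # A trailing <CRLF> produces one empty piece at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    rows = []
    for line in lines:
        fields = line.split(HT)
        # Pad so that a row with a single field still unpacks into two columns.
        col1, col2 = (fields + ["", ""])[:2]
        row = {}
        if col1:
            row["COL1"] = col1
        if col2:
            row["COL2"] = col2
        rows.append(row)
    return rows


tree = parse_rows("A\tB\r\nR\t\r\nC\tD\r\n\tG\r\nH\tI\r\n")
for row in tree:
    print("ROW", row)
```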
|
dkeister |
Posted: Thu Apr 15, 2004 8:54 am Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
That's the way I thought it should work but...
What I see is that when the message parser encounters the row:
A<HT>B<CR><LF> parses correctly and parser continues to next row
<HT>G<CR><LF> parses correctly and parser continues to next row
As soon as it hits an empty second element:
R<HT><CR><LF> parses correctly but parser stops!
I have defined a message
ROWS_M a message with ROWS_T type
- ROWS_T is all elements delimited <CR><LF> and contains
. - ROW element of type ROW_T with repeat Y and rep-del <CR><LF>
. - ROW_T is all elements delimited <HT> and contains
. - COL1 and
. - COL2
I get the same results if COL1 and COL2 are
- Connection Repeat Y, Min 0, Max 1 with rep-del <HT>
or
- Connection Repeat N, Min 0, Max 1 |
|
Tibor |
Posted: Fri Apr 16, 2004 12:35 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
Here is a simpler definition:
ROWS_M a message with ROWS_T type
- ROWS_T is all elements delimited <CR><LF> and contains
. - ROW element of type ROW_T with repeat Y and rep-del <CR><LF>
. - ROW_T is all elements delimited <HT> and contains
. - COL (only one element)
We use it this way, and it works fine when different names aren't needed for all the values.
Tibor |
|
dkeister |
Posted: Fri Apr 16, 2004 6:20 am Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
Tibor, I've tried your configuration before, and what I get is that COL contains the whole row, not an individual element.
MRM
COL
A<HT>B
COL
R<HT>
COL
C<HT>D
COL
<HT>G
COL
H<HT>I
Did I miss something in your example?
What I have done is to take this input into a compute node and propagate each row as a separate message to a ResetContentDescriptor node, which has the field definitions for each element in the row. This works, but I feel I should be able to have the rows parsed without this extra step. (By the way, if the data values are fixed length and I use that attribute rather than the <HT> separation, all works fine.) |
|
fjcarretero |
Posted: Fri Apr 16, 2004 6:38 am Post subject: |
|
|
Voyager
Joined: 13 Oct 2003 Posts: 88
|
Make sure that you have the property 'Suppress Absent Element Delimiters' in the message set, set to 'Never'.
Hope this helps.
Cheers
Felipe |
|
dkeister |
Posted: Fri Apr 16, 2004 7:23 am Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
Felipe,
I have never heard of 'Suppress Absent Element Delimiters' and can't seem to find it in the HELP or documentation. I'm on 2.1 using TDS.
??? |
|
Missam |
Posted: Fri Apr 16, 2004 8:08 am Post subject: |
|
|
Chevalier
Joined: 16 Oct 2003 Posts: 424
|
Hi,
I tried your case on my box and I'm able to parse it.
Here are the steps I followed:
1) Create a compound type Type1.
2) Create two elements, E1 and E2, of type String.
3) Set the TDS properties of compound type Type1: Type Composition Sequence, Type Content Closed, All Elements Delimited with delimiter <HT>.
4) Set E2's connection property Repeat to Yes, with min occurs 0 and max occurs 1, and set the repeating element delimiter to whatever you want, because it won't repeat more than once. E2 can then be optional because of min occurs 0.
5) Create an element e_Type1 of type Type1.
6) Create a compound type Type2 and add e_Type1 to it.
7) Set Type2's TDS properties: Type Composition Sequence, Type Content Closed, All Elements Delimited with whatever delimiter you want, because you have only one element in this compound type.
8) Set e_Type1's connection property to Repeat, with repeating element delimiter <CR><LF>.
9) Create a message of type Type2 and start parsing.
Hope this helps. |
|
dkeister |
Posted: Fri Apr 16, 2004 9:04 am Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
Hello IamSam,
I cross my heart and hope to die, I did what you said.
When I take the following sequence
A<HT>B<CR><LF>
<HT>G<CR><LF>
H<HT>I<CR><LF>
R<HT><CR><LF>
C<HT>D<CR><LF>
and run it through a message flow InputNode --> OutputNode
and use the debugger, the parsing shows
MRM
..EType1
....E1 = A
....E2 = B
..EType1
....E2 = G
..EType1
....E1 = H
....E2 = I
..EType1
....E1 = R
and that's all folks. No sign of C or D
Further, I added a compute node to take each EType1 and propagate as a separate message. Four are propagated, not the five from the input message.
When you tried your sample, did you have a complete row at the end? Did it parse? |
|
Missam |
Posted: Fri Apr 16, 2004 10:39 am Post subject: |
|
|
Chevalier
Joined: 16 Oct 2003 Posts: 424
|
Shall I tell you what the problem is?
MRM doesn't support optional elements.
To work around that, we sometimes set min occurs to 0 on the repeating element property.
In your case either of your elements can be optional. When I was testing your scenario I didn't notice that the first element can also be optional; I thought only the second element was optional.
Try making the first element optional as well and give it a go. I expect it won't work.
If anyone has a solution for this I'd be glad to hear it, but I guess we can deal with this using the NEONMSG parser. |
|
dkeister |
Posted: Fri Apr 16, 2004 10:52 am Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
At least I'm not as dim as I was thinking I was...
Yes, I tried making the first optional and it truly does not work.
If I read your reply correctly, I would have thought my case to be a rather common occurrence - I can't believe I'm the first...
NO, NO, NO, I'm not going back to NEON...
What I have done is to create two message definitions.
.The input node uses the first parsing each 'row' as a single element.
.I then have a compute node that
..uses the cardinality function to determine the row count
..propagates each row as a new message
.A content descriptor resets to the second message definition which describes all the fields in one row.
.A second compute node can now work with the fields (e.g. put to DB, convert to XML, whatever...)
A bit of a hack but what the heck
(By the way, how did you know MRM doesn't support optional elements?)
Thanks to all that made suggestions |
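The two-definition workaround can be sketched in Python as two passes. This is a model of the flow described above, not ESQL and not broker code: `split_into_rows`, `parse_row`, and `process` are hypothetical names, the row count stands in for the cardinality call, and the generator loop stands in for propagating each row as its own message.

```python
HT = "\t"
CRLF = "\r\n"


def split_into_rows(message):
    """First definition: treat each row as one opaque element."""
    rows = message.split(CRLF)
    # A trailing <CRLF> leaves one empty piece at the end; drop it.
    if rows and rows[-1] == "":
        rows.pop()
    return rows


def parse_row(row):
    """Second definition: the fields inside a single row."""
    fields = row.split(HT)
    col1, col2 = (fields + ["", ""])[:2]
    return {"COL1": col1, "COL2": col2}


def process(message):
    rows = split_into_rows(message)
    row_count = len(rows)  # stands in for the cardinality check
    for index in range(row_count):
        # Stands in for propagating one row and re-parsing it downstream.
        yield parse_row(rows[index])


message = "A\tB\r\n\tG\r\nH\tI\r\nR\t\r\nC\tD\r\n"
for fields in process(message):
    print(fields)
```

Because each row is split on <HT> only after the rows have been separated, an empty trailing field in one row cannot affect the rows that follow it.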
|
Missam |
Posted: Fri Apr 16, 2004 12:40 pm Post subject: |
|
|
Chevalier
Joined: 16 Oct 2003 Posts: 424
|
Processing each row as a single element, as you are doing, is the least we can do.
Quote: |
NO, NO, NO, I'm not going back to NEON... |
I don't know why people are scared of using NEON; it's still supported in WBIMB 5.0 too.
And the fact is that you can handle almost any kind of format using NEON, and MRM can't handle them all (yours is one).
I don't remember exactly where, but you can still find something about optional elements in MRM in the documentation (Working With Messages). |
|
dkeister |
Posted: Fri Apr 16, 2004 1:12 pm Post subject: |
|
|
Disciple
Joined: 25 Mar 2002 Posts: 184 Location: Purchase, New York
|
I'm not afraid of using NEON but ...
The people in Hursley discourage it and others have to support it so I try to KISS |
|
|