Author |
Message
|
Kjell |
Posted: Mon Dec 13, 2004 11:26 am Post subject: BLOB to XML query |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
Hi
I would like some advice on this one. It's a bit complicated to explain, but what I am trying to do is:
1. Read a "flat-file" input message, which is just a bunch of "segments" concatenated into one string. I read it into a compute node as a BLOB and cast it to a variable, "inRec".
2. In this compute node I want to convert it to XML, referring to a previously defined message set. For each segment in the input I have a procedure that I call.
Take a look at this code sample:
-----------------------------------------
CREATE COMPUTE MODULE Create_XML
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    CALL CopyMessageHeaders();
    SET OutputRoot.Properties.MessageSet = 'NVC7IB0002001';
    SET OutputRoot.Properties.MessageType = 'msgData02';
    SET OutputRoot.Properties.MessageFormat = 'XML1';
    CREATE LASTCHILD OF OutputRoot DOMAIN 'XML';
    SET OutputRoot.Properties.MessageDomain = 'XML';
    SET OutputRoot.XML.(XML.XmlDecl).(XML.Version)='1.0';
    SET OutputRoot.XML.(XML.XmlDecl).(XML.Encoding)='UTF-8';
    -- Loop thru input segments and choose mapping
    SET iPos = 1;
    WHILE iPos <= Length(inRec) DO
      SET segLength = CAST(substring(inRec from iPos for 4) AS INTEGER);
      SET segID = substring(inRec from (iPos+4) for 3);
      SET mapRec = substring(inRec from iPos for segLength);
      CASE segID
        WHEN '010' THEN
          CREATE LASTCHILD OF OutputRoot.XML.XML NAME 'F010';
          CALL MAP_010(mapRec,OutputRoot.XML.XML.F010[<]);
        WHEN '011' THEN
          CREATE LASTCHILD OF OutputRoot.XML.XML NAME 'F011';
          CALL MAP_011(mapRec,OutputRoot.XML.XML.F011[<]);
      END CASE;
      SET iPos = iPos + segLength;
    END WHILE;
    RETURN TRUE;
  END;

  CREATE PROCEDURE MAP_010(IN record CHARACTER,
                           IN outref REFERENCE)
  BEGIN
    SET outref.fld1 = substring(record from 1 FOR 4);
    SET outref.fld2 = substring(record from 5 FOR 3);
    SET outref.fld3 = substring(record from 8 FOR 3);
    RETURN;
  END;
-------------------------------------------
PROBLEM: As far as I can tell, the flow ignores the message set. I want the message set to control the names of the XML tags, and I also want to validate against the message set (XML schema) so that the XML produced actually conforms to the schema in the message set. As it stands, the node seems to produce generic XML, even though I have specified a valid MessageSet, MessageType and MessageFormat.
What am I doing wrong? |
|
jefflowrey |
Posted: Mon Dec 13, 2004 11:50 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Your message domain should be "MRM", not "XML".
At least, it should be if you want to validate against a message set.
XML is only used for self-defining messages, not for modelled messages.
Also, XML is semi-deprecated. You should use XMLNS for anything with a schema. _________________ I am *not* the model of the modern major general. |
|
kirani |
Posted: Mon Dec 13, 2004 11:30 pm Post subject: |
|
|
Jedi Knight
Joined: 05 Sep 2001 Posts: 3779 Location: Torrance, CA, USA
|
It seems your input message can also be modeled with MRM-CWF or MRM-TDS. It would be much cleaner to model your message and use that model in your compute node for the transformation. _________________ Kiran
IBM Cert. Solution Designer & System Administrator - WBIMB V5
IBM Cert. Solutions Expert - WMQI
IBM Cert. Specialist - WMQI, MQSeries
IBM Cert. Developer - MQSeries
|
|
Kjell |
Posted: Mon Dec 13, 2004 11:38 pm Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
No, I cannot use CWF/TDS to model the message. Believe me, I have really tried (I have spent days on it). The segments arrive as one string, but they are hierarchical in nature and I cannot describe this well in a message set.
So I have to cope with the fact that the message arrives in the compute node as a BLOB and has to leave it as XML, validated against a message set. That is what my problem is all about. |
|
kimbert |
Posted: Tue Dec 14, 2004 2:21 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Kirani is correct. It would be cleaner to use the MRM for this. I would be surprised if your message is impossible to describe using a message set. If you describe your input message format in detail, I'll try to suggest a workable approach. |
|
Kjell |
Posted: Tue Dec 14, 2004 2:53 am Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
Be my guest!
Message is as follows
Seg 1 occurs 1 to n times
- SegID char 3 value '001'
- SegLength INTEGER 4
- fld 1 char ...
- fld 2 char ...
.
.
Seg 2 occurs 1 to n times
- SegID char 3 value '002'
- SegLength INTEGER 4
- fld 1 char ...
- fld 2 char ...
.
.
Seg 3 occurs 1 to n times child of Seg 2
- SegID char 3 value '003'
- SegLength INTEGER 4
- fld 1 char ...
- fld 2 char ...
.
.
Seg 4 occurs 1 to n times child of Seg 3
- SegID char 3 value '004'
- SegLength INTEGER 4
- fld 1 char ...
- fld 2 char ...
.
.
Seg 5 occurs 1 to n times
- SegID char 3 value '005'
- SegLength INTEGER 4
- fld 1 char ...
- fld 2 char ...
.
.
Of course, the above is much simplified. In reality I have some 30 different segment types, but the principle is the same.
They all come in a "flat-file" format (a string with no separation between them). Each segment has an identifier and a segment length. I can use these two fields to parse the BLOB.
However, I cannot describe this as CWF (or can I?), because nothing in the current segment indicates what comes next. I have tried to use a "repeat reference" on the message, but that needs to be an integer on the same level, so I couldn't do that. There is nothing in a child segment telling which parent it belongs to.
I appreciate your efforts to assist. Any suggestions are welcome. |
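The parsing described above can be sketched outside the broker. The following is a minimal illustration in Python, not ESQL, and it rests on several assumptions: each segment starts with a 4-digit length that counts the whole segment, followed by a 3-character SegID (the layout used in the ESQL sample earlier in the thread), and the parent table `PARENT_OF` is hypothetical, mirroring only the simplified outline (003 under 002, 004 under 003). Even though a child segment does not name its parent, the nesting can be recovered from the order of arrival:

```python
from dataclasses import dataclass, field

# Hypothetical parent table taken from the simplified outline:
# 003 nests under 002, 004 nests under 003, everything else is top level.
PARENT_OF = {"003": "002", "004": "003"}


@dataclass
class Segment:
    seg_id: str
    body: str
    children: list["Segment"] = field(default_factory=list)


def split_segments(record: str):
    """Yield (seg_id, body) pairs from back-to-back segments.

    Assumed layout: 4-digit length (whole segment), then a 3-character ID.
    """
    pos = 0
    while pos < len(record):
        seg_len = int(record[pos:pos + 4])
        if seg_len < 7 or pos + seg_len > len(record):
            raise ValueError(f"bad segment length at offset {pos}")
        yield record[pos + 4:pos + 7], record[pos + 7:pos + seg_len]
        pos += seg_len


def build_tree(record: str) -> list[Segment]:
    roots: list[Segment] = []
    open_path: list[Segment] = []  # chain from a top-level segment to the latest one
    for seg_id, body in split_segments(record):
        seg = Segment(seg_id, body)
        parent_id = PARENT_OF.get(seg_id)
        # Close segments until the required parent type is the innermost open one.
        while open_path and open_path[-1].seg_id != parent_id:
            open_path.pop()
        if parent_id is None:
            roots.append(seg)
        elif open_path:
            open_path[-1].children.append(seg)
        else:
            raise ValueError(f"segment {seg_id} arrived without a {parent_id} parent")
        open_path.append(seg)
    return roots
```

The stack of open segments plays the role that nesting plays in a message model: a segment attaches to the most recent open segment of its parent type, and a top-level segment closes everything before it.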
|
jefflowrey |
Posted: Tue Dec 14, 2004 4:46 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
You certainly should be able to model that message using the Tagged Delimited Strings parser. All of your elements are uniquely identified by SegIDs, which will be your tags. You don't say if all of your individual child fields are fixed length or not. Regardless, the SegLength should help.
You also can't use the XML domain if you want to validate against a message set. You will have to model your XML in the message set, and then use the MRM domain in your ESQL to indicate that you are using the MRM for modelling your XML. |
|
Kjell |
Posted: Wed Dec 15, 2004 12:39 am Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
Thanks, I will give this a try.
One thing I forgot to mention: in the outline above the SEGID is the very first field in the segment. In the real situation the SEGID is the second field in each segment (there's one field before it, with variable content).
Does that change the possibility of using TDS? |
|
Kjell |
Posted: Wed Dec 15, 2004 3:17 am Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
Two more queries in the area of a TDS solution:
--------------------------------------------------------
As said the SEGID is NOT in pos 1 of the segment, rather in pos 5-7.
I'm thinking of using a Data Pattern to have the parser identify the tag.
Say the SegID is '010', how do I setup a regular expression to say
"The presence of 010, but it has to be in position 5-7"?
I have the SegLength telling me how big the segment is, but I failed to use the Length Reference, since it has to be an integer and has to come BEFORE the segment rather than be part of it.
So again, is it possible to describe my segment structure with TDS? |
|
kimbert |
Posted: Wed Dec 15, 2004 3:32 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
Say the SegID is '010', how do I setup a regular expression to say
"The presence of 010, but it has to be in position 5-7"? |
Like this :
I'm interested in these segment lengths. You have multiple fields in your segments, but you also know the length of each field (at least, your MAP_010 procedure suggests so). So why do you need a separate segment length? Are there variable numbers of repeats or variable-length fields within some segments? |
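The pattern that followed "Like this :" is not preserved in this copy of the thread. As a hedged illustration of the general idea only, a regular expression can pin a tag to positions 5-7 by consuming exactly four arbitrary characters first. The sketch below uses Python's `re` module; the regular expression dialect accepted by a TDS Data Pattern may differ, and `is_seg_010` is a hypothetical helper name:

```python
import re

# Assumption: exactly four characters of any content precede the 3-character SegID.
SEG_010 = re.compile(r".{4}010", re.DOTALL)


def is_seg_010(segment: str) -> bool:
    # re.match only matches at the start of the string, so "010" can only
    # be found in positions 5-7 (1-based), never elsewhere in the segment.
    return SEG_010.match(segment) is not None
```

A segment that merely contains 010 somewhere else, for example in its first three characters, is rejected because the four-character prefix is consumed before the tag is compared.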
|
Kjell |
Posted: Wed Dec 15, 2004 3:57 am Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
Yes, I know the length of each segment, but I never know what segment type comes next: is it a child, or a new segment at the same level?
Also, there is no delimiter between the segments. They are butted up against each other with nothing between them. Do you mean that in this case I do not have to state the segment length anywhere, and that a segment identifier resolved by a data pattern is enough? |
|
jefflowrey |
Posted: Wed Dec 15, 2004 4:18 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
You don't even need to use a Data Pattern, necessarily.
You can use Tagged-Fixed Width separation. As long as every segment IS identified by a unique identifier, AND every element is fixed in length, then the parser can tell which segment it has found by the ID and knows when to stop parsing. |
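To illustrate why a unique tag plus fixed field widths is enough, here is a minimal Python sketch of tag-driven parsing. It is not how the MRM parser is implemented. The widths for tag 010 follow the MAP_010 procedure earlier in the thread; the layout for 011 and the assumption that the tag is the first thing in each segment are made up for the example:

```python
# Hypothetical layouts: tag -> ordered (field name, width) pairs.
LAYOUTS = {
    "010": [("fld1", 4), ("fld2", 3), ("fld3", 3)],  # widths as in MAP_010
    "011": [("fld1", 2), ("fld2", 5)],               # invented for illustration
}
TAG_LENGTH = 3


def parse_tagged_fixed(data: str) -> list[tuple[str, dict[str, str]]]:
    """Split data into (tag, fields) pairs using only the tag and fixed widths."""
    pos = 0
    segments = []
    while pos < len(data):
        tag = data[pos:pos + TAG_LENGTH]
        if tag not in LAYOUTS:
            raise ValueError(f"unknown tag {tag!r} at offset {pos}")
        pos += TAG_LENGTH
        fields = {}
        for name, width in LAYOUTS[tag]:
            value = data[pos:pos + width]
            if len(value) < width:
                raise ValueError(f"segment {tag} is truncated in field {name}")
            fields[name] = value
            pos += width
        segments.append((tag, fields))
    return segments
```

No delimiter or segment length is consulted: the tag selects a layout, and the sum of the widths in that layout tells the parser where the next tag begins.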
|
Kjell |
Posted: Wed Dec 15, 2004 4:35 am Post subject: |
|
|
Acolyte
Joined: 26 Feb 2002 Posts: 73
|
OK, two questions,
1.
When you say Tagged-Fixed Width separation, what do you mean, exactly what to code in which attribute on the properties?
2.
As said before I have a large number of segments, each segment has a large number of fields.
I had an XML and a CWF layer, and I have now added a TDS one.
However, it seems the length of each field is not carried over to the TDS1 layer. On the TDS1 layer I now have length = 0 on all fields, plus the warning message "Warning Element 'fld1' has default TDS Length of zero. If it is used in a TDS message, that message will not be parsed successfully. Physical format: 'TDS1'."
I tried deleting the message definition file and re-creating it from the XML schema. The result is the same: the length is OK on the CWF layer, but 0 on the TDS one.
Do I have to hack myself thru all these millions of fields to set the length or is there a better way? |
|
jefflowrey |
Posted: Wed Dec 15, 2004 5:53 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Kjell wrote: |
OK, two questions,
1.
When you say Tagged-Fixed Width separation, what do you mean, exactly what to code in which attribute on the properties? |
Oops. I meant Tagged Fixed Length.
1. Create a message set project.
2. Create a message set.
3. Create a message definition file.
4. Add the TDS physical format layer.
5. Create a Complex Type for your first segment.
6. Go to Properties and select the TDS physical properties in the navigation.
7. Set "Data Element Separation" to "Fixed Length".
8. Add the various local elements to represent the fields of your segment.
9. Create a new complex type to represent your message.
10. Set its Data Element Separation to "Tagged Fixed Length". Also set "Distinguish tag and data values using:" to "Tag Length", and set the value to the length of the tags in question.
11. Add a local element whose TYPE is the segment type that you just created.
12. In the TDS physical properties of the local element you just added, assign the tag value. |
|
jefflowrey |
Posted: Wed Dec 15, 2004 6:00 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Kjell wrote: |
Do I have to hack myself thru all these millions of fields to set the length or is there a better way? |
If there is a better way... I don't know what it is.
An unsupported way would be to reverse engineer the mxsd file, which should be XML, and populate the correct entries.
Do this with Eclipse shut down, for good measure, and ON A BACKUP FILE. |
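For anyone going down that route, a cautious first step is to copy the file and inspect the copy before changing anything. The sketch below, in Python, assumes the .mxsd file is an XML Schema document (the thread only says it "should be XML"); `backup_and_list_elements` is a hypothetical helper. It deliberately does not touch any TDS-specific entries, since their exact structure is not described here:

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

# Qualified tag name of an XML Schema element declaration.
XSD_ELEMENT = "{http://www.w3.org/2001/XMLSchema}element"


def backup_and_list_elements(path: Path) -> tuple[Path, list[str]]:
    """Copy the file next to itself with a .bak suffix and list named elements.

    Only the backup copy is parsed; the original file is never opened for writing.
    """
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)
    root = ET.parse(str(backup)).getroot()
    names = [el.get("name") for el in root.iter(XSD_ELEMENT) if el.get("name")]
    return backup, names
```

Comparing the listed names against the fields that show a zero TDS length gives a checklist to work through, and the untouched original remains available if an edit goes wrong.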
|