TDS Fixed Length message but it's not fixed length!
longng
Posted: Mon Mar 04, 2013 8:24 pm

Apprentice

Joined: 22 Feb 2013
Posts: 42

I've come across a rather perplexing problem and may have to pursue it with a PMR. In the meantime, I just want to know whether anyone has come across the same thing.

For simplicity, I have reduced the actual issue to a minimal case, but it may still be a bit complex for some readers! You've been warned!

Here's an XML input that has a single element containing text in German:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<Data1Msg>
  <Data1>
    <E1>Software Maint. fÃ¼r GPFS</E1>
  </Data1>
</Data1Msg>


Just in case, here are the contents of the E1 element in hex, 27 bytes in total (I purposely inserted a space after every four bytes for visual convenience).
Code:

"536F6674 77617265 204D6169 6E742E20 66C383C2 BC722047 504653"


Someone who knows German may be able to tell me the meaning of the above, but I am not particularly interested in the meaning right now.
Again, the above data is 27 bytes long.
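
For anyone who wants to check the count, here is a minimal Python sketch (Python is used only for illustration; it is not part of the flow). It decodes the hex string exactly as quoted above and compares the number of bytes with the number of characters, assuming the data really is UTF-8:

```python
# Hex of the E1 content exactly as quoted above.
hex_dump = "536F6674 77617265 204D6169 6E742E20 66C383C2 BC722047 504653"

raw = bytes.fromhex(hex_dump)   # fromhex() skips the spaces between groups
text = raw.decode("utf-8")      # assumption: the field is UTF-8 encoded

print(len(raw))    # 27 bytes
print(len(text))   # 25 characters (code points)
print(text)
```

The two counts differ because each of the two non-ASCII characters in the field occupies two bytes in UTF-8.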

An input message, Data1InMsg, and an output message, Data1OutMsg, have been defined (WMB v7 with a message set!) for the above data, both using the same data type:
Code:

Data1Type
 ** sequence of TDS Data Element Separation=Fixed Length
  Data1 (Local complexType) Render=XMLElement, XML Name = 'Data1'
    {Local complexType} TDS Data Element Separation = Fixed Length
      E1 Render=XMLElement, XML Name = 'E1', TDS Length = 40


To test the above definition, I have a flow that has:
1. An MQInput node with Input Message Parsing set to the XML1 format of the above definition
2. A Compute node, fed by the MQInput node, that basically sets the output to the TDS format of the input XML (note the explicit setting of the CCSID to 1208, i.e. UTF-8)

Code:

      SET OutputRoot.Properties.MessageSet = 'DEG1BMK002001';
      SET OutputRoot.Properties.MessageType = 'Data1OutMsg';
      SET OutputRoot.Properties.MessageFormat = 'Text1';

      SET OutputRoot.MQMD.CodedCharSetId = 1208;
      
      SET OutputRoot.MRM.Data1.E1 = InputRoot.MRM.Data1.E1;

3. An MQOutput node, just for the sake of completeness!

The output of executing the above flow with the above input:


Code:
00000000   Software    Maint.    536F6674 77617265 204D6169 6E742E20
00000016   f├â┠  ¬â•.r GP   66E2949C C3A2E294 ACE2959D 72204750
00000032   FS                     46532020 20202020 20202020 202020


If you're counting, then that's 47 bytes against the TDS Length of 40 as in the definition!

Using the same data definitions and the same flow, I then substituted 'pure' English characters (X's in this case) for the German characters in the input:


Code:
<?xml version="1.0" encoding="UTF-8"?>
<Data1InMsg>
   <Data1>
      <E1>Software MaintXXXXXXXGPFS</E1>
   </Data1>
</Data1InMsg>


I now get the expected output:

Code:

00000000   Software    MaintXX   536F6674 77617265 204D6169 6E745858
00000016   XXXXXGPF   S          58585858 58475046 53202020 20202020
00000032                          20202020 20202020


The above is exactly 40 in length as defined in the TDS definition!

Even with the trivial example above, I hope you would agree that the contents of the fixed-length message have been shifted. Imagine that the single field above is part of a big message with many fields. What about the fields defined after it, which have all been shifted?
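
To make the shift concrete, here is a rough Python sketch. The second field `E2`, its length of 10 and its value are hypothetical (the message in this post has only `E1`). The 40-character field is padded by character count, and a downstream consumer then slices by byte offset, as a fixed-length reader typically would:

```python
def pad_chars(value: str, length: int) -> str:
    """Pad with trailing spaces to a fixed number of characters."""
    return value.ljust(length)

# E1 text as it ends up in the 47-byte output (the non-ASCII part is
# written with escapes): 27 characters, 34 bytes before padding.
e1 = pad_chars("Software Maint. f\u251c\u00e2\u252c\u255dr GPFS", 40)

# Hypothetical second field, 10 characters, following E1 in the record.
e2 = pad_chars("ORDER12345", 10)

record = (e1 + e2).encode("utf-8")

# A consumer that assumes one byte per character reads E2 at bytes 40..49.
e2_as_read = record[40:50]

print(len(record))   # 57 bytes instead of the 50 the consumer expects
print(e2_as_read)    # b'       ORD'
```

The consumer sees seven leftover padding bytes of `E1` followed by only the first three bytes of `E2`.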

Please share your input!
rekarm01
Posted: Tue Mar 05, 2013 2:42 am

Grand Master

Joined: 25 Jun 2008
Posts: 1415

longng wrote:
Here's an XML input that has a single element containing text in German

Code:
...<E1>Software Maint. fÃ¼r GPFS</E1>...

This isn't really German. It's garbled text, most likely due to a bad input ccsid (windows-1252?). It should probably look more like:

Code:
...<E1>Software Maint. für GPFS</E1>...

If the input message is UTF-8, the input ccsid should be, too.

longng wrote:
An input Data1InMsg and an output Data1OutMsg messages have been defined (WMBv7 with message set!) for the above data using the same data type of

Code:
Data1Type
 ** sequence of TDS Data Element Separation=Fixed Length
  Data1 (Local complexType) Render=XMLElement, XML Name = 'Data1'
    {Local complexType} TDS Data Element Separation = Fixed Length
      E1 Render=XMLElement, XML Name = 'E1', TDS Length = 40

TDS Length = 40 what? bytes? characters? "Fixed Length" depends on how the message set defines the element's 'Length Units'.

longng wrote:
The output of execution of the above flow and the above input

Code:
00000000   Software    Maint.    536F6674 77617265 204D6169 6E742E20
00000016   f├â┠  ¬â•.r GP   66E2949C C3A2E294 ACE2959D 72204750
00000032   FS                     46532020 20202020 20202020 202020

The 'ü' just keeps growing ... this looks like another bad input ccsid (msdos-437?)
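
Both guesses can be checked against the hex values posted in this thread. A minimal Python sketch, assuming the intended word is "für" and that the two mis-decodings happened in this order (the codec names `cp1252` and `cp437` are Python's names for windows-1252 and msdos-437):

```python
original = "f\u00fcr"  # "für", the intended German word

# Step 1: the UTF-8 bytes are misread as windows-1252, then re-encoded as UTF-8.
step1 = original.encode("utf-8").decode("cp1252")
step1_bytes = step1.encode("utf-8")

# Step 2: those bytes are misread as MS-DOS code page 437 and re-encoded again.
step2 = step1_bytes.decode("cp437")
step2_bytes = step2.encode("utf-8")

print(step1_bytes.hex().upper())  # 66C383C2BC72 (as in the input message)
print(step2_bytes.hex().upper())  # 66E2949CC3A2E294ACE2959D72 (as in the output)
```

Four bytes become six, then thirteen, which accounts for the growth seen between the input and the output.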
kimbert
Posted: Tue Mar 05, 2013 2:45 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Your first big mistake is to use MRM XML in a new message flow. I would be interested to know why you have done this.

You have provided lots of useful info, but we need more (as always). What is the CCSID used by the input message? What is the CCSID used by the output message? What is 'Length Units' set to in the TDS message definition for this field?
longng
Posted: Tue Mar 05, 2013 5:53 am

Apprentice

Joined: 22 Feb 2013
Posts: 42

@rekarm01 & @kimbert: It comes down to me being sloppy, combined with RFHUtil and Windows cut & paste messing up the actual contents of the field. At least I provided hex values!

Anyway, after posting the original query, I tinkered around further with the message definition and was able to get the expected output of 40 BYTES!
Code:

00000000   Software    Maint.    536F6674 77617265 204D6169 6E742E20
00000016   für GPF   S          66C3BC72 20475046 53202020 20202020
00000032                          20202020 20202020


The expected output above was achieved by setting the field's Length Units to Bytes instead of Characters. If this is how things work, then I am reluctantly OK with the changes that need to be made. On the other hand, our legacy message sets have hundreds (if not thousands) of fields that would need to be changed to accommodate languages other than just English and German! @kimbert: I may have indirectly answered your question about the rationale of using MRM!

Technically, I believe that the parser should either truncate the data to conform to the definition, or throw an exception indicating that the length has been exceeded, or do both. It's a fixed length setting after all. As things stand currently, it's not acceptable for the parser to simply grow the field beyond its setting and shift everything else out. I will initiate a PMR.

As defensive programming, it sounds like we should set Length Units to Bytes regardless of whether the data is binary or character. Care to comment?
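
As a rough illustration of the "pad in bytes, or throw" behaviour described above, here is a Python sketch. The helper `encode_fixed` is hypothetical and is not a broker API; it only shows the idea of measuring the field after encoding:

```python
def encode_fixed(value: str, length: int, encoding: str = "utf-8",
                 pad: bytes = b" ") -> bytes:
    """Encode value into exactly `length` bytes, or raise if it does not fit."""
    data = value.encode(encoding)
    if len(data) > length:
        # Refuse to grow the field silently.
        raise ValueError(f"value needs {len(data)} bytes, field holds {length}")
    return data.ljust(length, pad)  # pad with trailing spaces, counted in bytes

field = encode_fixed("Software Maint. f\u00fcr GPFS", 40)
print(len(field))  # 40
```

Truncation is deliberately left out of the sketch: cutting a multi-byte encoding at an arbitrary byte offset can split a character, so it needs more care than padding does.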
kimbert
Posted: Tue Mar 05, 2013 6:00 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
Technically, I believe that the parser should either truncate the data to conform to the definition, or throw an exception indicating that the length has been exceeded, or do both. It's a fixed length setting after all. As things stand currently, it's not acceptable for the parser to simply grow the field beyond its setting and shift everything else out. I will initiate a PMR.
I suggest that you put the PMR on hold. The MRM parser is doing *exactly* what you have asked. The output is 40 *characters* in length. The problem is that characters do not always occupy a fixed number of bytes. Your COBOL application was probably originally designed for single-byte EBCDIC characters, in which case the distinction between characters and bytes would not matter. It is now being expected to handle UTF-8 data, and it is breaking. This is not IBM's problem - it is a problem that crops up continually all over the world when programmers fail to take into account the facts explained here: http://www.joelonsoftware.com/articles/Unicode.html
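
The character count can be verified from the hex dump of the 47-byte output posted earlier in the thread. A small Python sketch, assuming the output really is UTF-8 (CCSID 1208):

```python
# Hex of the 47-byte output message quoted earlier in the thread.
output_hex = (
    "536F6674 77617265 204D6169 6E742E20"
    "66E2949C C3A2E294 ACE2959D 72204750"
    "46532020 20202020 20202020 202020"
)

raw = bytes.fromhex(output_hex)
text = raw.decode("utf-8")

# UTF-8 width of every non-ASCII character in the field.
widths = {ch: len(ch.encode("utf-8")) for ch in text if ord(ch) > 127}

print(len(raw))    # 47 bytes
print(len(text))   # 40 characters
print(widths)
```

Four of the 40 characters need two or three bytes each, which adds the seven extra bytes.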
kimbert
Posted: Tue Mar 05, 2013 6:01 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

I note that you still have not given any good reason why you are using the MRM parser to output XML.
Vitor
Posted: Tue Mar 05, 2013 6:05 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

longng wrote:
@kimbert: I may have indirectly answered your question about the rationale of using MRM!


Not to me. So you've got a lot of legacy message sets; that's nice. What's that got to do with not using XMLNSC?
_________________
Honesty is the best policy.
Insanity is the best defence.
longng
Posted: Tue Mar 05, 2013 7:13 am

Apprentice

Joined: 22 Feb 2013
Posts: 42

Vitor wrote:
longng wrote:
@kimbert: I may have indirectly answered your question about the rationale of using MRM!


Not to me. So you've got a lot of legacy message sets; that's nice. What's that got to do with not using XMLNSC?



I hear you, I hear you! Who am I to argue? Sometime in the past (before my time), it was decided to have a common message flow serve as a single entry point to a portfolio of hundreds of downstream flows. And yes, the development effort started back in V6. Apart from doing other things, the common flow also sets up the RFH2 header and explicitly sets the domain to 'mrm'... Is there a magic wand that I can wave as to make wholesale changes to all the downstream flows?
Vitor
Posted: Tue Mar 05, 2013 7:36 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

longng wrote:
Is there a magic wand that I can wave as to make wholesale changes to all the downstream flows?


Yes - it's called "money".

All these flows should have been identified & migrated into a more modern model as you moved version. This is the cheap & easy way of doing it. The longer you stay with this outmoded model the more problems you're going to hit going forwards and the more money you'll need to spend in a fire-fighting timeframe.

Mention this to your budget holder.
longng
Posted: Tue Mar 05, 2013 7:37 am

Apprentice

Joined: 22 Feb 2013
Posts: 42

kimbert wrote:
Quote:
Technically, I believe that the parser should either truncate the data to conform to the definition, or throw an exception indicating that the length has been exceeded, or do both. It's a fixed length setting after all. As things stand currently, it's not acceptable for the parser to simply grow the field beyond its setting and shift everything else out. I will initiate a PMR.
I suggest that you put the PMR on hold. The MRM parser is doing *exactly* what you have asked. The output is 40 *characters* in length. The problem is that characters do not always occupy a fixed number of bytes. Your COBOL application was probably originally designed for single-byte EBCDIC characters, in which case the distinction between characters and bytes would not matter. It is now being expected to handle UTF-8 data, and it is breaking. This is not IBM's problem - it is a problem that crops up continually all over the world when programmers fail to take into account the facts explained here: http://www.joelonsoftware.com/articles/Unicode.html


Thanks, Kimbert, for your input and perspective. I don't intend to defend something that was done in the past, but I still maintain that the parser should observe the definition as opposed to 'silently' expanding a fixed length field.
kimbert
Posted: Tue Mar 05, 2013 7:44 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I still maintain that the parser should observe the definition as opposed to 'silently' expanding a fixed length field.
Two points here:
1. It *is* observing the definition. It is outputting a fixed number of characters. There is no defect here, and this behaviour is actually required by some users.
2. The TDS parser could have tried harder to explain what it was doing, and why. This is one reason why DFDL is a better choice going forward - DFDL is pretty good at explaining its actions.
longng
Posted: Tue Mar 05, 2013 7:51 am

Apprentice

Joined: 22 Feb 2013
Posts: 42

Vitor wrote:
longng wrote:
Is there a magic wand that I can wave as to make wholesale changes to all the downstream flows?


Yes - it's called "money".

All these flows should have been identified & migrated into a more modern model as you moved version. This is the cheap & easy way of doing it. The longer you stay with this outmoded model the more problems you're going to hit going forwards and the more money you'll need to spend in a fire-fighting timeframe.

Mention this to your budget holder.


We are thinking the same way!
mqjeff
Posted: Tue Mar 05, 2013 8:01 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Put in a mediator flow that deletes the RFH.
kimbert
Posted: Tue Mar 05, 2013 8:02 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Just to avoid any misunderstandings...
- just because you upgrade, that does not mean that you need to rewrite all your flows to use XMLNSC. In fact I would advise against it unless there is a pressing need to exploit the improved performance/standards compliance of XMLNSC.
- migration to XMLNSC can be expensive and difficult in some cases

but...
- writing new flows that use MRM XML is not good practice. It sounded as if the OP was doing that.
- if you are using MRM XML for writing XML then it should be pretty simple to switch to using XMLNSC instead.
Vitor
Posted: Tue Mar 05, 2013 8:42 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

kimbert wrote:
writing new flows that use MRM XML is not good practice. It sounded as if the OP was doing that.




But if you are changing an existing flow that's a great time to think about embracing new technologies. For instance, if you have a flow reading a file using MRM you might still want to consider changing to DFDL.


kimbert wrote:
- if you are using MRM XML for writing XML then it should be pretty simple to switch to using XMLNSC instead.


