Author |
Message
|
Diwakar |
Posted: Wed Jan 12, 2011 11:55 pm Post subject: Unable to TRIM spaces/tabs/newlines on either sides of a tag |
|
|
Apprentice
Joined: 06 Apr 2005 Posts: 33
|
Hi,
We use WMB v7 on MQ v7 on Windows 2008 servers.
We have an input XML file that we read using a FileInputNode. We use XMLNSC parser in FIN. The xml in our file is a valid XML as we can open it in a browser and other xml software.
The input xml looks like:
<Message>
<Header>
<price>
123.45
</price>
<name>
John
</name>
. . . . .
</Header>
. . . . .
</Message>
The xml has newlines, tabs, spaces, and extra characters on either sides of the tag values (123.45, and John). How do I get rid of them ? I tried TRIM( ) but it didn't remove all extra characters.
Don't know how to get rid of the unwanted spaces/tabs/newlines on either sides of a tag value (inside a tag). Can some one please advise ? If I don't remove the characters I get an error when I CAST the price tag to a decimal.
Thanks
diwakar. _________________ Diwakar. |
|
Back to top |
|
 |
fatherjack |
Posted: Thu Jan 13, 2011 1:00 am Post subject: Re: Unable to TRIM spaces/tabs/newlines on either sides of a |
|
|
 Knight
Joined: 14 Apr 2010 Posts: 522 Location: Craggy Island
|
Diwakar wrote: |
I tried TRIM( ) but it didn't remove all extra characters. |
TRIM should do it. Post your code here. _________________ Never let the facts get in the way of a good theory. |
|
Back to top |
|
 |
mqjeff |
Posted: Thu Jan 13, 2011 3:00 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You shouldn't need to do this.
You should perfectly be able to create an XML Schema that describes your XML message, including indicating that price is a Decimal field.
Then turn this into a message model and instruct the XMLNSC parser to use the message set and to build the tree using native types.
Then XMLNSC will do all of this for you, and your field will appear as a decimal already by the time it leaves your FileInput node. Or your FileInput node will receive an error that the message can not be validated against the message set if, for example, price is NOT a valid decimal number. |
|
Back to top |
|
 |
fatherjack |
Posted: Thu Jan 13, 2011 3:12 am Post subject: |
|
|
 Knight
Joined: 14 Apr 2010 Posts: 522 Location: Craggy Island
|
mqjeff wrote: |
Or your FileInput node will receive an error that the message can not be validated against the message set if, for example, price is NOT a valid decimal number. |
Isn't this what he's trying to prevent. For example the price is not a valid decimal because
Quote: |
The xml has newlines, tabs, spaces, and extra characters on either sides of the tag values |
It's these he's trying to remove to make the data valid.
You might argue that the sender of the file should fix the problem and I'd probably agree. _________________ Never let the facts get in the way of a good theory. |
|
Back to top |
|
 |
mqjeff |
Posted: Thu Jan 13, 2011 3:44 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
There exists a possibility that the XMLNSC parser will handle this whitespace and still treat the value as a valid decimal number.
A possibility - I've not tested it and it's as likely wrong as not. |
|
Back to top |
|
 |
Vitor |
Posted: Thu Jan 13, 2011 5:16 am Post subject: Re: Unable to TRIM spaces/tabs/newlines on either sides of a |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Diwakar wrote: |
The xml has newlines, tabs, spaces, and extra characters on either sides of the tag values (123.45, and John). |
How does the XML Schema which describes this document describe these fields?
Diwakar wrote: |
The xml in our file is a valid XML as we can open it in a browser and other xml software |
No, it's well formed. If it was valid it would match it's schema and wouldn't have all this rubbish in the tags.
How is this document produced? Is it actually written out by an application or has it been produced manually as test data by someone with an editor?
(Which it sounds like to me)
If so, you may be trying to fix a problem that isn't there. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
Diwakar |
Posted: Thu Jan 13, 2011 8:08 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2005 Posts: 33
|
Thanks everyone.
The xml is in production, produced by a non-human.
We are migrating from version MB v6.0 to MB v7.
We had a JText Adapter do it in version 6.0, using a schema that specified every field in the input xml as xsd:string of length 255.
We are re-writing the adapter's code in msgflows(replacing adapters) in MB v7, and we are using the same xml schemas. The schemas are used all over downstream, so we can't change the schema's at this point. Also, the fields are referred to as character strings at a lot of places in the code, the CAST to decimal is done for storing it at one point.
I was thinking it would be a simple thing to TRIM and then CAST, but somehow it's failing. The XMLNSC parser removes extra/unwanted spaces in between tags, but not inside a tag ( as it could be a valid value and probably needed). We need to write code to remove leading/trailing spaces from inside a tag's fieldvalue.
Here is my line of code that's not removing all the spaces:
SET OutputRoot.XMLNSC.Message.Header. price = TRIM(InputRoot.XMLNSC.Message.Header.price);
There are many elements inside the Header tag and I need to TRIM each of them. So, I coded a loop to copy the elements one-after-another from InputRoot to OutputRoot, just like in the given factory-supplied ESQL method copyMessageHeaders(); and that's also not removing all extra spaces/tabs from each element.
Does the TRIM function remove only spaces by default ? Do we need to code for tabs, newlines, and all other invisible characters to be removed explicitly ? Looks like it's behaving that way, but I am not sure.
Please advise.
Thanks. _________________ Diwakar. |
|
Back to top |
|
 |
Vitor |
Posted: Thu Jan 13, 2011 8:35 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Diwakar wrote: |
Does the TRIM function remove only spaces by default ? Do we need to code for tabs, newlines, and all other invisible characters to be removed explicitly ? Looks like it's behaving that way, but I am not sure. |
If you'd looked at the documentation for the TRIM function here not only would you be sure, but you'd have been sure a while ago, i.e. before posting here.
If you have a number of characters to remove, you might find TRANSLATE a more efficient method. _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
Diwakar |
Posted: Thu Jan 13, 2011 8:39 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2005 Posts: 33
|
Mr. Vitor,
I posted after looking at the info center. How would I know the syntax of the command if I hadn't looked where you pointed ?
It didn't work, and that's why I posted.
Thank _________________ Diwakar. |
|
Back to top |
|
 |
Vitor |
Posted: Thu Jan 13, 2011 8:42 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Diwakar wrote: |
I posted after looking at the info center. How would I know the syntax of the command if I hadn't looked where you pointed ?
|
Then how can you be unsure if the default trim for a CHARACTER is a space when that page says:
Quote: |
If trim_singleton is not specified, a default singleton is assumed. The default singleton depends on the data type of source_string:
Character ' ' (space)
BLOB X'00'
Bit B'0'
|
 _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
Diwakar |
Posted: Thu Jan 13, 2011 8:51 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2005 Posts: 33
|
Vitor,
Since we don't know what all invisible characters could appear inside tag values, do we need to write code to TRIM each of them (about 30 or so in ASCII) ? That sounds awkward for each tag-copy from InputRoot to OutputRoot.
I thought there's a better way.
Thanks. _________________ Diwakar. |
|
Back to top |
|
 |
mqjeff |
Posted: Thu Jan 13, 2011 8:53 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
You are in an awkward situation, where you have done the wrong thing up front - configuring an XSD that provides no meaningful definition of any of your elements.
You can either continue to provide awkward solutions for this situation, or you can correct the XSDs in use so that they do provide meaningful definitions. |
|
Back to top |
|
 |
Vitor |
Posted: Thu Jan 13, 2011 9:12 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
Diwakar wrote: |
I thought there's a better way. |
There is, and I suggested it.  _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
Vitor |
Posted: Thu Jan 13, 2011 9:13 am Post subject: |
|
|
 Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
mqjeff wrote: |
You are in an awkward situation, where you have done the wrong thing up front - configuring an XSD that provides no meaningful definition of any of your elements. |
 _________________ Honesty is the best policy.
Insanity is the best defence. |
|
Back to top |
|
 |
Diwakar |
Posted: Thu Jan 13, 2011 9:22 am Post subject: |
|
|
Apprentice
Joined: 06 Apr 2005 Posts: 33
|
I can't change the 'Awakward situation' for now, as I am not allowed to change all the code written already. So, I'll have to live with TRANSLATE I guess.
DECLARE invisibleCharList CHAR CAST(x'0d0a09 .....' AS CHAR ccsid 1208); -- all the invisible chars go in here
Declare outString CHAR TRANSLATE(sourceString, invisibleCharList);
So, the above 2 lines of code should do what I need ?
thanks. _________________ Diwakar. |
|
Back to top |
|
 |
|