MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » MQInput Node xml validation

dbennett9283
Posted: Wed Jan 21, 2009 6:45 am    Post subject: MQInput Node xml validation

Novice

Joined: 30 Aug 2006
Posts: 11

I have an MQInput Node w/Message Domain XML and Validate=None. This node is allowing invalid characters through that are causing trouble in a downstream process that is performing a web service call. See xml segment below. The second AddressLine contains extended ASCII characters that should probably be tossed out on the inbound parsing but are not. I am up on WBIMB v5.0.6. Any ideas on how to tighten up the validation to toss an exception if invalid xml such as this comes through?

<Address xsi:type="RetailTransactionAddress360">
<AddressLine>1117 S. RIVER ST. UNIT A</AddressLine>
<AddressLine>*CASH ONLYññññññññññ</AddressLine>
<City>SANTA ANA</City>
<State>CA</State>
<PostalCode>92707</PostalCode>
<CountryCode/>
</Address>
kimbert
Posted: Wed Jan 21, 2009 7:50 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I have an MQInput Node w/Message Domain XML and Validate=None
You are dealing with web services. Web services always use namespaces. XML domain does not support namespaces.
You should switch to XMLNS ( as you're on v5, XMLNSC is not available ).
btw, the XML/XMLNS parsers don't even look at the 'Validate' setting. That is for schema validation only.
Quote:
The second AddressLine contains extended ASCII characters that should probably be tossed out on the inbound parsing but are not
No need to use words like 'probably'. The XML specification clearly states the set of characters which are legal for an XML document. I would be surprised if the WMB XML parser was getting this wrong, but it's possible.
You should also check that you are parsing using the correct code page - using the wrong CCSID can generate illegal chars.
dbennett9283
Posted: Wed Jan 21, 2009 8:44 am

Novice

Joined: 30 Aug 2006
Posts: 11

kimbert - Thank you for your response. The downstream web service call that I was referring to is contained in a separate message flow. The message flow that I am having an issue with simply pulls in the xml message, converts the message to BLOB data in a Compute Node, then writes the resulting output to a MQSeries queue for backend mainframe processing. The offending character that is "sneaking" through the upfront parser is #xF1 -- the W3C specification (4.2.2) states that characters above #x7F should be 'escaped'. This contradiction explains why I used the word "probably" to highlight an apparent shortcoming in Broker xml parsing behavior. I suppose I could add some sort of filtering logic in the Compute Node, but this solution could not easily be applied to the many fields that are being mapped from xml to BLOB...
kimbert
Posted: Wed Jan 21, 2009 12:01 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

You should be looking at section 2.2, not 4.2.2:
http://www.w3.org/TR/2006/REC-xml-20060816/#NT-Char
According to section 2.2, Unicode character 0xF1 is valid in an XML document.

Section 4.2.2 is "External Entities"; it covers the escaping of disallowed characters in URI references (system identifiers), not the characters allowed in document content.
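
For anyone who wants to check a string against the Char production in section 2.2 mechanically, here is a minimal Python sketch. The function names are made up for illustration; this is not anything Broker provides, only a direct transcription of the ranges in the spec.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # most of the Basic Multilingual Plane
        or 0xE000 <= cp <= 0xFFFD      # skips the surrogate block and FFFE/FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_illegal_char(text: str):
    """Return (index, code point) of the first illegal character, or None."""
    for index, ch in enumerate(text):
        if not is_xml10_char(ch):
            return index, ord(ch)
    return None


# U+00F1 falls inside 0x20-0xD7FF, so this address line passes the check.
print(first_illegal_char("*CASH ONLY\u00f1\u00f1"))
```

Note that this works on characters that have already been decoded. It says nothing about whether the bytes were decoded with the right code page.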
dbennett9283
Posted: Fri Jan 23, 2009 12:03 pm

Novice

Joined: 30 Aug 2006
Posts: 11

On the subsequent web service call via HTTP Request node, the following SOAP segment is generated in MRM and sent to WebSphere Application Server 6.1.

<NS9:customerInfo xmlns:NS9="http://XYZRewards.com">
<NS9:customerNumber>12024547</NS9:customerNumber>
<NS9:firstName>patrick</NS9:firstName>
<NS9:lastName>pazier</NS9:lastName>
<NS9:addressLine1>3222 test</NS9:addressLine1>
<NS9:addressLine2>*CASH ONLYññññññññññ</NS9:addressLine2>
<NS9:city>phila</NS9:city>
<NS9:state>PA</NS9:state>
<NS9:postalCode>19132</NS9:postalCode>
<NS9:country/>
<NS9:emailAddress>test444@test.com</NS9:emailAddress>
<NS9:telephoneNumber>2155551212</NS9:telephoneNumber>
</NS9:customerInfo>

An HTTP Internal Server Error (500) is generated with the following faultstring:

org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0xffffffff) was found in the element content of the document. Message being parsed:

Any idea as to why Message Broker has no issue with these characters but WAS 6.1 does? Thanks for any support you can provide...
kimbert
Posted: Fri Jan 23, 2009 12:33 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
Any idea as to why Message Broker has no issue with these characters but WAS 6.1 does?
It is possible that message broker and WAS are seeing different characters. Remember that they both receive a string of bytes. In order to convert those bytes to characters, they use a code page. So if they are using different code pages, then they will see different characters.
I don't blame you for being suspicious, but this is unlikely to be a broker defect. The core XML parser used by broker is tried and tested code.
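
A small Python sketch of the point about code pages, assuming the byte on the wire is 0xF1 as reported earlier in the thread. The two code pages are picked only as examples; which ones Broker and WAS actually use here has not been established.

```python
# One byte, two code pages, two different characters.
raw = bytes([0xF1])

as_latin1 = raw.decode("iso-8859-1")  # ISO-8859-1 (Latin-1)
as_cp437 = raw.decode("cp437")        # IBM PC code page 437

print(repr(as_latin1), hex(ord(as_latin1)))
print(repr(as_cp437), hex(ord(as_cp437)))
```

The byte never changes. Only the table used to interpret it does.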
dbennett9283
Posted: Mon Jan 26, 2009 12:33 pm

Novice

Joined: 30 Aug 2006
Posts: 11

Shown below is another SOAP excerpt. This is what is being passed from WBIMB to WAS via the service call. Could 'encoding="UTF-8"' on the xml declaration be converting the extended ascii characters in "addressLine2" to invalid control characters?

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header>
<MessageType>HTTP_Request</MessageType>
</soapenv:Header>
<soapenv:Body>
<NS1:submitPurchase xmlns:NS1="http://enterprise.xyz.com">
<NS1:receiptIn>
<NS2:storeID xmlns:NS2="http://xyzRewards.com">8831</NS2:storeID>
<NS3:registerID xmlns:NS3="http://xyzRewards.com">8202</NS3:registerID>
<NS4:receiptID xmlns:NS4="http://xyzRewards.com">001845</NS4:receiptID>
<NS5:loyaltyMemberID xmlns:NS5="http://xyzRewards.com">440000105444</NS5:loyaltyMemberID>
<NS6:employeeID xmlns:NS6="http://xyzRewards.com"/>
<NS7:cashierID xmlns:NS7="http://xyzRewards.com">000020034 </NS7:cashierID>
<NS8:receiptDateTime xmlns:NS8="http://xyzRewards.com">2008-07-02T09:48:38</NS8:receiptDateTime>
<NS9:customerInfo xmlns:NS9="http://xyzRewards.com">
<NS9:customerNumber>12024547</NS9:customerNumber>
<NS9:firstName>patrick</NS9:firstName>
<NS9:lastName>pazier</NS9:lastName>
<NS9:addressLine1>3222 test</NS9:addressLine1>
<NS9:addressLine2>*CASH ONLYññññññññññ</NS9:addressLine2>
<NS9:city>phila</NS9:city>
<NS9:state>PA</NS9:state>
<NS9:postalCode>19132</NS9:postalCode>
<NS9:country/>
<NS9:emailAddress>test444@test.com</NS9:emailAddress>
<NS9:telephoneNumber>2155551212</NS9:telephoneNumber>
</NS9:customerInfo>
kimbert
Posted: Mon Jan 26, 2009 3:29 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
This is what is being passed from WBIMB to WAS via the service call
Not necessarily. WBIMB is passing bytes, not characters to WAS. WAS converts those bytes into characters using whatever code page it decides to use ( probably UTF-8, as specified in the header, but it might be overridden by a message header. I don't know what rules WAS uses to choose the code page ).
Quote:
Could 'encoding="UTF-8"' on the xml declaration be converting the extended ascii characters in "addressLine2" to invalid control characters?
It depends what you mean. Message broker holds those characters in UTF16 when they are in the message tree. I presume that your Compute node is using ASBITSTREAM to convert the message tree to a BLOB. So you must be specifying a code page for ASBITSTREAM to use. The resulting XML document will be a stream of bytes in the specified code page. So if your message flow is not producing a stream of UTF-8 bytes, then you should either change the XML declaration or change your flow.
Or to put it more simply: is the document written by WBIMB really in UTF-8? Have you tested it in a browser, or with some other XML parser?
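
One quick way to answer that question outside Broker is to capture the output bytes and try to decode them strictly as UTF-8. A minimal Python sketch follows; the helper name and the sample strings are illustrative, not taken from the real message.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


text = "*CASH ONLY\u00f1"
latin1_bytes = text.encode("iso-8859-1")  # the n-tilde becomes the single byte F1
utf8_bytes = text.encode("utf-8")         # the n-tilde becomes the two bytes C3 B1

print(decodes_as_utf8(latin1_bytes))
print(decodes_as_utf8(utf8_bytes))
```

In UTF-8 a lone 0xF1 byte announces a four-byte sequence, so a strict UTF-8 parser that receives single-byte 0xF1 characters under an encoding="UTF-8" declaration has good reason to reject the document.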
dbennett9283
Posted: Wed Jan 28, 2009 6:30 am

Novice

Joined: 30 Aug 2006
Posts: 11

Kimbert - Thanks again for your help as your responses have been very helpful. One more question...I tried tinkering w/the HTTP Request Header in order to pass the proper code page to WAS so these strange characters would stop error-ing out. I did this by unchecking "Generate default HTTP headers from input" on the HTTP Request Node properties setting and included the following code in the front-end Compute Node:

SET OutputRoot.HTTPRequestHeader."POST" = '/XYZzzzServiceRouter/services/ZzzService';
SET OutputRoot.HTTPRequestHeader."Host" = 'xyzwasenttest.xyz.com';
SET OutputRoot.HTTPRequestHeader."Content-Type" = 'text/xml; charset=ISO-8859-1';
SET OutputRoot.HTTPRequestHeader."SOAPAction" = '';

Shown below are the results...Bottom line...I do not seem to be having the desired effect of altering the code page. Am I going about this the right way? Does the HTTP Request Header govern the code page used by the called service to decode the incoming bitstream?

HTTPRequestHeader
POST:CHARACTER:/XYZzzzServiceRouter/services/ZzzService
Host:CHARACTER:xyzwasenttest.xyz.com
Content-Type:CHARACTER:text/xml; charset=ISO-8859-1
SOAPAction:CHARACTER:
Content-Length:CHARACTER:150

HTTPResponseHeader
X-Original-HTTP-Status-Line:CHARACTER:HTTP/1.0 500 Internal Server Error
X-Original-HTTP-Status-Code:INTEGER:500
Content-Type:CHARACTER:text/xml; charset=utf-8
Content-Language:CHARACTER:en-US
Content-Length:CHARACTER:506
Date:CHARACTER:Wed, 28 Jan 2009 14:20:22 GMT
Server:CHARACTER:WebSphere Application Server/6.1

faultstring..:
WSWS3400I: Info: unexpected exception.
mqjeff
Posted: Wed Jan 28, 2009 6:38 am

Grand Master

Joined: 25 Jun 2008
Posts: 17447

It doesn't matter what determines the codepage used by the called service to decode the incoming bitstream, if the incoming bitstream is inconsistent in the codepage it is encoded with.

Kimbert pointed you at your ASBITSTREAM. You decided to adjust your HTTP headers. Why?
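
To illustrate the consistency point: a charset label in Content-Type only describes the bytes, it does not transcode them. A rough Python sketch, where build_request is a hypothetical helper and not a Broker or WAS API, encodes the body once and derives the headers from those same bytes.

```python
def build_request(xml_text: str, charset: str):
    """Encode the body once and derive both headers from the same bytes.

    Hypothetical helper for illustration only. The encoding named in the
    XML declaration inside xml_text should agree with charset as well.
    """
    body = xml_text.encode(charset)
    headers = {
        "Content-Type": f"text/xml; charset={charset}",
        "Content-Length": str(len(body)),
    }
    return headers, body


doc = "<a>\u00f1</a>"
utf8_headers, utf8_body = build_request(doc, "utf-8")
latin1_headers, latin1_body = build_request(doc, "iso-8859-1")

# Same characters, different bytes, and a matching label in each case.
print(utf8_headers, utf8_body)
print(latin1_headers, latin1_body)
```

Changing only the header, as in the post above, leaves the body bytes exactly as they were.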
dbennett9283
Posted: Wed Jan 28, 2009 11:53 am

Novice

Joined: 30 Aug 2006
Posts: 11

ASBITSTREAM-to-BLOB does not apply in my case as I am building my SOAP message via MRM in the XMLNS domain. The HTTP Request Header has a "charset" value on "Content-Type" and I assumed that that value allows Broker to pass the extended ASCII codepage on to the called service, in this case WAS. If that is an invalid assumption, please pass along the correct procedure. I would appreciate your help.
kimbert
Posted: Wed Jan 28, 2009 2:50 pm

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
ASBITSTREAM-to-BLOB does not apply in my case
You could easily have said so in your last reply. Help us to help you.

This should be simple to debug if you go back to first principles. Please perform all of the following steps.
1. Insert a Trace node immediately before the HTTP Request node. Look at the value of addressLine2 in the Trace node output. What characters do you see in the UTF-16 string which is in the message tree?
2. Intercept the output message and look at the bytes of the output message. Are they the bytes that you were expecting to see ( i.e. the UTF-8 representation of the message tree )? If not, check OutputRoot.Properties.CodedCharSetId ( your Trace node output will show the properties folder )
3. If 2. was OK, then message broker has done its job, and the solution lies with WAS. It must be using the wrong code page. It is now your job to find out why.

When you reply to this thread, please respond to all of the above points.
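
For step 2, once the output message has been captured to a file or buffer, a few lines of Python are enough to look at the raw bytes after a given tag. This is only a sketch: hex_context is a made-up helper, and the sample message is built in the script rather than captured from a real flow.

```python
def hex_context(data: bytes, marker: bytes, width: int = 24) -> str:
    """Return a space-separated hex dump of the bytes that follow marker."""
    start = data.find(marker)
    if start < 0:
        raise ValueError("marker not found")
    begin = start + len(marker)
    return " ".join(f"{byte:02x}" for byte in data[begin:begin + width])


# Stand-in for the intercepted message: what a correct UTF-8 serialisation
# of '*CASH ONLY' followed by one n-tilde would look like.
captured = "<addressLine2>*CASH ONLY\u00f1</addressLine2>".encode("utf-8")

print(hex_context(captured, b"<addressLine2>", 12))
```

In a genuine UTF-8 document each ñ shows up as the pair c3 b1. A bare f1 would mean the bytes are in a single-byte code page instead.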
mqjeff
Posted: Wed Jan 28, 2009 3:25 pm

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Kimbert - That will only prove that the problem lies with WAS if the bad characters were not in the input message to the flow in the first place. Which hasn't been shown...
dbennett9283
Posted: Thu Jan 29, 2009 6:48 am

Novice

Joined: 30 Aug 2006
Posts: 11

Kimbert - I followed your instructions. Shown below is the trace output produced right before the HTTP Request Node. I've highlighted both the properties ccsid setting and the addressLine2 value per your suggestion. It does indeed appear that the character has changed from ñ to ±. Very interesting...why did this occur?
- -
- - Trace Node..: xyzServiceRequest.msgflow - Trace #2
- -
- - Date-Time...: 2009-01-29 09:02:43.722803
- -
LocalEnvironment: (
(0x01000000):Properties = (
(0x03000000):MessageSet = 'I5M4H2K002001'
(0x03000000):MessageType = 'Envelope'
(0x03000000):MessageFormat = 'XML1'
(0x03000000):Encoding = 546
(0x03000000):CodedCharSetId = 437
(0x03000000):Transactional = TRUE
(0x03000000):Persistence = FALSE
(0x03000000):CreationTime = GMTTIMESTAMP '2009-01-29 14:02:32.050'
(0x03000000):ExpirationTime = -1
(0x03000000):Priority = 0
(0x03000000):ReplyIdentifier = X'000000000000000000000000000000000000000000000000'
(0x03000000):ReplyProtocol = 'MQ'
(0x03000000):Topic = NULL
(0x03000000):ContentType = ''
(0x03000000):IdentitySourceType = ''
(0x03000000):IdentitySourceToken = ''
(0x03000000):IdentitySourcePassword = ''
(0x03000000):IdentitySourceIssuedBy = ''
(0x03000000):IdentityMappedType = ''
(0x03000000):IdentityMappedToken = ''
(0x03000000):IdentityMappedPassword = ''
(0x03000000):IdentityMappedIssuedBy = ''
)
(0x01000000):MQMD = (
(0x03000000):SourceQueue = 'LOYALEXT.RECV'
(0x03000000):Transactional = TRUE
(0x03000000):Encoding = 546
(0x03000000):CodedCharSetId = 437
(0x03000000):Format = ' '
(0x03000000):Version = 2
(0x03000000):Report = 0
(0x03000000):MsgType = 8
(0x03000000):Expiry = -1
(0x03000000):Feedback = 0
(0x03000000):Priority = 0
(0x03000000):Persistence = 0
(0x03000000):MsgId = X'414d51205742524b36315f44454641553cb5814920002815'
(0x03000000):CorrelId = X'000000000000000000000000000000000000000000000000'
(0x03000000):BackoutCount = 0
(0x03000000):ReplyToQ = ' '
(0x03000000):ReplyToQMgr = 'WBRK61_DEFAULT_QUEUE_MANAGER '
(0x03000000):UserIdentifier = 'db2admin '
(0x03000000):AccountingToken = X'16010515000000f094bd96fe266e8037da94230404000000000000000000000b'
(0x03000000):ApplIdentityData = ' '
(0x03000000):PutApplType = 11
(0x03000000):PutApplName = 'nstallation\ih03\rfhutil.exe'
(0x03000000):PutDate = DATE '2009-01-29'
(0x03000000):PutTime = GMTTIME '14:02:32.050'
(0x03000000):ApplOriginData = ' '
(0x03000000):GroupId = X'000000000000000000000000000000000000000000000000'
(0x03000000):MsgSeqNumber = 1
(0x03000000):Offset = 0
(0x03000000):MsgFlags = 0
(0x03000000):OriginalLength = -1
)
)
Environment.....: (
(0x01000000):Variables = (
(0x03000000):OutputMsg = '7931 101 001845 2008-07-02 09:48:38 3 '
(0x01000000):SOAP_Hdr = (
(0x03000000):MessageType = 'HTTP_Request'
)
)
)
Root............: (
(0x01000000):Properties = (
(0x03000000):MessageSet = NULL
(0x03000000):MessageType = NULL
(0x03000000):MessageFormat = NULL
(0x03000000):Encoding = NULL
(0x03000000):CodedCharSetId = NULL
(0x03000000):Transactional = NULL
(0x03000000):Persistence = NULL
(0x03000000):CreationTime = NULL
(0x03000000):ExpirationTime = NULL
(0x03000000):Priority = NULL
(0x03000000):ReplyIdentifier = NULL
(0x03000000):ReplyProtocol = 'MQ'
(0x03000000):Topic = NULL
(0x03000000):ContentType = NULL
(0x03000000):IdentitySourceType = NULL
(0x03000000):IdentitySourceToken = NULL
(0x03000000):IdentitySourcePassword = NULL
(0x03000000):IdentitySourceIssuedBy = NULL
(0x03000000):IdentityMappedType = NULL
(0x03000000):IdentityMappedToken = NULL
(0x03000000):IdentityMappedPassword = NULL
(0x03000000):IdentityMappedIssuedBy = NULL
)
(0x01000010):XMLNS = (
(0x05000018): = (
(0x06000011): = '1.0'
(0x06000012): = 'UTF-8'
)
(0x01000000)http://schemas.xmlsoap.org/soap/envelope/:Envelope = (
(0x01000000)http://schemas.xmlsoap.org/soap/envelope/:Header = (
(0x01000000):MessageType = (
(0x02000000): = 'HTTP_Request'
)
)
(0x07000012)xmlns:soapenv = 'http://schemas.xmlsoap.org/soap/envelope/'
(0x01000000)http://schemas.xmlsoap.org/soap/envelope/:Body = (
(0x01000000)http://loyalty.zzz.enterprise.xyz.com:submitPurchase = (
(0x01000000)http://loyalty.zzz.enterprise.xyz.com:receiptIn = (
(0x01000000)http://xyzRewards.com:storeID = (
(0x02000000): = '7931'
)
(0x01000000)http://xyzRewards.com:registerID = (
(0x02000000): = '101'
)
(0x01000000)http://xyzRewards.com:receiptID = (
(0x02000000): = '001845'
)
(0x01000000)http://xyzRewards.com:loyaltyMemberID = (
(0x02000000): = '99555555457'
)
(0x01000000)http://xyzRewards.com:employeeID = (
(0x02000000): = ''
)
(0x01000000)http://xyzRewards.com:cashierID = (
(0x02000000): = '000020034 '
)
(0x01000000)http://xyzRewards.com:receiptDateTime = (
(0x02000000): = '2008-07-02T09:48:38'
)
(0x01000000)http://xyzRewards.com:customerInfo = (
(0x01000000)http://xyzRewards.com:customerNumber = (
(0x02000000): = '12024547'
)
(0x01000000)http://xyzRewards.com:firstName = (
(0x02000000): = 'patrick'
)
(0x01000000)http://xyzRewards.com:lastName = (
(0x02000000): = 'pazier'
)
(0x01000000)http://xyzRewards.com:addressLine1 = (
(0x02000000): = '3222 test'
)
(0x01000000)http://xyzRewards.com:addressLine2 = (
(0x02000000): = '*CASH ONLY±±±±±±±±±±'
)
(0x01000000)http://xyzRewards.com:city = (
(0x02000000): = 'phila'
)
(0x01000000)http://xyzRewards.com:state = (
(0x02000000): = 'PA'
)
(0x01000000)http://xyzRewards.com:postalCode = (
(0x02000000): = '19132'
)
(0x01000000)http://xyzRewards.com:country = (
(0x02000000): = ''
)
(0x01000000)http://xyzRewards.com:emailAddress = (
(0x02000000): = 'test444@test.com'
)
(0x01000000)http://xyzRewards.com:telephoneNumber = (
(0x02000000): = '2155551212'
)
)
(0x01000000)http://xyzRewards.com:lineItems = (
(0x01000000)http://xyzRewards.com:lineItemType = (
(0x02000000): = 'ReturnCore'
)
(0x01000000)http://xyzRewards.com:sequenceNumber = (
(0x02000000): = '00001'
)
(0x01000000)http://xyzRewards.com:voidFlag = (
(0x02000000): = '0'
)
(0x01000000)http://xyzRewards.com:departmentID = (
(0x02000000): = '25'
)
(0x01000000)http://xyzRewards.com:catgegoryID = (
(0x02000000): = '123'
)
(0x01000000)http://xyzRewards.com:subCatgegoryID = (
(0x02000000): = '040'
)
(0x01000000)http://xyzRewards.com:itemNumber = (
(0x02000000): = '9260032'
)
(0x01000000)http://xyzRewards.com:description = (
(0x02000000): = 'REMANUFACTURED WIDGET'
)
(0x01000000)http://xyzRewards.com:quantity = (
(0x02000000): = '0'
)
(0x01000000)http://xyzRewards.com:extendedAmount = (
(0x02000000): = '-0000040.00'
)
(0x01000000)http://xyzRewards.com:discountAmount = (
(0x02000000): = '00000.00'
)
(0x01000000)http://xyzRewards.com:taxAmount = (
(0x02000000): = '-003.30'
)
(0x01000000)http://xyzRewards.com:orderType = (
(0x02000000): = 'RegOrder'
)
(0x01000000)http://xyzRewards.com:originalStoreID = (
(0x02000000): = '7931'
)
(0x01000000)http://xyzRewards.com:originalRegisterID = (
(0x02000000): = '101'
)
(0x01000000)http://xyzRewards.com:originalReceiptID = (
(0x02000000): = '001844'
)
(0x01000000)http://xyzRewards.com:originalReceiptDateTime = (
(0x02000000): = '2008-07-01T09:48:38'
)
)
(0x01000000)http://xyzRewards.com:paymentsTendered = (
(0x01000000)http://xyzRewards.com:tenderType = (
(0x02000000): = 'Cash'
)
(0x01000000)http://xyzRewards.com:amount = (
(0x02000000): = '-0000043.30'
)
(0x01000000)http://xyzRewards.com:cardNumber = (
(0x02000000): = ''
)
)
)
)
)
)
)
)
-+-
-+- - -+- - -+- - -+- - -+- - -+- - -+- - -+- - -+-
-+-
kimbert
Posted: Thu Jan 29, 2009 7:47 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Hi,

Before you do anything else, you need to understand the basic facts about code pages and Unicode. This article by Joel Spolsky is good: http://www.joelonsoftware.com/articles/Unicode.html

When you understand what Unicode is you will not need to ask questions like this one
Quote:
It does indeed appear that the character has changed from ñ to ±. Very interesting...why did this occur?

[edit]That sounds really bad. What I'm trying to say is that the code page determines the character. So it's not really very interesting or surprising.[/edit]
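
As a concrete illustration, the following Python sketch searches a short list of candidate code pages for writer/reader pairs that would turn ñ into ±. The candidate list is an assumption chosen for the example; the trace above only tells us that the message arrived labelled with CodedCharSetId = 437.

```python
def explain_swap(sent: str, seen: str, codecs):
    """List (writer, reader) codec pairs that turn `sent` into `seen`."""
    pairs = []
    for writer in codecs:
        try:
            raw = sent.encode(writer)
        except UnicodeEncodeError:
            continue  # this code page cannot represent `sent` at all
        for reader in codecs:
            try:
                if raw.decode(reader) == seen:
                    pairs.append((writer, reader))
            except UnicodeDecodeError:
                continue  # the bytes are not valid in this code page
    return pairs


candidates = ["iso-8859-1", "cp1252", "cp437", "cp850", "utf-8"]
pairs = explain_swap("\u00f1", "\u00b1", candidates)

for writer, reader in pairs:
    print(f"written as {writer}, read as {reader}")
```

Text written as ISO-8859-1 and read as code page 437 is one of the pairs it reports, which would be consistent with a sender producing Latin-1 bytes under an MQMD that claims 437. That is a hypothesis to verify against the sending application, not a conclusion.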