How to handle unescaped control characters
ruturajw
Posted: Tue May 29, 2012 10:30 pm    Post subject: How to handle unescaped control characters

Newbie

Joined: 05 Jan 2010
Posts: 8

Hi,
I'm dealing with data that has been migrated to a database from a legacy source. The data wasn't scrubbed during migration and now contains control characters, e.g. RS and SUB, without < > around them.

I've a message flow which reads from this database and presents as XML output. The XML parser throws an exception when it encounters this data. The error is BIP5117 - Text = XMLHandler::error reported from the Xerces parser.196.Null pointer.1.1170.Invalid character (Unicode: 0x1A).

I've tried casting using CCSID 1208 (UTF-8) but no luck.

Would appreciate if you can offer any ideas.

Using WMB v7.0.0.3.

Cheers,
Ruturaj.
smdavies99
Posted: Tue May 29, 2012 10:46 pm

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

So the data you are working with isn't clean?
Think of it this way:
You are on a raft heading down the Niagara River towards the falls. No matter how hard you paddle and how close you get to shore, there is always a new obstacle getting in your way and stopping you from reaching safety.

RS & SUB? It sounds like the legacy data was in ASCII (possibly even 7-bit)
and was inserted into the DB in some random CCSID.

You could read the data and treat it as a blob and work out what bad (non-printable) characters are in it and remove them, but unless you go through the whole DB you can never be sure that you have got everything.
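
A rough sketch of the "work out what bad characters are in it" step, written in Java. The class and method names are made up for illustration, and it assumes the column bytes are in an ASCII-compatible encoding (plain ASCII or UTF-8), where values below 0x20 are the C0 control characters. Tab, line feed and carriage return are left out of the count because XML allows them.

```java
import java.util.Map;
import java.util.TreeMap;

final class ControlByteScan {
    /**
     * Counts every control byte (below 0x20, other than tab, LF and CR)
     * found in the data. Keys are byte values, values are occurrence counts.
     */
    static Map<Integer, Integer> countControlBytes(byte[] data) {
        Map<Integer, Integer> counts = new TreeMap<Integer, Integer>();
        for (byte b : data) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean allowedWhitespace = v == 0x09 || v == 0x0A || v == 0x0D;
            if (v < 0x20 && !allowedWhitespace) {
                Integer seen = counts.get(v);
                counts.put(v, seen == null ? 1 : seen + 1);
            }
        }
        return counts;
    }
}
```

Running something like this over a sample of rows gives a list of which control bytes actually occur (0x1A for SUB, 0x1E for RS and so on), though as noted it only tells you about the rows you scanned.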
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
kimbert
Posted: Wed May 30, 2012 12:51 am

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Quote:
I've a message flow which reads from this database and presents as XML output
So what's wrong with the data that is in the database? It looks to me as if the data is fine - but your message flow is putting that data, including the illegal-for-XML characters, into an XML message. That's a problem that your message flow needs to solve.
I think you need to put some code into your message flow that checks each string before assigning it to OutputRoot.XMLNSC. Remove or replace any characters that are not legal for XML.
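
As an illustration only, here is roughly what such a check could look like in Java. The class and method names are hypothetical, and how it gets called from the flow (for example from a JavaCompute node) is left open. The legal ranges are the ones in the XML 1.0 `Char` production: tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF.

```java
final class XmlCharFilter {
    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every XML-illegal character removed. */
    static String stripIllegalXmlChars(String in) {
        if (in == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);   // handles surrogate pairs
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out.toString();
    }
}
```

This drops SUB (0x1A) and RS (0x1E) while leaving ordinary text, accented letters and whitespace alone. To replace instead of remove, append a placeholder character in the branch where the check fails.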
ruturajw
Posted: Wed May 30, 2012 7:29 pm

Newbie

Joined: 05 Jan 2010
Posts: 8

kimbert wrote:
Quote:
I've a message flow which reads from this database and presents as XML output
So what's wrong with the data that is in the database? It looks to me as if the data is fine - but your message flow is putting that data, including the illegal-for-XML characters, into an XML message. That's a problem that your message flow needs to solve.
I think you need to put some code into your message flow that checks each string before assigning it to OutputRoot.XMLNSC. Remove or replace any characters that are not legal for XML.


And that's what I'm struggling with, i.e. checking for characters that are illegal in XML. My last resort (I think) is to check whether each character lies in the a-z, A-Z etc. ranges and drop it if not. This is cumbersome and I'm not sure it will work.
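
For comparison, a possibly less cumbersome route is to list the few characters to drop, not the many to keep. The Java sketch below (hypothetical names, not tested inside a broker) removes only the C0 control characters other than tab, LF and CR. That covers SUB (0x1A) and RS (0x1E), but it is an assumption that those are the only illegal characters in the data; it does not catch the other XML-illegal code points such as U+FFFE or unpaired surrogates.

```java
import java.util.regex.Pattern;

final class ControlCharRegex {
    // C0 controls except tab (0x09), line feed (0x0A) and carriage return (0x0D).
    private static final Pattern C0_CONTROLS =
        Pattern.compile("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]");

    /** Removes C0 control characters; everything else is kept as-is. */
    static String dropC0Controls(String in) {
        return in == null ? null : C0_CONTROLS.matcher(in).replaceAll("");
    }
}
```

Unlike an a-z/A-Z whitelist, this keeps digits, punctuation and non-English letters untouched.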
ruturajw
Posted: Wed May 30, 2012 7:49 pm

Newbie

Joined: 05 Jan 2010
Posts: 8

smdavies99 wrote:

You could read the data and treat it as a blob


Hi, tried this and it failed too. Casting to blob throws an exception.
smdavies99
Posted: Wed May 30, 2012 10:41 pm

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

ruturajw wrote:
smdavies99 wrote:

You could read the data and treat it as a blob


Hi, tried this and it failed too. Casting to blob throws an exception.


What I meant was that you read the message as a BLOB. Work on that to remove the bad characters and then parse it into something that can be output as XML.
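
Something along these lines, sketched in Java with made-up names. It assumes the bytes are UTF-8 (CCSID 1208) or plain ASCII, where every byte below 0x80 is a complete character on its own, so dropping control bytes cannot break a multi-byte sequence. For an EBCDIC CCSID the byte values would be different and this would not apply as written.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

final class BlobCleaner {
    /** Copies the input, skipping control bytes other than tab, LF and CR. */
    static byte[] stripControlBytes(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(raw.length);
        for (byte b : raw) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean keep = v >= 0x20 || v == 0x09 || v == 0x0A || v == 0x0D;
            if (keep) {
                out.write(v);
            }
        }
        return out.toByteArray();
    }

    /** Cleans the raw bytes first, then decodes them as UTF-8 text. */
    static String cleanAndDecode(byte[] raw) {
        return new String(stripControlBytes(raw), Charset.forName("UTF-8"));
    }
}
```

The cleaned text can then be handed to whatever builds the XML output, since the control bytes are gone before anything tries to parse or serialize it.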