MQSeries.net :: View topic - XML Message Truncated while routing through HTTP Request

XML Message Truncated while routing through HTTP Request

mqbrks

Posted: Wed Oct 24, 2018 10:56 am Post subject:

Voyager

Joined: 17 Jan 2012
Posts: 75

Vitor wrote:

Connect Direct doesn't do a binary to UTF-8 conversion; it either does an EBCDIC to ASCII conversion or no conversion. If it's converting a file with binary data as if the binary is a set of EBCDIC characters, all sorts of weirdness will occur. Like strange EOF marks turning up in the middle of a file.

This is the process that is being used by CD. It is converting IBM-037 to UTF-8, as we had several data problems when the file was sent as binary, with a lot of invalid characters sent by the mainframe. (The forum renders the "8)" that follows "UTF-" as a smiley; the snippet below shows the intended text.)

COPYCONR PROCESS
STEP1 COPY FROM(&NODE DSN=&DSN1 DISP=SHR -
SYSOPTS="CODEPAGE=(IBM-037,UTF-8)") -
TO ( DSN=&DSN2 DISP=(&DISP1,&DISP2) -
SYSOPTS=":datatype=text:xlate=no:strip.blanks=no:")
IF (STEP1=0) THEN
STEP2 RUN TASK SNODE (PGM=UNIX) SYSOPTS="mv &DSN2 &DSN3"
EIF

Vitor wrote:

Get whoever owns the mainframe JCL that's doing the Connect Direct transfer to add a Sort jobstep before the Connect Direct jobstep. It's 5 lines of JCL and half a dozen sort control cards. The Connect Direct step is probably double that.

Yeah, this requires a change on the mainframe, and resources aren't available to make any changes.

Vitor

Posted: Wed Oct 24, 2018 11:15 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

mqbrks wrote:

") -
TO ( DSN=&DSN2 DISP=(&DISP1,&DISP2) -
SYSOPTS=":datatype=text:xlate=no:strip.blanks=no:")
IF (STEP1=0) THEN
STEP2 RUN TASK SNODE (PGM=UNIX) SYSOPTS="mv &DSN2 &DSN3"
EIF

Exactly my point. That 'datatype=text' is telling Connect Direct that the file is composed of nothing but EBCDIC characters and it should move them all from CCSID 037 to UTF-8. If the mainframe file is not in fact all text then this conversion will result in spurious character sequences and the results you are seeing.
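To illustrate the point outside Connect Direct, here is a minimal Python sketch. It assumes Python's `cp037` codec is a fair stand-in for CCSID 037, and the record (three text characters followed by a 4-byte binary integer) is hypothetical, not the poster's actual file layout.

```python
import struct

# Hypothetical record: a 3-character EBCDIC text field followed by a
# 4-byte big-endian binary integer holding the value 37 (0x00000025).
record = "ABC".encode("cp037") + struct.pack(">I", 37)

def convert_as_text(data: bytes) -> bytes:
    """Treat every byte as an IBM-037 character and re-encode it as UTF-8."""
    return data.decode("cp037").encode("utf-8")

converted = convert_as_text(record)

# The text field survives the conversion.
text_part = converted[:3]

# The binary field does not: byte 0x25 is the EBCDIC line feed, so it is
# "translated" to 0x0A and the integer no longer holds its original value.
number = struct.unpack(">I", converted[3:7])[0]

print(text_part, number)
```

The integer comes back as 10 instead of 37, and a line feed now sits in the middle of the record, which is the kind of spurious sequence a text-mode transfer can introduce into binary fields.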

mqbrks wrote:

It is converting IBM-037 to UTF-8, as we had several data problems when the file was sent as binary, with a lot of invalid characters sent by the mainframe.

You mean all of the characters?

EBCDIC (IBM-037) has nothing in common with ASCII (and UTF-8 is just a superset of ASCII). If you move an entirely text file from the mainframe, as binary, onto a Unix or any other distributed box, it will appear to be gibberish with a limited number of printable characters and almost no alphanumerics. This is simply because alphanumerics are represented by different hex values in EBCDIC.
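A small Python sketch of the same effect, assuming the `cp037` codec represents CCSID 037 and using Latin-1 as a stand-in for "some ASCII-family code page" on the distributed box:

```python
# Text as it would sit in a mainframe file written in CCSID 037.
text = "HELLO 123"
ebcdic_bytes = text.encode("cp037")

# The same bytes, moved as binary and read with an ASCII-family code page.
as_seen_on_unix = ebcdic_bytes.decode("latin-1")

# The same bytes, read with the code page they were written in.
as_intended = ebcdic_bytes.decode("cp037")

# Count how many ASCII letters or digits survive the wrong interpretation.
ascii_alnum = [c for c in as_seen_on_unix if c.isascii() and c.isalnum()]

print(ebcdic_bytes.hex(" "))
print(repr(as_seen_on_unix), ascii_alnum)
print(as_intended)
```

None of the bytes change in transit; only the code page used to interpret them decides whether the result is readable.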

mqbrks wrote:

Yeah, this requires a change on the mainframe, and resources aren't available to make any changes.

Then you're doomed. The fix to your XML truncation problem is to change the Connect Direct datatype to binary and remove that codepage clause (which if memory serves me causes a syntax error if the datatype isn't text). You can then read the file through the File Input node by setting the File Input node code page to '037' not 'Broker Default'.

If you can't make that 2 line change because there are no resources then learn to live with this truncation problem.

If you find someone, get them to cut and paste this above the Connect Direct step (the EXEC card above where those parameters go into SYSIN):

Code:

//SIMPLE EXEC PGM=SORT
//*
//* THIS IS MUCH MORE EFFICIENT THAN DOING A SORT IN IIB
//*
//SORTIN DD DSN=&DSN1,DISP=(MOD,KEEP,KEEP)
//SORTOUT DD DSN=&DSN1,DISP=(MOD,KEEP,KEEP)
//SYSOUT DD SYSOUT=*
//SYSIN DD *
however the file needs to be sorted
/*

Normally I charge $$ for coding. You're welcome.
_________________
Honesty is the best policy.
Insanity is the best defence.

fjb_saper

Posted: Wed Oct 24, 2018 12:53 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

mqbrks wrote:

") -
TO ( DSN=&DSN2 DISP=(&DISP1,&DISP2) -
SYSOPTS=":datatype=text:xlate=no:strip.blanks=no:")
IF (STEP1=0) THEN
STEP2 RUN TASK SNODE (PGM=UNIX) SYSOPTS="mv &DSN2 &DSN3"
EIF

I beg you to notice that in the SYSOPTS for the copy process you have specified XLATE=NO.
So why would you expect to see the text in UTF-8 at the other end if you specifically told Connect Direct not to translate it???

_________________
MQ & Broker admin

Vitor

Posted: Thu Oct 25, 2018 5:07 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

Specifying 2 code pages causes code page transformation. Irrespective of xlate setting.

Don't ask - apparently it's a "feature".

We send everything through Connect Direct as binary and figure it out someplace else, and use Connect Direct for the management & auditing capabilities.

mqbrks

Posted: Thu Oct 25, 2018 6:09 pm Post subject:

Voyager

Joined: 17 Jan 2012
Posts: 75

Vitor wrote:

Specifying 2 code pages causes code page transformation. Irrespective of xlate setting.

Don't ask - apparently it's a "feature".

We send everything through Connect Direct as binary and figure it out someplace else, and use Connect Direct for the management & auditing capabilities.

Really appreciate your knowledge regarding this, Vitor! I am still investigating between my deadlines for other projects. It seems like the file does have invalid characters. Need to explore more.

Question: Previously we tried to receive the file as binary, but the mainframe app was sending many invalid characters like || (something like a pipe), which threw parsing errors while the data was being converted. We went with the CD solution because the CD experts advised doing the code page conversion in CD rather than in IIB DFDL, as that will filter out most of the invalid or gibberish characters. How can IIB filter the gibberish characters?

Vitor

Posted: Fri Oct 26, 2018 5:03 am Post subject:

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

mqbrks wrote:

Previously we tried to receive the file as binary, but the mainframe app was sending many invalid characters like || (something like a pipe), which threw parsing errors while the data was being converted.

Vitor wrote:

EBCDIC (IBM-037) has nothing in common with ASCII (and UTF-8 is just a superset of ASCII). If you move an entirely text file from the mainframe, as binary, onto a Unix or any other distributed box, it will appear to be gibberish

Notice anything? Like the use of the word "gibberish"?

mqbrks wrote:

How can IIB filter the gibberish characters?

It doesn't need to filter them, it needs to correctly interpret them. Like I said:

Vitor wrote:

You can then read the file through the File Input node by setting the File Input node code page to '037' not 'Broker Default'.

If you tell the FileInput node to read a file and use an ASCII code page to read it (and the 'BrokerDefault' code page on any distributed platform is an ASCII one) then by the time it gets to the DFDL the message tree is hosed. Internally IIB uses UTF-16 so to build the message tree it's converting what it thinks is some kind of ASCII file into that and you'll get gibberish.

If you tell the FileInput node to use CCSID 037 (which I'm assuming is the code page the file was written in as it's the source for the Connect Direct translation), the FileInput node will correctly interpret the byte stream and you'll get a clean message tree in UTF-16. You can then feed this to the DFDL model (which should be based on the actual record layout and identify which fields are text and which fields are binary).
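As a rough analogue of what a layout-aware model does, here is a Python sketch that decodes only the text field with code page 037 and reads the binary field as binary. The 12-byte layout and the field names are invented for illustration; the real record layout is not given in this thread, and this is not DFDL.

```python
import struct
from typing import Tuple

# Hypothetical fixed-length layout (an assumption, not from the thread):
#   bytes 0-7   name, EBCDIC text in CCSID 037, blank padded
#   bytes 8-11  quantity, 4-byte big-endian binary integer
RECORD = struct.Struct(">8sI")

def parse_record(raw: bytes) -> Tuple[str, int]:
    """Decode the text field as cp037 and leave the binary field untouched."""
    name_bytes, quantity = RECORD.unpack(raw)
    return name_bytes.decode("cp037").rstrip(), quantity

# Build a sample record the way the mainframe side might have written it.
sample = "SMITH".ljust(8).encode("cp037") + (37).to_bytes(4, "big")

print(parse_record(sample))
```

Because the file arrives as untranslated bytes, the integer is still 37 and the name decodes cleanly, with no filtering step involved.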

timber

Posted: Sat Oct 27, 2018 1:12 pm Post subject:

Grand Master

Joined: 25 Aug 2015
Posts: 1292

Just to clarify something...and please ignore if you know this already.

The DFDL parser has access to all of IIB's extensive range of character encodings. It is no less (and no more) capable than any other IIB parser in this respect. It has full access to all ICU character tables, and can therefore read and write characters in any encoding (and it will not mind reading in one encoding and writing in a different one, like any IIB parser).

I suspect that the sender is sending an *invalid* EBCDIC character stream. Sounds as if there are UTF-8 characters mixed into the EBCDIC character stream. In which case, there is no tool in the world that can handle such a stream - not Java, not CD. Only custom code that is aware of the sender's format can deal with invalid character streams, and it will require a lot of care. Usually it's a lot simpler and cheaper to send a valid character stream in the first place.
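A short Python sketch of why a generic converter cannot spot this, assuming `cp037` stands in for the sender's EBCDIC code page and that the stray bytes are a UTF-8 encoded "é" (a made-up example, not data from the thread):

```python
# Valid EBCDIC text followed by two stray UTF-8 bytes (C3 A9, i.e. "é").
valid_part = "caf".encode("cp037")
stray_utf8 = "\u00e9".encode("utf-8")
mixed_stream = valid_part + stray_utf8

# Every byte value maps to some character in cp037, so decoding never fails.
# The stray bytes are silently read as the EBCDIC letters "C" and "z".
decoded = mixed_stream.decode("cp037")

print(decoded)
```

No error is raised and nothing marks the last two characters as wrong, so only code that knows where the sender mixes encodings could repair the stream.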

fjb_saper

Posted: Sat Oct 27, 2018 8:33 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

I also saw 2 sysopts statements where I would have expected only one and some very bizarre formatting of the sysopts field:

Code:

sysopts="XXX")

It looks like the poster did not translate it to text and copy the full text.
The error could also have to do with form.
Did the OP right click and validate the process?
