MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » counting multibyte

counting multibyte

bobbee
Posted: Mon Apr 06, 2015 5:14 am    Post subject: counting multibyte

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

I have a SAP message and need to count its segments, which have a fixed length. The SAP system is sending CCSID 1208 messages. The particular message I am testing contains DBCS data. I have tried the following to count the number of characters, but it gives me a count that is not divisible by my segment length. This works when the data is not DBCS.

SET DD_LEN = LENGTH(CAST(CONVERT_DD as char CCSID 1208 ENCODING 273));
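
To illustrate the symptom outside the broker, here is a minimal Python sketch (not ESQL, and not the poster's code) showing how the character count and the byte count of the same UTF-8 buffer drift apart once multi-byte characters appear. The function name and the sample text are made up for illustration.

```python
from typing import Tuple


def char_and_byte_length(raw: bytes, encoding: str = "utf-8") -> Tuple[int, int]:
    """Return (number of characters, number of bytes) for an encoded buffer."""
    text = raw.decode(encoding)  # decode first, then count characters
    return len(text), len(raw)


# Hypothetical sample: 5 ASCII characters followed by 2 Chinese characters.
sample = "ORDER\u8ba2\u5355".encode("utf-8")
chars, nbytes = char_and_byte_length(sample)
print(chars, nbytes)  # the two numbers differ as soon as non-ASCII data is present
```

Each of the two Chinese characters takes three bytes in UTF-8, so any arithmetic done on the byte count is off by the number of extra bytes.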

fjb_saper
Posted: Mon Apr 06, 2015 7:42 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

The message length should be irrelevant. What is the EDI_DD40 separator? Is it just length, or is it something like CRLF or just LF?
In other words, how do you get your message input? If it is a raw IDoc message, you should go by the length of EDI_DD40, as it will contain all the fillers. If it is a file, check for line feeds.
The break for the next message in a file would be the occurrence of a new header (EDI_DC40) or any accompanying header (SAPH)...

Note that if you are looking at the IDOC tables on a BAPI you might have one table for headers and one for segments. In that case a change of Idoc number in the segment will signal a change of IDOC.

Hope it helps.
_________________
MQ & Broker admin

bobbee
Posted: Mon Apr 06, 2015 8:46 am

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

These are straight SapGenericIDocObjects. The segments are determined by length. When there is a CCSID 1386 multibyte character in the segment, things do not work the way they should, because the number of bytes is no longer equal to the number of characters. My code works flawlessly for a non-multibyte character set, but I have 3 inputs, in both directions, that have multibyte characters in them.

fjb_saper
Posted: Mon Apr 06, 2015 10:25 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Note that the length of the EDI_DD40 record is fixed. The length of the SDATA, which is part of the EDI_DD40, is variable and depends on the segment definition. Writing to a file with trailing blanks trimmed makes the EDI_DD40 look variable, when in fact, logically, the trailing blanks are always there.

Hope it helps.

_________________
MQ & Broker admin

bobbee
Posted: Mon Apr 06, 2015 11:48 am

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

So you can see below that E2ED is the beginning of the next segment. There is no LF, no CR, no tags. The current C program works: it takes the segment data area, divides its length by 1055, and then processes each segment by removing the spaces, putting all the segments back to back with a length modifier in front of each one, and dropping the result on a queue to the partner. This is a non-changeable format imposed by the external partner. I have to move that C code into IIB. Single byte works. Now I was given a multi-byte message in CCSID 1386 and the model breaks. The length of all the segment data is no longer divisible by 1055, because some characters are represented by more than one byte. I have been trying to get IIB to work with the message as CCSID 1386 and process each multi-byte representation as a single character.

Offset    Hex                                  ASCII
00000496  20202020 20202020 20202020 20202032  |               2|
00000512  30313530 32323531 32353233 355A4856  |0150225125235ZHV|
00000528  5253504F 52444552 5330325A 48565253  |RSPORDERS02ZHVRS|
00000544  50303241 47202032 30313530 32323431  |P02AG  201502241|
00000560  34353335 36202020 20202020 45324544  |45356       E2ED|
00000576  4B303130 30353130 30303030 30303030  |K010051000000000|
00000592  32323734 34363130 35303030 30303145  |227446105000001E|
00000608  3245444B 30313030 35303030 30303030  |2EDK010050000000|
00000624  31302030 30342055 53442020 20312E30  |10 004 USD   1.0|
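
The divisibility problem described above can be reproduced with a small Python sketch. It assumes that Python's `gbk` codec is a reasonable stand-in for CCSID 1386, and it uses made-up segment contents; only the 1055 segment length comes from the post.

```python
from typing import Tuple

SEGMENT_LEN = 1055  # segment length in characters, as quoted in the post


def split_check(text: str, encoding: str) -> Tuple[int, int]:
    """Return (character remainder, byte remainder) modulo SEGMENT_LEN."""
    raw = text.encode(encoding)
    return len(text) % SEGMENT_LEN, len(raw) % SEGMENT_LEN


# Two hypothetical segments, each exactly SEGMENT_LEN characters long.
ascii_segment = "A" * SEGMENT_LEN
dbcs_segment = "\u8ba2\u5355".ljust(SEGMENT_LEN)  # 2 Chinese chars + blanks
data = ascii_segment + dbcs_segment

print(split_check(data, "gbk"))    # character count divides evenly, byte count does not
print(split_check(data, "utf-8"))  # same effect, with a different byte remainder
```

The character remainder stays at zero in both encodings, which is why the arithmetic has to be done on decoded characters rather than on the raw buffer.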

bobbee
Posted: Mon Apr 06, 2015 12:06 pm

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

I was searching and ran across a discussion you and Tim were having while helping someone else with 1386, so I wanted to add this. The SAP system is set to send 1208 Unicode messages to IIB through the adapter. While there are no properties arriving on the message, I am assuming it is 1208. This is when I first encountered the issue and then saw the DB characters in the buffer. I have been trying to get the broker to process the fields as DB and thus treat the data as characters and not bytes. I went through my DFDLs, which get involved later, and changed BYTE to CHARACTER. But up front I am trying to determine the length of the SEGMENT data area, as with the single-byte EDI_DC message, and take the total length and divide by 1055. If I can get IIB to realize the message has DB characters, I think I will be OK.

fjb_saper
Posted: Tue Apr 07, 2015 7:27 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Can you use the same arithmetic on characters instead of bytes? I believe there is a flag somewhere on the SAP adapter that specifies "target system is Double Byte" or something like it...
_________________
MQ & Broker admin

bobbee
Posted: Tue Apr 07, 2015 11:40 am

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

The SAP people and management, while upgrading the SAP system to the newest release from an ancient one and getting off R3 Link (tells you how old), also specified that the SAP system and all outbound data would be in Unicode. So it is coming out in 1200. They were trying to get the partners to accept Unicode. No deal there. So the work for some of the stuff is in MQ conversion, and the routing and specialized processing (old exits) are in IIB.

So to restate: I have the iDocStream coming in, which is the Control Data followed by the segments. It seems that when I do a LENGTH on it after casting the BLOB to CHAR with CCSID 1208, it is counting bytes and not treating the data as characters, which by all scientific calculations, at least as explained in Interstellar the movie, should treat a multi-byte sequence as a single character.

It isn't, and I need to find out how to do it. I even have the DFDL configured as UTF-8 and it is not parsing. I have to turn on debug trace and see what is going on.

BTW, I am on ST, if you are who I think you are.

fjb_saper
Posted: Tue Apr 07, 2015 11:46 am

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Sorry, I don't have ST, but I have Skype...
Do you have any characters that would be multibyte? As the data is received in CCSID 1200, would it work to just double your number when you do the arithmetic? And do the arithmetic in CCSID 1200 before you cast to 1208... Why switch down to 1208 at all? Would the parsing work with 1200, or was it all defined in bytes and not in chars? Once the parsing is done you can output in whichever CCSID you want. Just beware of replacement chars... when there is no map for the character...

Unfortunately I do know...
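
The "double your number" idea for CCSID 1200 can be checked with a short Python sketch. It assumes that Python's `utf-16-be` codec matches the big-endian UTF-16 data under discussion and that all characters are in the Basic Multilingual Plane; the function name and sample segment are made up.

```python
def utf16_byte_length(text: str) -> int:
    """Byte length of text encoded as big-endian UTF-16 without a BOM."""
    return len(text.encode("utf-16-be"))


# Hypothetical 1055-character segment containing two Chinese characters.
segment = "\u8ba2\u5355".ljust(1055)

print(utf16_byte_length(segment))       # byte length of the segment in UTF-16
print(utf16_byte_length(segment) // 2)  # back to the character count
```

This only holds while every character fits in one 16-bit unit. Characters outside the Basic Multilingual Plane take four bytes in UTF-16, so the doubling rule would be off for them.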
_________________
MQ & Broker admin

bobbee
Posted: Tue Apr 07, 2015 12:41 pm

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

I am looking at the hex, and the typical ASCII characters show up as single bytes. Where the Simplified Chinese shows up, there are multiple bytes that do not represent anything readable. I am assuming this to be the double/multi-byte data. I am also under the assumption that the broker has converted the code page to 1208 internally. I have tried leaving it alone and doing no casting, and casting it to 1200 and to 1208. Still nothing works.

I am wondering if I have to force a parse on it somehow.

fjb_saper
Posted: Tue Apr 07, 2015 4:16 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

You may have to force a parse. Although it's the first time I have heard of an IDoc in Chinese... I guess I have been sheltered so far...
_________________
MQ & Broker admin

bobbee
Posted: Wed Apr 08, 2015 1:43 am

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

So in the old system, each RFC that supports a different code page, and there are 10 of them, goes to a distinct QMGR that is configured for that specific code page. The R3 Link in those QMGRs processes the messages in those code pages. Now all code pages are processed by IIB. What is actually baffling me is that there is a C program running as an MQ exit, and it has these lines of code to figure out the number of segments in the buffer returned by MQ:

num = messlen - sapheader_len - control_len;
num_seg = num / data_len; /* data_len will never be zero */

messlen = MQ message length
sapheader_len = SAP length
data_len = 1055

This works and is in production, in every QMGR, regardless of the code page. The only thing I can figure is that the C program's routines are working in the code page and know when they hit a multi-byte character. I have a copy of the before and after messages from this exit, with multi-byte data, in my hand; they were delivered last night. I am going to count bytes and compare to see if this is correct.
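
For comparison, the exit's arithmetic can be sketched in Python. The header and control lengths used in the example call (100 and 524) are hypothetical values chosen for illustration, not figures from the exit. One thing the sketch makes visible is that integer division simply discards any remainder, so a buffer with a few extra bytes would still produce a segment count without complaint.

```python
from typing import Tuple


def segments_and_remainder(mess_len: int, sap_header_len: int,
                           control_len: int, data_len: int = 1055) -> Tuple[int, int]:
    """Mirror `num_seg = num / data_len` and also report the leftover bytes."""
    if data_len <= 0:
        raise ValueError("data_len must be positive")
    num = mess_len - sap_header_len - control_len
    # Floor division matches C integer division for non-negative operands.
    return num // data_len, num % data_len


# Hypothetical message: 100-byte SAP header, 524-byte control record, 3 segments.
exact = 100 + 524 + 3 * 1055
print(segments_and_remainder(exact, 100, 524))      # remainder is zero
print(segments_and_remainder(exact + 2, 100, 524))  # same count, 2 bytes left over
```

Whether the production exit ever sees such a remainder depends on what the old queue managers deliver to it, which is what the byte-count comparison mentioned above would show.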

bobbee
Posted: Thu Apr 09, 2015 6:20 am

Knight

Joined: 20 Sep 2001
Posts: 541
Location: Tampa

Thanks to everyone who offered suggestions. Got it working this AM. I have the following ESQL code that creates the input DFDL for the map. The map replaces all the code in the C program that does the compression and building of the output message to the partners. Again, thanks.

Code:
CREATE COMPUTE MODULE CBS_INBOUND_COMPRESSION_MF_Compute
   CREATE FUNCTION Main() RETURNS BOOLEAN
   BEGIN
      DECLARE i INTEGER;
      DECLARE j INTEGER;
      DECLARE IDocStreamData_CHAR_LEN INTEGER;
      DECLARE CONVERT_LEN  INTEGER;
      DECLARE CONVERT CHAR;

      -- CALL CopyMessageHeaders();
      -- CALL CopyEntireMessage();
      -- Cast to BLOB first, then to CHAR as UTF-8 (CCSID 1208), so that
      -- LENGTH and SUBSTRING below work on characters rather than bytes
      SET CONVERT = CAST(CAST(InputRoot.DataObject.ns1:SapGenericIDocObject.IDocStreamData as BLOB) as CHAR CCSID 1208 ENCODING 273);
      SET IDocStreamData_CHAR_LEN = LENGTH(CONVERT);
--    Build the Control Data area from iDocStream
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.tabnam     = SUBSTRING(CONVERT FROM 001 for 10);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.mandt      = SUBSTRING(CONVERT FROM 011 for 03);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.docnum     = SUBSTRING(CONVERT FROM 014 for 16);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.docrel     = SUBSTRING(CONVERT FROM 030 for 04);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.status     = SUBSTRING(CONVERT FROM 034 for 02);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.idoctyp    = SUBSTRING(CONVERT FROM 036 for 08);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.idocdirect = SUBSTRING(CONVERT FROM 044 for 01);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvpor     = SUBSTRING(CONVERT FROM 045 for 10);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvprt     = SUBSTRING(CONVERT FROM 055 for 02);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvprn     = SUBSTRING(CONVERT FROM 057 for 10);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvsad     = SUBSTRING(CONVERT FROM 067 for 21);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvlad     = SUBSTRING(CONVERT FROM 088 for 70);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.edistd     = SUBSTRING(CONVERT FROM 158 for 01);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.stdvrs     = SUBSTRING(CONVERT FROM 159 for 06);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.mestyp     = SUBSTRING(CONVERT FROM 165 for 06);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.mescod     = SUBSTRING(CONVERT FROM 171 for 03);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.mesfct     = SUBSTRING(CONVERT FROM 174 for 03);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.outmod     = SUBSTRING(CONVERT FROM 177 for 01);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.test       = SUBSTRING(CONVERT FROM 178 for 01);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndpor     = SUBSTRING(CONVERT FROM 179 for 10);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndprt     = SUBSTRING(CONVERT FROM 189 for 02);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndprn     = SUBSTRING(CONVERT FROM 191 for 10);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndsad     = SUBSTRING(CONVERT FROM 201 for 21);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndlad     = SUBSTRING(CONVERT FROM 222 for 70);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.refint     = SUBSTRING(CONVERT FROM 292 for 14);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.refgrp     = SUBSTRING(CONVERT FROM 306 for 14);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.refmes     = SUBSTRING(CONVERT FROM 320 for 14);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.arckey     = SUBSTRING(CONVERT FROM 334 for 70);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.credat     = SUBSTRING(CONVERT FROM 404 for 08);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.cretim     = SUBSTRING(CONVERT FROM 412 for 06);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.stdmes     = SUBSTRING(CONVERT FROM 418 for 06);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.cimtyp     = SUBSTRING(CONVERT FROM 424 for 08);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.ext        = SUBSTRING(CONVERT FROM 432 for 08);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.rcvpfc     = SUBSTRING(CONVERT FROM 440 for 02);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.sndpfc     = SUBSTRING(CONVERT FROM 442 for 02);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.serial     = SUBSTRING(CONVERT FROM 444 for 20);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.ovr        = SUBSTRING(CONVERT FROM 464 for 01);
      SET OutputRoot.DFDL.ns:ALE_IDoc.DC.FILLER     = '                                                            ';
      -- How many segments. DBCS_CD_LEN (the control data length in characters)
      -- is not declared in this snippet.
      SET OutputLocalEnvironment.Variables.NUM_SEGS = (IDocStreamData_CHAR_LEN - DBCS_CD_LEN) / 1063;
      SET i = 1;
      SET j = 1 + DBCS_CD_LEN;
      -- Loop through the segments creating the OutputRoot array
      X : WHILE i <= OutputLocalEnvironment.Variables.NUM_SEGS DO
         SET OutputRoot.DFDL.ns:ALE_IDoc.DD[i].sdatatag = SUBSTRING(CONVERT FROM j for 1063);
         -- Move to the current segment end plus one (first character of the next),
         -- offset by the control data length
         SET j = (i * 1063) + 1 + DBCS_CD_LEN;
         SET i = i + 1;
      END WHILE X;
      
      SET OutputRoot.Properties.CodedCharSetId = 1208;
      SET OutputRoot.Properties.Encoding = 273;

      RETURN TRUE;
   END;
END MODULE;
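
The approach in the ESQL above (decode the whole buffer first, then slice by character position) can be sketched in Python as follows. The segment length of 1063 comes from the ESQL. The control length of 524 is an assumption, since DBCS_CD_LEN is not shown in the post, and the function name and sample data are made up.

```python
from typing import List, Tuple


def split_idoc(raw: bytes, control_len: int = 524, segment_len: int = 1063,
               encoding: str = "utf-8") -> Tuple[str, List[str]]:
    """Decode the stream, then cut it into control data and fixed-length segments."""
    text = raw.decode(encoding)          # work in characters from here on
    control = text[:control_len]
    body = text[control_len:]
    count = len(body) // segment_len     # number of whole segments
    segments = [body[i * segment_len:(i + 1) * segment_len] for i in range(count)]
    return control, segments


# Hypothetical stream: one control record and two segments, one with Chinese text.
control = "EDI_DC40".ljust(524)
seg1 = "E2EDK01005".ljust(1063)
seg2 = ("E2EDKA1" + "\u8ba2\u5355").ljust(1063)
raw = (control + seg1 + seg2).encode("utf-8")

ctl, segs = split_idoc(raw)
print(len(raw), len(segs), [len(s) for s in segs])
```

The raw buffer is longer than 524 + 2 * 1063 bytes, yet the split still yields two segments of 1063 characters each, because the slicing happens after decoding.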