corrupted message???
slam
Posted: Fri Aug 23, 2002 6:22 am Post subject: corrupted message???
Newbie
Joined: 22 Aug 2002 Posts: 7
Hello:
We are having a very odd sporadic problem with messages between AS/400
and OS/390 machines. The problem started occurring when the AS/400 was
upgraded to OS/400 5.1 and MQSeries 5.2. The software on the OS/390 has not changed.
I am hoping you can help point us in the right direction and give us
some diagnostic tips and tools.
The scenario is:
We have an in-house written File Transfer program, used to transfer files between the AS/400 and OS/390 over MQSeries. The File Transfer
program on the AS/400 box is written in C.
The AS/400 File Transfer program reads the source file and creates
a 32k message by combining as many file records as will fit within a
32k message. It then MQPUTS the 32k msg on the OS/390 transmission queue
and repeats this process until the end of file. On the OS/390, the File Transfer program MQGETS the message from the queue, splits the message
back into the file records and writes them to a file. It repeats this
process until end of file is sent.
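A minimal sketch of the packing and splitting described above, in Python rather than the C the actual program is written in. The `;` separator is taken from the example message layout in this post; the function names, the 32 KB constant, and the choice to terminate every record with the separator are assumptions for illustration only, and the MQPUT/MQGET calls themselves are left out.

```python
MAX_MSG = 32 * 1024  # assumed message size limit (32k)
SEP = b";"           # assumed record separator, as in the example layout


def pack_records(records, max_msg=MAX_MSG):
    """Group records into messages no larger than max_msg bytes."""
    messages = []
    current = b""
    for rec in records:
        piece = rec + SEP
        if len(piece) > max_msg:
            raise ValueError("record does not fit in a single message")
        # Start a new message when this record would overflow the current one.
        if len(current) + len(piece) > max_msg:
            messages.append(current)
            current = b""
        current += piece
    if current:
        messages.append(current)
    return messages


def split_message(message):
    """Split one received message back into its file records."""
    return [rec for rec in message.split(SEP) if rec]
```

This only works if a record never contains the separator byte itself; a real implementation would more likely use length-prefixed records.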
Sometimes when a file is transferred, the OS/390 target file has
missing records. Yet, if the file is transferred again the records will
be in the file. For example, I transferred one file and it was missing
4 records in the target file. I then transferred the same file again
and all records were within the target file. I kept repeating the file
transfers and they were all successful.
From the file transfer log, we verified that we are not losing
messages. If 52 messages were sent, 52 messages arrived. The problem
is within the message data.
The messages are persistent and I found them in the OS/390 MQ log.
There were binary zeroes X'00000000' where the file record data should
have been written. For example, the message data for two messages
looked like this:
Msgx    FILE RECORD1;FILE RECORD2;FILE RECORD3;0000000000000000000000000
Msgx+1  FILE RECORD6;FILE RECORD7;FILE RECORD8;FILE RECORD9;FILE RECORD10
when it should look like:
Msgx    FILE RECORD1;FILE RECORD2;FILE RECORD3;FILE RECORD4;FILE RECORD5
Msgx+1  FILE RECORD6;FILE RECORD7;FILE RECORD8;FILE RECORD9;FILE RECORD10
The binary zeroes do not start exactly on a record boundary; i.e., the tail end of file record 3 is itself overwritten with binary zeroes.
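As a rough illustration of how the zeroed regions could be located in a message payload, here is a small Python sketch. The function name and the `min_len` threshold are hypothetical, and it assumes that legitimate record data never contains a long run of X'00' bytes, which may not hold for files with binary fields.

```python
def find_zero_runs(payload, min_len=8):
    """Return (offset, length) for each run of at least min_len X'00' bytes."""
    runs = []
    start = None  # offset where the current run of zero bytes began
    for i, byte in enumerate(payload):
        if byte == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # A run may extend to the very end of the message.
    if start is not None and len(payload) - start >= min_len:
        runs.append((start, len(payload) - start))
    return runs
```

A check like this could be applied to each message just before it is put or just after it is got, logging the message number and offset of any hit, which is cheaper than running a full trace.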
My questions are these.
Do you know what could cause binary zeroes to be written to a message
record occasionally, but not consistently?
I would like to read the OS/400 MQ log/journals to see if the messages
are being MQPUT with the binary zeroes. If the binary zeroes are not
there, then I know something is happening in the channel or MCA. If the
binary zeroes are present, then either the File Transfer program is
writing the binary zeroes or ???
How do we read the journal? Are there any tools, processes, or programs to format the journal?
The problem started occurring when we upgraded the OS/400 and MQSeries;
however it is a sporadic problem and very difficult to recreate
consistently. It is difficult to activate a trace because we have over
432 AS/400 machines with queue managers (one box/qmgr for every facility). Every night, all 432 facilities transfer two
files to the OS/390. We usually receive 10 files with missing records;
but every night different sites have the problem. If we put a
trace on at one site, there is only a 4% chance that its file transfer will be
incomplete.
Do you have any less invasive diagnostic tools we could use on all
boxes?
We don't see any communication network errors. Our network is SNA.
Any help would be greatly appreciated.

slam
Posted: Fri Aug 23, 2002 9:12 am
Newbie
Joined: 22 Aug 2002 Posts: 7
We received a reply today from IBM. The problem is a known one, and a PTF to fix it is in test status. It is not an MQSeries issue, but an operating-system-level problem.

RogerLacroix
Posted: Fri Aug 23, 2002 7:11 pm
Jedi Knight
Joined: 15 May 2001 Posts: 3264 Location: London, ON Canada

slam
Posted: Mon Aug 26, 2002 5:32 am
Newbie
Joined: 22 Aug 2002 Posts: 7
Roger,
Not the same issue. This one is an AS/400 to S/390 problem, with a fix required on the OS of the AS/400.
The other is an AIX issue.