Author |
Message
|
grebenar |
Posted: Mon Apr 10, 2006 2:52 am Post subject: Running channel does not transfer messages |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
Hello!
We have a problem with Willow MQ 5.3 csd04 for HP-UX Itanium.
Have no idea how to solve it:
1. It seems that a sender chl from A to B (A_TO_B) is running. However, there are committed messages in the "B" XMIT q on A.
2. There are no FDCs, no LOG entries
3. When I try to stop the channel, it does not stop.
4. If I stop it with 'force process termination' it stops, but I cannot restart it, because the XMIT q is open exclusive by the terminated process (runmqchl_nd).
5. I have to restart the QM, and then the messages are transferred properly.
6. I also had this problem with a receiver chl. Channel C_TO_A was running, but messages were staying in "A" XMIT Q on C. I stopped the C_TO_A receiver chl on A with process termination, then restarted it, and it was OK.
All the kernel parameters are set according to the installation guide (some values are much higher); however, 3 parameters (semmap, shmem, sema) don't exist on HP-UX.
We do not use QM clusters. This is a simple 1 sender 1 receiver 1 xmit q configuration, and 100-200 physical and alias queues.
Please help, if you encountered this problem. This is a live system  |
|
 |
wschutz |
Posted: Mon Apr 10, 2006 3:09 am Post subject: |
|
|
 Jedi Knight
Joined: 02 Jun 2005 Posts: 3316 Location: IBM (retired)
|
Was it running before and suddenly started showing this behaviour? Or is this a new setup? _________________ -wayne |
|
 |
grebenar |
Posted: Mon Apr 10, 2006 3:15 am Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
It's running all the time, transferring 200,000 to 300,000 (or more) messages a day. Sometimes (after 100 minutes of inactivity) it goes inactive, but the current case was around 10:00 am, when there are hundreds of messages per minute, so it had no time to disconnect.
It seems that this phenomenon happens only on this qm, and not on the test qms. My biggest problem is the lack of error messages.
Greetings,
Robert |
|
 |
jefflowrey |
Posted: Mon Apr 10, 2006 3:18 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
Can Willow provide you a build of CSD12 for MQ? Or is CSD4 the latest they support? _________________ I am *not* the model of the modern major general. |
|
 |
grebenar |
Posted: Mon Apr 10, 2006 4:52 am Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
They have up to CSD11 so far. I think I will apply that, because I see no other solution (unfortunately that will need a new license file; they are not compatible between CSD4 and CSD11). |
|
 |
grebenar |
Posted: Mon Apr 10, 2006 4:54 am Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
Anyway, do you have any idea what to do when the xmit q (or, in another case, the SYSTEM.ADMIN.COMMAND.QUEUE) is open exclusively by a killed process that no longer exists? Is a QM restart the only way to solve it?
Thanks,
Robert |
|
 |
vennela |
Posted: Mon Apr 10, 2006 10:29 am Post subject: |
|
|
 Jedi Knight
Joined: 11 Aug 2002 Posts: 4055 Location: Hyderabad, India
|
grebenar wrote: |
Is QM restart the only way to solve it? |
Yes
I had this problem on AIX and IBM provided a fix for it. It was due to some DNS reverse lookup failing or something like that |
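Before restarting the queue manager, it may be worth confirming from runmqsc which handle still holds the transmission queue. The following is only a sketch, assuming Willow MQ 5.3 accepts the same MQSC commands as WebSphere MQ 5.3; the queue name DISP_PD is taken from the definitions posted in this thread.

```
* Run inside runmqsc against the sending queue manager.
* List every open handle on the transmission queue, with the owning process.
DISPLAY QSTATUS(DISP_PD) TYPE(HANDLE) ALL
* Show queue-level status such as the current depth and open counts.
DISPLAY QSTATUS(DISP_PD) TYPE(QUEUE) ALL
```

If the handle output reports a PID for a process that no longer exists, that would match the symptom described here. It does not release the handle; it only confirms what is holding the queue.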
|
 |
grebenar |
Posted: Mon Apr 10, 2006 12:47 pm Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
So your case was similar to mine? Maybe that can help Willow to identify this problem. Does your fix have a number or ID from which I can find it on the IBM site?
This error used to occur very rarely in the past, but it has now happened 3 times in a week on a very important system.
Very interesting, I saw this error on the receiver end:
04/10/06 10:19:36
AMQ9213: A communications error for TCP/IP occurred.
EXPLANATION:
An unexpected error occurred in communications.
ACTION:
The return code from the TCP/IP(select) [TIMEOUT] 360 seconds call was 11
(X'B'). Record these values and tell the systems administrator.
----- amqccita.c : 3075 -------------------------------------------------------
04/10/06 10:19:36
AMQ9999: Channel program ended abnormally.
EXPLANATION:
Channel program 'GLOB_PD_TO_DISP_PD' ended abnormally.
ACTION:
Look at previous error messages for channel program 'GLOB_PD_TO_DISP_PD' in the
error files to determine the cause of the failure.
I searched for this error on these pages, but didn't find an exact solution.
Greetings,
Robert |
|
 |
fjb_saper |
Posted: Mon Apr 10, 2006 1:19 pm Post subject: |
|
|
 Grand High Poobah
Joined: 18 Nov 2003 Posts: 20756 Location: LI,NY
|
This is because in all likelihood it is not an MQ problem but a network problem. Get your network folks involved. _________________ MQ & Broker admin |
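While the network side is being checked, the channel's own view of the connection can be looked at from runmqsc on the sending queue manager. This is a minimal sketch, assuming Willow MQ 5.3 accepts standard WebSphere MQ 5.3 MQSC; the channel name GLOB_PD_TO_DISP_PD comes from the error log and definitions in this thread.

```
* Current state of the sender channel as the queue manager sees it.
DISPLAY CHSTATUS(GLOB_PD_TO_DISP_PD) ALL
* Heartbeat, disconnect and batch intervals configured on the channel.
DISPLAY CHANNEL(GLOB_PD_TO_DISP_PD) HBINT DISCINT BATCHINT
```

Running the first command twice a few minutes apart and comparing MSGS, LSTMSGDA and LSTMSGTI should show whether a channel reported as RUNNING is actually moving messages.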
|
 |
grebenar |
Posted: Mon Apr 10, 2006 1:28 pm Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
I was also thinking about it after seeing this TCP message. However, that was on the receiver host, and the sender chl remained RUNNING until I tried to stop it (when it went to STOPPING). Meanwhile there were 300-400 messages in the xmit q, which had a trigger on it... Why didn't the sender chl discover the situation? And when I restarted the sender QM, everything was OK; I had to do nothing on the receiver side.
Thanks for your help,
Robert |
|
 |
grebenar |
Posted: Mon Apr 10, 2006 1:38 pm Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
One more thing: we have a centralized QM network with 15-20 QMs. This problem happens only on this QM, although there are others with the same message-load characteristics. |
|
 |
kevinf2349 |
Posted: Mon Apr 10, 2006 3:18 pm Post subject: |
|
|
 Grand Master
Joined: 28 Feb 2003 Posts: 1311 Location: USA
|
Is this behind a firewall?
Did someone do anything like bounce a firewall?
Is this a uniquely named queue manager in the network?
It may help if you post the appropriate definitions.
Do you have batch interval set? Are these persistent messages? |
|
 |
grebenar |
Posted: Tue Apr 11, 2006 12:42 am Post subject: |
|
|
Novice
Joined: 10 Apr 2006 Posts: 22 Location: Budapest, Hungary
|
There is no firewall between these two QMs. The QM names are unique, and most of the messages are persistent. Batch interval is 0.
The related definitions are:
ALTER QLOCAL(SYSTEM.DEFAULT.LOCAL.QUEUE) DEFPRTY(5) DEFPSIST(YES) MAXDEPTH(150000) QDPMAXEV(DISABLED)
ALTER QALIAS(SYSTEM.DEFAULT.ALIAS.QUEUE) DEFPRTY(5) DEFPSIST(YES)
ALTER QREMOTE(SYSTEM.DEFAULT.REMOTE.QUEUE) DEFPRTY(5) DEFPSIST(YES)
ALTER QMODEL(SYSTEM.DEFAULT.MODEL.QUEUE) DEFPRTY(5) DEFPSIST(YES) MAXDEPTH(150000) QDPMAXEV(DISABLED)
ALTER QLOCAL(SYSTEM.DEAD.LETTER.QUEUE) MAXDEPTH(150000)
ALTER CHANNEL(SYSTEM.DEF.SENDER) CHLTYPE(SDR) SHORTRTY(180) SHORTTMR(20) LONGTMR(60)
ALTER CHANNEL(SYSTEM.DEF.SVRCONN) CHLTYPE(SVRCONN) MCAUSER('wmqnobody')
ALTER CHANNEL(SYSTEM.AUTO.SVRCONN) CHLTYPE(SVRCONN) MCAUSER('wmqnobody')
STOP CHANNEL(SYSTEM.AUTO.RECEIVER)
STOP CHANNEL(SYSTEM.AUTO.SVRCONN)
STOP CHANNEL(SYSTEM.DEF.CLUSRCVR)
STOP CHANNEL(SYSTEM.DEF.CLUSSDR)
STOP CHANNEL(SYSTEM.DEF.RECEIVER)
STOP CHANNEL(SYSTEM.DEF.REQUESTER)
ALTER CHANNEL(SYSTEM.DEF.SENDER) CHLTYPE(SDR) XMITQ(SYSTEM.CLUSTER.TRANSMIT.QUEUE)
ALTER CHANNEL(SYSTEM.DEF.SERVER) CHLTYPE(SVR) XMITQ(SYSTEM.CLUSTER.TRANSMIT.QUEUE)
STOP CHANNEL(SYSTEM.DEF.SENDER)
STOP CHANNEL(SYSTEM.DEF.SERVER)
ALTER CHANNEL(SYSTEM.DEF.SENDER) CHLTYPE(SDR) XMITQ('')
ALTER CHANNEL(SYSTEM.DEF.SERVER) CHLTYPE(SVR) XMITQ('')
STOP CHANNEL(SYSTEM.DEF.SVRCONN)
DEFINE QREMOTE(DISPMAIN_IN_A_R) DESCR('Diszpecser input remote definition queue') RNAME('DISPMAIN_IN_A') RQMNAME('DISP_PD') XMITQ(DISP_PD)
DEFINE QLOCAL(DISP_PD) DESCR('Transmission queue Diszpecser fele') USAGE(XMITQ) TRIGGER TRIGTYPE(FIRST) TRIGDATA(GLOB_PD_TO_DISP_PD) INITQ(SYSTEM.CHANNEL.INITQ)
DEFINE CHANNEL(GLOB_PD_TO_DISP_PD) CHLTYPE(SDR) DESCR('Channel Diszpecser fele') CONNAME('host(1414)') XMITQ(DISP_PD)
DEFINE CHANNEL(DISP_PD_TO_GLOB_PD) CHLTYPE(RCVR) DESCR('Channel Diszpecser rendszer felol')
DEFINE CHANNEL(SYSTEM.ADMIN.SVRCONN) CHLTYPE(SVRCONN) MCAUSER('') |
|
 |
|