Switch QM between nominal / rescue |
Author |
Message
|
bruce2359 |
Posted: Mon Jun 28, 2021 4:05 pm Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
gbaddeley wrote: |
bruce2359 wrote: |
Have you or your expert opened a PMR with IBM? |
Agree. You first need to investigate and resolve the issue with a ridiculously large number of messages on the SCCQ. Not only is the cluster command process unable to process them at a reasonable rate, but something is generating a massive flood of them as well.
Is there 1 IPPROC on the queue, by process amqrrmfa?
Does the first message have a non-zero BackoutCount? |
For a moment let’s presume no IBM MQ internal code defect.
Thinking sideways, how could I (or an errant app) generate a huge volume of cluster commands? I could, and did, write an app that put-inhibited then put-enabled a cluster queue in an endless loop. I watched network traffic increase and SCCQ depths increase. Cluster abuse? Sure. It was a test system, and I was bored.
Again, thinking sideways: an inhibited db, with MQ messages involved in the transaction? Guessing is fun. _________________ I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
Bad |
Posted: Tue Jun 29, 2021 4:05 am Post subject: |
|
|
Novice
Joined: 15 Jun 2021 Posts: 14
|
bruce2359 wrote: |
Bad wrote: |
exerk wrote: |
And another question - which file system was full? |
It was the file SYSTEM\!CLUSTER\!COMMAND\!QUEUE/, at more than 3GB
the cause was a partner who continued to send messages to a partner who closed his database for 3 days ... |
A closed/unavailable database should not normally result in cluster administrative messages.
How many transactions per day comprise 3 days workload? What is the length of the transaction messages? What does the app in question do when it discovers that the db is not available to complete a transaction? Does the app put-inhibit an application cluster-queue for each app message, for example? |
No idea. I am only the operator of the MQSeries part; I do not know the details of the messages exchanged between the applications.
I do not think that it is the direct cause, but maybe we had to resolve it too forcefully, and that is what impacted the exchanges between this server and its FR? |
|
Bad |
Posted: Tue Jun 29, 2021 4:26 am Post subject: |
|
|
Novice
Joined: 15 Jun 2021 Posts: 14
|
gbaddeley wrote: |
bruce2359 wrote: |
Have you or your expert opened a PMR with IBM? |
Agree. You first need to investigate and resolve the issue with a ridiculously large number of messages on the SCCQ. Not only is the cluster command process unable to process them at a reasonable rate, but something is generating a massive flood of them as well.
Is there 1 IPPROC on the queue, by process amqrrmfa?
Does the first message have a non-zero BackoutCount? |
here is the information I have on the SCCQ:
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE) TYPE(QLOCAL)
ACCTQ(QMGR) ALTDATE(2021-06-15)
ALTTIME(15.04.22) BOQNAME( )
BOTHRESH(0) CLUSNL( )
CLUSTER( ) CLCHNAME( )
CLWLPRTY(0) CLWLRANK(0)
CLWLUSEQ(QMGR) CRDATE(2021-05-05)
CRTIME(12.17.59) CURDEPTH(2538475)
CUSTOM( ) DEFBIND(OPEN)
DEFPRTY(0) DEFPSIST(NO)
DEFPRESP(SYNC) DEFREADA(NO)
DEFSOPT(SHARED) DEFTYPE(PREDEFINED)
DESCR(WebSphere MQ Cluster Command Queue)
DISTL(NO) GET(ENABLED)
HARDENBO INITQ( )
IPPROCS(1) MAXDEPTH(999999999)
MAXMSGL(4194304) MONQ(QMGR)
MSGDLVSQ(PRIORITY) NOTRIGGER
NPMCLASS(NORMAL) OPPROCS(0)
PROCESS( ) PUT(ENABLED)
PROPCTL(COMPAT) QDEPTHHI(80)
QDEPTHLO(20) QDPHIEV(DISABLED)
QDPLOEV(DISABLED) QDPMAXEV(ENABLED)
QSVCIEV(NONE) QSVCINT(999999999)
RETINTVL(999999999) SCOPE(QMGR)
SHARE STATQ(QMGR)
TRIGDATA( ) TRIGDPTH(1)
TRIGMPRI(0) TRIGTYPE(FIRST)
USAGE(NORMAL)
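For anyone who wants to check attributes such as CURDEPTH and IPPROCS from a script rather than by eye, a display like the one above can be split into keyword/value pairs. This is only a minimal sketch: the function name `parse_mqsc_attrs` and the abbreviated `SAMPLE` text are made up for illustration, and bare keywords without parentheses (HARDENBO, SHARE, NOTRIGGER) are simply ignored.

```python
import re

# Abbreviated copy of the DISPLAY output quoted above (illustration only).
SAMPLE = """\
AMQ8409: Display Queue details.
QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE) TYPE(QLOCAL)
CRTIME(12.17.59) CURDEPTH(2538475)
DESCR(WebSphere MQ Cluster Command Queue)
HARDENBO INITQ( )
IPPROCS(1) MAXDEPTH(999999999)
NPMCLASS(NORMAL) OPPROCS(0)
"""

# KEYWORD(value) pairs; the value may contain spaces but no parentheses.
ATTR_RE = re.compile(r"([A-Z][A-Z0-9]*)\(([^()]*)\)")


def parse_mqsc_attrs(text):
    """Return a dict of KEYWORD -> stripped value from MQSC display text."""
    return {key: value.strip() for key, value in ATTR_RE.findall(text)}


attrs = parse_mqsc_attrs(SAMPLE)
print(attrs["CURDEPTH"], attrs["IPPROCS"], attrs["OPPROCS"])
```

All values come back as strings, so convert with `int()` before comparing depths.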
The first message, dated June 19, has a BackoutCount of 0, and amqrrmfa has been at 100% CPU for 14 days.
The majority of the messages in the SCCQ are for one of the 2 FRs, with which this queue manager apparently cannot communicate.
"Have you or your expert opened a PMR with IBM?"
It seems to me that we do not have access to this service. |
|
bruce2359 |
Posted: Tue Jun 29, 2021 5:44 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Bad wrote: |
"Have you or your expert opened a PMR with IBM?"
it seems to me that we do not have access to this service |
What? Are you running unlicensed or out-of-support IBM MQ code? |
|
bruce2359 |
Posted: Tue Jun 29, 2021 6:37 am Post subject: |
|
|
 Poobah
Joined: 05 Jan 2008 Posts: 9469 Location: US: west coast, almost. Otherwise, enroute.
|
Bad wrote: |
exerk wrote: |
And another question - which file system was full? |
It was the file SYSTEM\!CLUSTER\!COMMAND\!QUEUE/, at more than 3GB
the cause was a partner who continued to send messages to a partner who closed his database for 3 days ... |
A file system is owned by the o/s. Your reply to exerk was confusing, as SYSTEM.CLUSTER.COMMAND.QUEUE is not a file system; rather, it's a local queue. A single queue can hold much more than 3GB of message data. Since the maxdepth of that queue is 999999999, 2+ million messages would not fill the queue.
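To put rough numbers on that, here is a back-of-the-envelope sketch using only figures quoted in this thread. Two assumptions are mine: "3GB" is read as binary gigabytes, and the file size and the CURDEPTH are treated as if they were measured at the same moment, which the thread does not confirm. Treat the per-message figure as an order of magnitude only.

```python
# Figures quoted in the thread; see the assumptions stated above.
CURDEPTH = 2_538_475        # from the DISPLAY output
MAXDEPTH = 999_999_999      # from the DISPLAY output
FILE_BYTES = 3 * 1024**3    # "more than 3GB", assumed binary gigabytes

# How full is the queue in terms of message count?
fill_pct = 100 * CURDEPTH / MAXDEPTH

# Rough queue-file bytes per message (includes MQ's own overhead).
avg_bytes = FILE_BYTES / CURDEPTH

print(f"queue is at {fill_pct:.2f}% of MAXDEPTH")
print(f"roughly {avg_bytes:.0f} bytes of queue file per message")
```

The depth is about a quarter of one percent of MAXDEPTH, so it was disk space, not the queue's depth limit, that ran out.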
Databases have no direct impact on MQ, so your conclusion that the root cause of the problem was the partner who closed his database for 3 days remains speculation. |
|
gbaddeley |
Posted: Tue Jun 29, 2021 4:38 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
Bad wrote: |
The first message, dated June 19, has a BackoutCount of 0, and amqrrmfa has been at 100% CPU for 14 days.
The majority of the messages in the SCCQ are for one of the 2 FRs, with which this queue manager apparently cannot communicate. |
Is the CURDEPTH increasing or decreasing over time?
Is the MessageId of the first (oldest) message on the queue changing over time?
Check the status of all cluster channels on all FRs and the affected qmgr. They should all be RUNNING or not active.
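The first question can be answered by sampling CURDEPTH twice and comparing. A minimal sketch of the arithmetic follows; the function name `depth_trend` is made up, and the second reading in the example is hypothetical (only the first depth, 2538475, comes from this thread).

```python
def depth_trend(samples):
    """samples: list of (seconds_since_start, curdepth) in time order.

    Returns (rate, eta): rate is messages per second (negative when the
    queue is draining); eta is the seconds left until empty at that rate,
    or None when the depth is flat or growing.
    """
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing time")
    rate = (d1 - d0) / (t1 - t0)
    # Only a shrinking queue has a finite drain time.
    eta = d1 / -rate if rate < 0 else None
    return rate, eta


# Hypothetical readings taken ten minutes apart.
samples = [(0, 2_538_475), (600, 2_538_175)]
rate, eta = depth_trend(samples)
print(rate, eta)
```

If the rate is positive or zero while amqrrmfa sits at 100% CPU, the repository process is not keeping up, or not consuming at all.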
Are you interested in a quick hack to resolve the SCCQ depth issue?
- kill the amqrrmfa process
- do a CLEAR QLOCAL(...)
- stop and start the queue manager _________________ Glenn |
|