Admin Server goes Down |
MaheshPN |
Posted: Mon Nov 01, 2004 1:39 pm Post subject: Admin Server goes Down |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Hi Guys,
For the past week, our admin server has repeatedly stopped responding, and I see this error message in the syslog:
11/01/04 11:30:51 FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen
If I bring WF back up, it works for some time and then goes down again. All the other servers (execution and scheduling servers) keep running.
Any idea why this suddenly started happening?
I also have some entries from the error log.
WebSphere MQ Workflow 3.4 Error Report
Report creation = 11/01/04 08:54:30
Related message = FMC31050E An error has occurred which has terminated processing.
Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen
WebSphere MQ Workflow 3.4 Error Report
Report creation = 11/01/04 10:52:45
Related message = FMC31050E An error has occurred which has terminated processing.
Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=153, Function=fmckRequire(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Pre-condition failed in /projects/fmc/drvp/lbld/v340/src/fmcscbrg.cxx(124): !_fIsInConstMode
WebSphere MQ Workflow 3.4 Error Report
Report creation = 11/01/04 11:30:51
Related message = FMC31050E An error has occurred which has terminated processing.
Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen
[o5aps004] fmc /var/fmc/cfgs/BOS/log>
If anybody has seen these kinds of symptoms before, please let me know.
Thanks,
-Mahesh |
mike_mq |
Posted: Mon Nov 01, 2004 3:58 pm Post subject: |
|
|
Centurion
Joined: 17 Oct 2003 Posts: 123
|
Can you tell us which versions you are using? |
MaheshPN |
Posted: Mon Nov 01, 2004 5:48 pm Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
WF 3.4.4 on AIX 5.1 ML 6
DB2 7.2.9
Thanks,
-Mahesh |
MaheshPN |
Posted: Fri Apr 01, 2005 10:43 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Hi Guys,
Our Admin server died again with the same error.
I have switched on the trace and here is the output.
2005-04-01, 12:40:31.077, fmckdbg.cxx( 213) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), @$ Ctx $@ AIID=OID(00000000000000000000000000000000),0,OID(00000000000000000000000000000000),0@$ Ctx $@ AIName=<null>@$ Ctx $@ CliID=<null>, with name: <null>, group: <null>@$ Ctx $@ PeaID=with type: 0, name: <null>, system: <null>, system group: <null>@$ Ctx $@ PeaName=<null>@$ Ctx $@ PIID=OID(00000000000000000000000000000000),OID(00000000000000000000000000000000)@$ Ctx $@ PIName=<null>@$ Ctx $@ PTID=OID(00000000000000000000000000000000)@$ Ctx $@ PTName=<null>@$ Ctx $@ SvrID=with name: <null>, group: <null>@$ Ctx $@ SvrName=<null>@$ Ctx $@ Xact=<null>@$ Ctx $@ MsgType= n/a@$ Ctx $@ UID=<null>, with name: <null>, group: <null>@$ Ctx $@ WIID=OID(00000000000000000000000000000000)@$ Ctx $@ WIName=<null>Assertion fired
2005-04-01, 12:40:31.077, fmckdbg.cxx( 222) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), THROW_INT, FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen, Origin=File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
2005-04-01, 12:40:31.077, fmckdbg.cxx( 213) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), @$ Ctx $@ AIID=OID(00000000000000000000000000000000),0,OID(00000000000000000000000000000000),0@$ Ctx $@ AIName=<null>@$ Ctx $@ CliID=<null>, with name: <null>, group: <null>@$ Ctx $@ PeaID=with type: 0, name: <null>, system: <null>, system group: <null>@$ Ctx $@ PeaName=<null>@$ Ctx $@ PIID=OID(00000000000000000000000000000000),OID(00000000000000000000000000000000)@$ Ctx $@ PIName=<null>@$ Ctx $@ PTID=OID(00000000000000000000000000000000)@$ Ctx $@ PTName=<null>@$ Ctx $@ SvrID=with name: <null>, group: <null>@$ Ctx $@ SvrName=<null>@$ Ctx $@ Xact=<null>@$ Ctx $@ MsgType= n/a@$ Ctx $@ UID=<null>, with name: <null>, group: <null>@$ Ctx $@ WIID=OID(00000000000000000000000000000000)@$ Ctx $@ WIName=<null>Assertion fired
2005-04-01, 12:40:31.077, fmckdbg.cxx( 222) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), THROW_INT, FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen, Origin=File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
I'm not sure what it means. Let me know if anybody else has faced this issue.
Current versions are WF 3.4.6 and MQ 5.3 CSD 8.
Thanks,
-Mahesh |
vennela |
Posted: Fri Apr 01, 2005 10:51 am Post subject: |
|
|
 Jedi Knight
Joined: 11 Aug 2002 Posts: 4055 Location: Hyderabad, India
|
Looks like a DB issue but I might be wrong.
Do you have any FDCs cut?
What does the db2diag.log say? |
MaheshPN |
Posted: Fri Apr 01, 2005 11:54 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
There are no FDCs cut, and our DBA does not see any errors in db2diag.log. I am leaning towards a memory allocation problem.
I'm not sure whether I am thinking in the right direction.
Thanks,
-Mahesh |
hos |
Posted: Mon Apr 04, 2005 12:37 am Post subject: |
|
|
Chevalier
Joined: 03 Feb 2002 Posts: 470
|
Hi,
I also think that there is a memory allocation problem.
You should set the environment variable EXTSHM=ON to expand shared memory allocation.
- The user mqm should have export EXTSHM=ON in its .profile.
- The fmc user, which typically starts the trigger monitor, should also have it in its .profile.
- When starting the DB2 server:
export EXTSHM=ON
db2set DB2ENVLIST=EXTSHM
db2start
If this doesn't help, I recommend opening a PMR. |
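One quick way to confirm that the setting actually reached a given login shell (for example after logging in as mqm or fmc) is a small check like the sketch below. The function name check_extshm is made up for illustration and is not part of MQ Workflow, MQ, or DB2. It only inspects the current shell's environment, so it does not prove that an already-running server process inherited the variable.

```shell
# Hypothetical helper (illustration only, not shipped with any product):
# reports whether EXTSHM is set to ON in the current shell environment.
check_extshm() {
    if [ "${EXTSHM:-}" = "ON" ]; then
        # Variable is present with the expected value.
        echo "EXTSHM=ON"
    else
        # Variable is unset or has another value; show what was found.
        echo "EXTSHM is not ON (current value: '${EXTSHM:-}')"
        return 1
    fi
}
```

Run check_extshm after the user's .profile has been sourced. On the DB2 side, db2set -all lists the registry variables, so DB2ENVLIST=EXTSHM should appear there if the db2set step took effect.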
MaheshPN |
Posted: Wed Apr 06, 2005 11:19 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Thanks Hos,
I set those variables during the DB2 upgrade from 7.2 to 8.1.
I have trace level 99 switched on for the admin server. Could that be causing this issue?
I was reading the README for SP7, and here is what I found:
Quote: |
PMR31223 Some trace settings lead to a shutdown of the administration server. |
Also, I have opened a PMR with IBM, and their initial response is that it may be a deadlock issue. Unfortunately, we can't find anything about that in any log. I wonder whether the size of the audit log (6 million rows) could be causing this issue.
Any thoughts?
Thanks,
-Mahesh |
Ratan |
Posted: Wed Apr 06, 2005 3:01 pm Post subject: |
|
|
 Grand Master
Joined: 18 Jul 2002 Posts: 1245
|
Do you mean you have your trace running all the time?
-Ratan |
hos |
Posted: Thu Apr 07, 2005 5:23 am Post subject: |
|
|
Chevalier
Joined: 03 Feb 2002 Posts: 470
|
I think they are right. If you are running multiple systems (i.e. multiple admin servers), your admin server may hit a deadlock and die (see APAR IY69544). By the way, SQL913 can be a DB timeout or a deadlock. So yes, the huge number of audit trail records may cause the problem. Do you need all of them? Do you use fmcsclad to clean up the audit trail? |
MaheshPN |
Posted: Thu Apr 07, 2005 6:51 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Thanks Hos,
To answer Ratan's question: yes, I have trace running on the admin server all the time until the system stabilizes. We had several issues in the past, and IBM could not resolve them without a trace. I have trace running only on the admin server, not on any other. So far we have not heard of any performance issues from users. I am planning to switch it off once the system stabilizes.
We usually keep the audit-trail data for 15 days. Due to some business issues, I was not able to clean it up, so it grew to 6 million rows. I have now cleared some 3 million records, and it seems to be working fine. So I was wondering: what is the relation between the size of the audit-trail table and the deadlock?
I was trying to find details about APAR IY69544, but I could not find it on the IBM site. Would you please provide the link?
Thanks,
-Mahesh |