MQSeries.net Forum Index » Workflow Engines - IBM MQ Workflow & Business Process Choreographer » Admin Server goes Down
MaheshPN
Posted: Mon Nov 01, 2004 1:39 pm    Post subject: Admin Server goes Down

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

Hi guys,
For the past week our admin server has repeatedly stopped responding, and I see this error message in the syslog:

11/01/04 11:30:51 FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen

If I bring WF back up, it works for some time and then goes down again. All the other servers (execution and scheduling servers) keep running.

Any idea why this suddenly started happening?


I also have some entries from the error log.

WebSphere MQ Workflow 3.4 Error Report

Report creation = 11/01/04 08:54:30
Related message = FMC31050E An error has occurred which has terminated processing.

Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen

WebSphere MQ Workflow 3.4 Error Report

Report creation = 11/01/04 10:52:45
Related message = FMC31050E An error has occurred which has terminated processing.

Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=153, Function=fmckRequire(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Pre-condition failed in /projects/fmc/drvp/lbld/v340/src/fmcscbrg.cxx(124): !_fIsInConstMode

WebSphere MQ Workflow 3.4 Error Report

Report creation = 11/01/04 11:30:51
Related message = FMC31050E An error has occurred which has terminated processing.

Error location = File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
Error data = FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen
[o5aps004] fmc /var/fmc/cfgs/BOS/log>
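
To see how often the admin server is failing, the report timestamps can be pulled out of the error log. The following is a minimal sketch, assuming a POSIX shell with awk; `report_times` is a hypothetical helper name, and the log file name in the usage comment is an assumption rather than something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the timestamp of every error report read from stdin.
# It relies only on the "Report creation = <timestamp>" lines shown above.
report_times() {
    awk -F' = ' '/^Report creation/ { print $2 }'
}

# Usage (file name is an assumption; adjust to your log directory):
#   report_times < fmcerr.log
```

Each output line is one report, so the gaps between consecutive timestamps give a rough idea of how long the server stays up between failures.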


If anybody has seen these kinds of symptoms before, please let me know.

Thanks,
-Mahesh
mike_mq
Posted: Mon Nov 01, 2004 3:58 pm

Centurion

Joined: 17 Oct 2003
Posts: 123

Can you tell us which versions you are using?
MaheshPN
Posted: Mon Nov 01, 2004 5:48 pm

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

WF 3.4.4 on AIX 5.1 ML 6
DB2 7.2.9

Thanks,
-Mahesh
MaheshPN
Posted: Fri Apr 01, 2005 10:43 am

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

Hi guys,
Our admin server died again with the same error.
I have switched on tracing, and here is the output.
2005-04-01, 12:40:31.077, fmckdbg.cxx( 213) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), @$ Ctx $@ AIID=OID(00000000000000000000000000000000),0,OID(00000000000000000000000000000000),0@$ Ctx $@ AIName=<null>@$ Ctx $@ CliID=<null>, with name: <null>, group: <null>@$ Ctx $@ PeaID=with type: 0, name: <null>, system: <null>, system group: <null>@$ Ctx $@ PeaName=<null>@$ Ctx $@ PIID=OID(00000000000000000000000000000000),OID(00000000000000000000000000000000)@$ Ctx $@ PIName=<null>@$ Ctx $@ PTID=OID(00000000000000000000000000000000)@$ Ctx $@ PTName=<null>@$ Ctx $@ SvrID=with name: <null>, group: <null>@$ Ctx $@ SvrName=<null>@$ Ctx $@ Xact=<null>@$ Ctx $@ MsgType= n/a@$ Ctx $@ UID=<null>, with name: <null>, group: <null>@$ Ctx $@ WIID=OID(00000000000000000000000000000000)@$ Ctx $@ WIName=<null>Assertion fired
2005-04-01, 12:40:31.077, fmckdbg.cxx( 222) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), THROW_INT, FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen, Origin=File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)
2005-04-01, 12:40:31.077, fmckdbg.cxx( 213) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), @$ Ctx $@ AIID=OID(00000000000000000000000000000000),0,OID(00000000000000000000000000000000),0@$ Ctx $@ AIName=<null>@$ Ctx $@ CliID=<null>, with name: <null>, group: <null>@$ Ctx $@ PeaID=with type: 0, name: <null>, system: <null>, system group: <null>@$ Ctx $@ PeaName=<null>@$ Ctx $@ PIID=OID(00000000000000000000000000000000),OID(00000000000000000000000000000000)@$ Ctx $@ PIName=<null>@$ Ctx $@ PTID=OID(00000000000000000000000000000000)@$ Ctx $@ PTName=<null>@$ Ctx $@ SvrID=with name: <null>, group: <null>@$ Ctx $@ SvrName=<null>@$ Ctx $@ Xact=<null>@$ Ctx $@ MsgType= n/a@$ Ctx $@ UID=<null>, with name: <null>, group: <null>@$ Ctx $@ WIID=OID(00000000000000000000000000000000)@$ Ctx $@ WIName=<null>Assertion fired
2005-04-01, 12:40:31.077, fmckdbg.cxx( 222) (00,Er,Kr), fmcamain(1327170- 1), fmckInvariant(), THROW_INT, FmcAssertionException, Condition=*** Invariant failed in /projects/fmc/drvp/lbld/v340/aix/src/fmckbin.inl(51): _pRep==0 || _pRep->_len <= _pRep->_allocLen, Origin=File=/projects/fmc/drvp/lbld/v340/aix/src/fmckdbg.cxx, Line=222, Function=fmckInvariant(const char *, const char *, unsigned int)

I'm not sure what it means. Let me know if anybody else has faced this issue.

The current version is WF 3.4.6 and MQ 5.3 CSD 8.

Thanks,
-Mahesh
vennela
Posted: Fri Apr 01, 2005 10:51 am

Jedi Knight

Joined: 11 Aug 2002
Posts: 4055
Location: Hyderabad, India

Looks like a DB issue, but I might be wrong.
Have any FDCs been cut?
What does the db2diag.log say?
MaheshPN
Posted: Fri Apr 01, 2005 11:54 am

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

No FDCs have been cut, and our DBA does not see any errors in db2diag.log. I am leaning towards a memory allocation problem.
I'm not sure whether I am thinking in the right direction.

Thanks,
-Mahesh
hos
Posted: Mon Apr 04, 2005 12:37 am

Chevalier

Joined: 03 Feb 2002
Posts: 470

Hi,

I also think that there is a memory allocation problem.
You should use the EXTSHM=ON environment variable to expand shared memory allocation.

- the user mqm should have export EXTSHM=ON in its .profile
- the fmc user, which typically starts the trigger monitor, should also have it in its .profile
- when starting the DB2 server:
export EXTSHM=ON
db2set DB2ENVLIST=EXTSHM
db2start

If this doesn't help, I recommend opening a PMR.
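
To confirm that the setting above is actually visible to a given user, a small check like the following may help. It is a minimal sketch, assuming a POSIX shell; `check_extshm` is a hypothetical helper name, not part of MQ Workflow, MQ, or DB2.

```shell
#!/bin/sh
# Hypothetical helper: report whether EXTSHM is set to ON in the current shell.
# Prints a one-line status and returns non-zero when the value is not ON.
check_extshm() {
    if [ "${EXTSHM:-}" = "ON" ]; then
        echo "EXTSHM=ON"
    else
        echo "EXTSHM is not ON (current value: '${EXTSHM:-}')"
        return 1
    fi
}
```

Run `check_extshm` in a fresh login shell for the mqm user, the fmc user, and the DB2 instance owner. On the DB2 side, `db2set -all` should list `DB2ENVLIST=EXTSHM` once the registry variable has been set.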
MaheshPN
Posted: Wed Apr 06, 2005 11:19 am

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

Thanks Hos,

Those variables were set during the DB2 upgrade from 7.2 to 8.1.

I have trace level 99 switched on for the admin server. Could that be causing this issue?
I was reading the README of SP7. Here is what I found.

Quote:
PMR31223 Some trace settings lead to a shutdown of the administration server.


Also, I have opened a PMR with IBM, and their initial response is that it may be a deadlock issue. Unfortunately we can't find anything about that in any log. I just wonder: could the size of the audit log (6 million rows) cause this issue?

Any thoughts?

Thanks,
-Mahesh
Ratan
Posted: Wed Apr 06, 2005 3:01 pm

Grand Master

Joined: 18 Jul 2002
Posts: 1245

Do you mean you have your trace running all the time?
_________________
-Ratan
hos
Posted: Thu Apr 07, 2005 5:23 am

Chevalier

Joined: 03 Feb 2002
Posts: 470

I think they are right. If you are running multiple systems (i.e. multiple admin servers), your admin server may run into a deadlock and die (see APAR IY69544). By the way: SQL913 can be a DB timeout or deadlock. So yes, the huge number of audit trail records may cause the problem. Do you need all of them? Do you use fmcsclad to clean up the audit trail?
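
One way to check the deadlock theory on the database side is to look at the lock counters in a DB2 database snapshot. The following is a minimal sketch, assuming the snapshot text contains `Deadlocks detected` and `Lock Timeouts` lines; `lock_counters` is a hypothetical helper name and `FMCDB` is a placeholder database name, not one given in this thread.

```shell
#!/bin/sh
# Hypothetical helper: extract the deadlock and lock-timeout counters
# from DB2 snapshot text read on stdin, printed as name=value pairs.
lock_counters() {
    awk -F= '/^[ \t]*(Deadlocks detected|Lock Timeouts)[ \t]*=/ {
        # Trim the padding around the counter name and its value.
        gsub(/^[ \t]+|[ \t]+$/, "", $1)
        gsub(/^[ \t]+|[ \t]+$/, "", $2)
        print $1 "=" $2
    }'
}

# Usage (database name is a placeholder):
#   db2 get snapshot for database on FMCDB | lock_counters
```

Counters that keep rising between two snapshots taken while the admin server is running would support the deadlock or timeout explanation; counters that stay at zero would point elsewhere.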
MaheshPN
Posted: Thu Apr 07, 2005 6:51 am

Master

Joined: 21 May 2003
Posts: 245
Location: Charlotte, NC

Thanks Hos,
To answer Ratan's question: yes, I have the trace running on the admin server all the time until the system is stabilized. We had several issues in the past, and IBM could not resolve them without a trace. I have the trace running only on the admin server, not on any other. So far we have not heard of any performance issues from users. I am planning to switch it off once the system stabilizes.
We usually keep the audit trail data for 15 days. Due to some business issues, I was not able to clean it up, so it grew to 6 million rows. I have now cleared some 3 million records and it looks like it is working fine. So I was wondering: what is the relation between the size of the audit trail table and the deadlock?
I was trying to find the details of APAR IY69544, but I could not find it on the IBM site. Would you please provide me the link?

Thanks,
-Mahesh