Author |
Message
|
MaheshPN |
Posted: Mon Sep 26, 2005 7:27 am Post subject: WAS Crash due to WF JNI calls |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Hi Guys,
We are running WF 3.4, and sometimes WAS crashes with Signal 6 or Signal 11 along with a binary core, and sometimes with just a binary core.
A sample core looks like this:
in pthread_kill at 0xd005cb84 ($t24148)
0xd005cb84 (pthread_kill+0xa8) 80410014 lwz r2,0x14(r1)
pthread_kill(??, ??) at 0xd005cb84
_p_raise(??) at 0xd005c18c
raise.raise(??) at 0xd01e89cc
abort() at 0xd01f7708
myabort__3stdFv() at 0xd0772da8
terminate__3stdFv() at 0xd07716e0
terminate__Fv() at 0xd07724ac
except.__DoThrow() at 0xd2e8ca68
fmckInvariant__FPCcPCcUi() at 0xd2e03fec
fmcjdint.__dt__9FmcStringFv() at 0xd337e900
fmcjdint.__dt__11FmcSystemIDFv() at 0xd337e268
__dt__10FmcjThreadFv() at 0xd337d2f8
CleanupThread__17FmcjApiGlobalCommFv() at 0xd337d18c
Cleanup__FPv() at 0xd3393ea8
_specific_data_cleanup(??) at 0xd00625f8
pthread_exit(??) at 0xd0053214
_start(??) at 0xd0e43e30
_pthread_body(??) at 0xd004d424
There are no errors on the WF server side.
IBM suggested that this is due to WF JNI calls and should go away after installing 3.5 and using native Java calls. Deploying a new version to production always takes time, so I was wondering whether anybody else has had the same issue.
I have a filtered audit trail running on the system. Could that be causing the issue, or is it related to the JDK?
Thanks in advance!!!
-Mahesh |
JLRowe |
Posted: Mon Sep 26, 2005 8:21 am Post subject: |
|
|
 Yatiri
Joined: 25 May 2002 Posts: 664 Location: South East London
|
What platform are you on?
Have you considered upgrading the JVM? |
MaheshPN |
Posted: Mon Sep 26, 2005 8:28 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Hi,
We are on AIX 5.x.
Quote: |
Have you considered upgrading the JVM |
Yes. In the beginning, IBM suggested that we upgrade the JDK and WAS.
It did not help. Our current versions are WAS 5.0.2.10 and JDK 1.3.1.8.
Thanks,
-Mahesh |
JLRowe |
Posted: Mon Sep 26, 2005 9:59 am Post subject: |
|
|
 Yatiri
Joined: 25 May 2002 Posts: 664 Location: South East London
|
The only other thing I can suggest is to search the IBM JDK forums, which you can find on the developerWorks site. I've seen quite a few discussions about JVM crashes on AIX there. |
MQRR |
Posted: Mon Sep 26, 2005 10:30 am Post subject: |
|
|
Centurion
Joined: 10 Aug 2003 Posts: 110
|
hi mahesh,
We had a similar problem. It had something to do with a HUP signal being issued to the JVM that runs the WSAS server. I do not have the details since I was not involved, but if you have access, look into PMR# 57359227 and see if it helps you.
MQRR |
hos |
Posted: Mon Sep 26, 2005 11:38 pm Post subject: |
|
|
Chevalier
Joined: 03 Feb 2002 Posts: 470
|
Mahesh,
this is the number one problem in the MQWF community right now! You can look at any number of PMRs, all of them dealing with these symptoms. The problem seems to be that WebSphere Application Server does not keep sufficient control over the resource consumption of JNI calls - actually it is a twilight zone whether they are supported at all. On the other hand, the C++ runtime environment is not aware that it is running in the context of a big JVM process that has its own resource management. So when load increases, the system runs into resource conflicts that end up in access violations or memory corruption.
The only way to overcome this in a WebSphere environment is to use the native Java API. This implies at minimum MQWF version 3.5, preferably 3.6, as there the native Java API is a built-in feature. I am not saying that an MQWF upgrade is a piece of cake. You should weigh the pain of server crashes (by the way: with the native Java API only the application can crash instead of the whole AppServer) against the pain of an MQWF upgrade.
Contact the support team when you decide to upgrade! There are many PMRs that have been solved by this solution. |
MaheshPN |
Posted: Tue Sep 27, 2005 6:30 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Thanks Hos and MQRR. I have opened a PMR on this too, and today I got a suggestion from IBM to reduce the Workflow object size or the number of workitems. Well, it's difficult to find out which one caused it, and I cannot add a filter because this is a business requirement. Since the number of transactions has increased over time, we are seeing this crash very often.
Regarding the PMR number that MQRR mentioned, I tried to pull it from the IBM support site, but I found out that we are only allowed to see our own company's PMRs. So, if possible, can you paste the details here?
Hos, your explanation makes sense to me. Regarding the PMRs you talked about, were they opened against 3.6, or are they general issues I need to be aware of before upgrading? I have seen that there are some issues (APIs that do not work) in 3.6. So I was leaning towards installing 3.5 and applying the latest support pack. Do you have any suggestions?
Thanks for the response guys!!
-Mahesh |
hos |
Posted: Tue Sep 27, 2005 7:28 am Post subject: |
|
|
Chevalier
Joined: 03 Feb 2002 Posts: 470
|
Mahesh,
you have the upgrade pain anyway, so why not just do it once and go to 3.6?
A problem might be that you need to upgrade several other corequisite software versions, too. Operating system level, MQ, database, and AppServer come to mind. Hence my recommendation to check this with IBM's support team. As already mentioned: you get the native Java API as a built-in feature in 3.6, including the support in the configuration utility. I do not know of a technical reason that would prevent you from migrating to 3.6.
The PMRs I was talking about were against MQWF 3.4. |
jmac |
Posted: Tue Sep 27, 2005 7:38 am Post subject: |
|
|
 Jedi Knight
Joined: 27 Jun 2001 Posts: 3081 Location: EmeriCon, LLC
|
Mahesh:
Just to reiterate what Volker is saying: why go through the upgrade pain twice? If possible, go directly to V3.6. I have run into a few minor issues (on 3.6), but nothing that is a show stopper, and all are pretty easy to work around. _________________ John McDonald
RETIRED |
MaheshPN |
Posted: Tue Sep 27, 2005 8:26 am Post subject: |
|
|
 Master
Joined: 21 May 2003 Posts: 245 Location: Charlotte, NC
|
Given the confidence you guys have in 3.6, I am thinking of going to 3.6 directly.
If I remember correctly, there were some issues with sorting workitems in 3.6. Sorting the worklist is a very important function in our business. So I probably need to wait until at least 3.6.0 SP1 is released. I am guessing that should be published soon.
Thanks,
-Mahesh |
MQRR |
Posted: Tue Sep 27, 2005 11:15 am Post subject: |
|
|
Centurion
Joined: 10 Aug 2003 Posts: 110
|
mahesh,
How are you starting your WebSphere Application Server?
It should be something like this, I guess:
/$WSROOT/websphere/appserver/bin/startServer.sh server1
Add "nohup" in front:
nohup /$WSROOT/websphere/appserver/bin/startServer.sh server1
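For anyone who wants to see what nohup actually changes, here is a minimal sketch, assuming a POSIX shell. The `sleep` commands are only stand-ins for the start script and nothing in it is specific to WebSphere or MQWF. It starts one background process normally and one under nohup, sends both a HUP, and reports what happened to each.

```shell
#!/bin/sh
# Sketch only: "sleep" is a hypothetical stand-in for the real start script.

# Start one background process normally and one under nohup.
sleep 5 &
plain_pid=$!
nohup sleep 10 >/dev/null 2>&1 &
nohup_pid=$!

# Give nohup a moment to set SIGHUP to "ignore" and exec sleep.
sleep 1

# Send the hangup signal to both processes.
kill -HUP "$plain_pid" "$nohup_pid"

# Collect the exit status of the plain process. Shells usually report
# 128 + the signal number when a process was killed by a signal.
plain_status=0
wait "$plain_pid" || plain_status=$?

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$nohup_pid" 2>/dev/null; then
    nohup_state=alive
else
    nohup_state=dead
fi

echo "without nohup: exit status $plain_status"
echo "with nohup:    $nohup_state"

# Clean up the surviving stand-in process.
kill "$nohup_pid" 2>/dev/null || true
```

The ignored-SIGHUP setting is inherited by whatever the nohup'd command goes on to start, which is why putting nohup in front of the start script can protect the JVM it launches. Whether that cures the crash here depends on whether a hangup is really what is reaching the JVM.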
Let me know if this solves the problem.
MQRR |