DataFlowEngine Stops and Restarts with BIP2228E |
MikeC |
Posted: Fri Sep 17, 2004 10:25 am Post subject: DataFlowEngine Stops and Restarts with BIP2228E |
|
|
 Acolyte
Joined: 30 Jun 2003 Posts: 55 Location: Toronto, Canada
|
The DataFlowEngine in our production environment is stopping and restarting for no apparent reason. When it does, messages are dumped in the failure queue. The syslog shows the following error messages:
Sep 17 04:02:13 uamqsi10 WMQIv210[19482]: (MKBK.gr00execgroup)[3342]BIP2228E: Severe error: /build/S210_P/src/CommonServices/Unix/ImbAbend.cpp 417 signal received Abend file: /var/mqsi/errors/MKBK.gr00execgroup.19482.3342.abend action: abort
Sep 17 04:02:14 uamqsi10 WMQIv210[44416]: (MKBK)[1286]BIP2060W: The broker has detected that the Execution Group gr00execgroup, process ID 19482, has shutdown. : MKBK.agent: /build/S210_P/src/AdminAgent/ImbAdminAgent.cpp: 3789: ImbAdminAgent::startAndMonitorADataFlowEngine: :
Sep 17 04:02:14 uamqsi10 WMQIv210[23116]: (MKBK.gr00execgroup)[1]BIP2201I: Execution Group started: process '23116'; thread '1'; additional information: brokerName 'MKBK'; executionGroupUUID '1e959ed0-fd00-0000-0080-c0c0fec02bdd'; executionGroupLabel 'gr00execgroup'; defaultExecutionGroup 'false'; queueManagerName 'TMK1'; trusted 'false'; dataSourceName 'DMKB'; userId 'UMKGR'; migrationNeeded 'false'; brokerUUID 'f6c59dd0-fd00-0000-0080-c0c0fec02bdd'; filePath '/usr/opt/mqsi'; workPath '/var/mqsi'. : MKBK.1e959ed0-fd00-0000-0080-c0c0fec02bdd: /build/S210_P/src/DataFlowEngine/ImbMain.cpp: 215: main: :
I have sent the abend file to IBM (it doesn't mean much to me), but they have been slow to respond. I didn't post it here, as it's fairly large. Has anyone seen or experienced this before?
Thanks very much,
-Mike. _________________ -Mike. |
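To see how often the abend occurs, the BIP2228E entries can be pulled out of the syslog. The following is a minimal sketch only: `list_abends` is a hypothetical helper name, it assumes each syslog entry sits on a single line in the layout shown in the post above, and the syslog path in the usage comment is a placeholder because the real location depends on the local syslog configuration.

```shell
#!/bin/sh
# list_abends (hypothetical helper): read syslog text on stdin and print
# "<timestamp> <abend file>" for every BIP2228E entry.
# Assumption: one syslog entry per line, formatted like the entries quoted above.
list_abends() {
    grep 'BIP2228E' |
    sed -n 's/^\([A-Z][a-z][a-z]  *[0-9][0-9]* [0-9:]*\) .*Abend file: \([^ ]*\) .*$/\1 \2/p'
}

# Usage (the path is a placeholder, adjust to your syslog configuration):
#   list_abends < /path/to/syslog
```

Counting the output lines per day gives a rough picture of whether the restarts cluster around particular times or workloads.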
jefflowrey |
Posted: Fri Sep 17, 2004 10:28 am Post subject: |
|
|
Grand Poobah
Joined: 16 Oct 2002 Posts: 19981
|
If IBM is slow to respond, then you need to a) call them back, a lot, b) raise the severity level of your ticket until they DO respond, c) complain. _________________ I am *not* the model of the modern major general. |
JohnMetcalfe |
Posted: Mon Sep 20, 2004 12:20 am Post subject: |
|
|
 Apprentice
Joined: 02 Apr 2004 Posts: 40 Location: Edinburgh, Scotland
|
Hi Mike,
I'd start by checking the memory usage of the execution group. On UNIX and NT, execution groups are allocated 512MB by default; when memory use approaches this limit, the execution group restarts and dumps the message to the designated backout queue. We have hit similar issues in capacity testing and are in the process of trying to resolve them; see the thread at http://www.mqseries.net/phpBB2/viewtopic.php?t=17660
Are you processing larger messages (>1MB) at all?
Cheers |
mgk |
Posted: Mon Sep 20, 2004 1:10 am Post subject: |
|
|
 Padawan
Joined: 31 Jul 2003 Posts: 1642
|
Hi, This statement is not entirely correct:
Quote: |
On UNIX and NT, execution groups are allocated 512MB by default |
I believe this is only the case on AIX by default. Certainly all other platforms are allowed to grow their heaps to the size of available user-mode VM (usually 2GB on NT, 3GB on Solaris, 2GB on HP-UX, 2GB on z/OS, 2GB on Linux, etc.). We do not impose any artificial limits on memory use...
Cheers, _________________ MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. |
JohnMetcalfe |
Posted: Mon Sep 20, 2004 1:32 am Post subject: |
|
|
 Apprentice
Joined: 02 Apr 2004 Posts: 40 Location: Edinburgh, Scotland
|
I stand corrected. In my defence, we are running on UNIX, so that part was first hand; the NT setup was my second-hand understanding from another team. On UNIX we are seeing our execution groups restart after approximately 390MB of memory used, despite having 512MB allocated. |
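One way to check whether an execution group is approaching that kind of ceiling is to watch the size of its DataFlowEngine process. This is a rough sketch, not a broker tool: `flag_large_dfe` is a hypothetical helper, it assumes input in the `pid vsz args` layout of `ps -eo pid,vsz,args` with `vsz` reported in 1024-byte units, and the 350MB threshold in the usage comment is an arbitrary example chosen to sit below the roughly 390MB mentioned above.

```shell
#!/bin/sh
# flag_large_dfe (hypothetical helper): read "pid vsz_kb command ..." lines on
# stdin and print the PID and size in MB of each DataFlowEngine process whose
# virtual size is at or above the given threshold in MB.
# Assumption: vsz is in 1024-byte units, as produced by `ps -eo pid,vsz,args`.
flag_large_dfe() {
    limit_mb=$1
    awk -v limit="$limit_mb" '
        # Match on the command field only, and skip the ps header line.
        $3 ~ /DataFlowEngine/ && $2 ~ /^[0-9]+$/ {
            mb = int($2 / 1024)
            if (mb >= limit) printf "%s %dMB\n", $1, mb
        }'
}

# Usage (threshold is an example value):
#   ps -eo pid,vsz,args | flag_large_dfe 350
```

Run from cron every few minutes, the output can be lined up against the restart times in the syslog to see whether the restarts track memory growth.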
MikeC |
Posted: Mon Sep 20, 2004 6:22 am Post subject: |
|
|
 Acolyte
Joined: 30 Jun 2003 Posts: 55 Location: Toronto, Canada
|
Thanks guys. This might well be the problem. So, is it possible on AIX to allocate more memory for each execution group, say increase it to a gig? Or what about spreading your flows over multiple execution groups? Would that help? _________________ -Mike. |
Tibor |
Posted: Mon Sep 20, 2004 7:22 am Post subject: |
|
|
 Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
|
MikeC wrote: |
Thanks guys. This might well be the problem. So, is it possible on AIX to allocate more memory for each execution group, say increase it to a gig? Or what about spreading your flows over multiple execution groups? Would that help? |
Mike,
We raised the hard limit built into the DataFlowEngine executable, and it has worked fine since then. See this item on the IBM Support site:
http://www-1.ibm.com/support/docview.wss?rs=172&context=SW900&q1=dfe+aix+limit&uid=swg1IY35159&loc=en_US&cs=utf-8&lang=en
It contains this shell script, for the lazy:
Code: |
#!/bin/sh
# Makes copies of an AIX executable, each with a larger address space limit.
# The original executable is left untouched.
if [ $# -ne 1 ] ; then
    echo "\nusage: $0 <executable name>"
    echo "\tmodifies header of AIX executable to enable a larger address space for the executable."
    echo "\n(see http://publib.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixprggd/genprogc/toc.htm"
    echo " and refer to Chapter 8. Large Program Support .)"
    exit 1
fi
EXE=$1
# Each copy has the 4-byte word at byte offset 76 (bs=4, seek=19) overwritten.
# The word is the limit in bytes, e.g. \040\0\0\0 = 0x20000000 = 512MB.
# /usr/bin/echo is used because it interprets the octal escapes;
# dd takes only the first 4 bytes, so the trailing newline is ignored.
cp "${EXE}" "${EXE}_512"
/usr/bin/echo "\040\0\0\0" | dd of="${EXE}_512" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_768"
/usr/bin/echo "\060\0\0\0" | dd of="${EXE}_768" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_1024"
/usr/bin/echo "\0100\0\0\0" | dd of="${EXE}_1024" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_1280"
/usr/bin/echo "\0120\0\0\0" | dd of="${EXE}_1280" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_1536"
/usr/bin/echo "\0140\0\0\0" | dd of="${EXE}_1536" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_1792"
/usr/bin/echo "\0160\0\0\0" | dd of="${EXE}_1792" bs=4 count=1 seek=19 conv=notrunc
cp "${EXE}" "${EXE}_2048"
/usr/bin/echo "\0200\0\0\0" | dd of="${EXE}_2048" bs=4 count=1 seek=19 conv=notrunc
# List the original and all patched copies.
ls -lrt "${EXE}"*
On the other hand, IMHO there are some advantages to multiple execution groups, e.g. you can restart them separately.
Tibor |
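Before swapping a patched copy into place, it may be worth reading the word back to confirm the patch took. The sketch below is an illustration rather than anything from the IBM item: `show_maxdata` is a hypothetical helper that reads the same 4 bytes the script above writes (byte offset 76, i.e. `bs=4` with an offset of 19 blocks) and prints them as hex plus the equivalent size in MB, assuming the word is stored most significant byte first as the script writes it.

```shell
#!/bin/sh
# show_maxdata (hypothetical helper): print the 4-byte word at byte offset 76
# of a file as hex, followed by its value in MB.
# Assumption: the word is big-endian, matching what the patch script writes.
show_maxdata() {
    hex=$(dd if="$1" bs=4 skip=19 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    # Bail out if the file is too short to contain the word.
    [ ${#hex} -eq 8 ] || { echo "could not read 4 bytes at offset 76 of $1" >&2; return 1; }
    bytes=$(printf '%d' "0x$hex")
    echo "0x$hex $((bytes / 1048576))MB"
}

# Usage (file name follows the naming used by the script above):
#   show_maxdata DataFlowEngine_1024
```

For a copy produced by the `_1024` step, the helper should report `0x40000000 1024MB` if the patch was applied as intended.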