SIGSEGV address not mapped - issue with process amqrmppa
parker
Posted: Mon Sep 10, 2012 2:01 am    Post subject: SIGSEGV address not mapped - issue with process amqrmppa

Newbie

Joined: 09 Sep 2012
Posts: 5

Hello All,
The issue started after upgrading MQ from 6.0 to 7.0.1.4.
Initially the PMR advised upgrading to 7.0.1.9, but that didn't help. It also asked us to reduce TRIGINT to 99999.
We thought about reducing TRIGINT to 99999, but I found a situation where generating extra trigger messages is not going to help with this issue.
Let me put together my investigation:

1) A message arrived on queue QL.A.DATA, but it was not processed by the attached trigger process FTS.A.DELIVER.

2) The application a.exe was connected to the queue but did not process the message (I later found that the process was already dead at the time of the FDC). a.exe still appeared in the Windows Task Manager on the client machine, but it did not process messages from the queue.

3) Process id 32137, which is attached to the queue (found from the queue's usage), is running on the MQ server machine:
mqm@somehost~:> ps -ef| grep 32137
mqm 6205 3015 0 08:16 pts/0 00:00:00 grep 32137
mqm 32137 30808 0 05:00 ? 00:00:12 /opt/mqm/bin/amqrmppa -m SOMEQMGR
But an AMQ32137.0.FDC file was generated by process 32137 at the time of the issue.

4) At that time there was an event log error for the client trigger monitor (also in the client's AMQERR01.LOG) on the client machine:
Starting: a.cmd XXXXX "TMC 2QL.A.DATA FTS.A.DELIVER d:\...\..\a.cmd XXXXX XXXXX SOMEQMGR " .
---
The description for Event ID ( 3 ) in Source ( YYYY ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: KFT0003, Failed: MQSeries comp code 2, reason 2009. Function MQGET, on object FTS.DATA.
---
Channel 'SO-CALLED-SVRCONN-CHL' timed out.

A timeout occurred while waiting to receive from the other end of channel 'SO-CALLED-SVRCONN-CHL'. The address of the remote end of the connection was 'aa.bb.cc.dd(1414)'.

The return code from the select() [TIMEOUT] 60 seconds call was 0 (X'0'). Record these values and tell the systems administrator.
---
Remote host 'someclienthost (ee.ff.gg.hh) (1414)' not available, retry later.

The attempt to allocate a conversation using TCP/IP to host 'someclienthost (ee.ff.gg.hh) (1414)' was not successful. However the error may be a transitory one and it may be possible to successfully allocate a TCP/IP conversation later.

Try the connection again later. If the failure persists, record the error values and contact your systems administrator. The return code from TCP/IP is 10061 (X'274D'). The reason for the failure may be that this host cannot reach the destination host. It may also be possible that the listening program at host 'someclienthost (ee.ff.gg.hh) (1414)' was not running. If this is the case, perform the relevant operations to start the TCP/IP listening program, and try again.

5) Retriggering did not work for queue QL.A.DATA with CURDEPTH 1, because the broken application still has the queue open for input (IPPROCS(1)) and still appears in the queue's usage (it never ended).

6) When I changed the trigger type from FIRST to EVERY and retriggered the queue, a new a.exe appeared in the usage, the message was processed, and the new process disappeared shortly afterwards. The broken process is still there, though, and stops the next initiation message.

In short, the client's connection is getting broken and, worse, it never leaves the queue's usage. Could anyone help me with this?
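
For anyone retracing steps 3, 5 and 6, the checks can be scripted. The following is only a sketch: it builds the text of standard MQSC DISPLAY commands for the queue named in this thread, and the helper name stuck_handle_checks is made up for illustration. It does not talk to a queue manager; the output is meant to be pasted or piped into runmqsc.

```python
from typing import List


def stuck_handle_checks(queue: str) -> List[str]:
    """Return MQSC commands that show who still has the queue open."""
    q = "'" + queue + "'"  # quotes keep the object name's case in runmqsc
    return [
        # Depth and open-for-input count (the thread reports CURDEPTH 1, IPPROCS 1).
        f"DISPLAY QSTATUS({q}) TYPE(QUEUE) CURDEPTH IPPROCS OPPROCS",
        # One entry per open handle, including the PID and channel of the holder.
        f"DISPLAY QSTATUS({q}) TYPE(HANDLE) ALL",
        # Trigger settings, to confirm TRIGTYPE before and after a change.
        f"DISPLAY QLOCAL({q}) TRIGGER TRIGTYPE INITQ PROCESS",
    ]


if __name__ == "__main__":
    print("\n".join(stuck_handle_checks("QL.A.DATA")))
```

Assuming the script is saved as checks.py (a hypothetical file name), something like "python checks.py | runmqsc SOMEQMGR" on the server would run the three commands. The TYPE(HANDLE) output is where a leftover handle owned by the amqrmppa PID should show up.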
parker
Posted: Mon Sep 10, 2012 2:06 am

Newbie

Joined: 09 Sep 2012
Posts: 5

The MQ clients are at version 7.0.1.1 and the MQ server is at version 7.0.1.9.
Below is the FDC content:
Quote:
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Fri September 07 2012 09:02:16 UTC |
| UTC Time :- 1347008536.228990 |
| UTC Time Offset :- -240 (EST) |
| Host Name :- mq-server-hostname |
| Operating System :- Linux 2.6.32.23-0.3-default |
| PIDS :- 5724H7230 |
| LVLS :- 7.0.1.9 |
| Product Long Name :- WebSphere MQ for Linux (x86-64 platform) |
| Vendor :- IBM |
| Probe Id :- XC130003 |
| Application Name :- MQM |
| Component :- xehExceptionHandler |
| SCCS Info :- lib/cs/unix/amqxerrx.c, 1.242.1.3 |
| Line Number :- 1401 |
| Build Date :- Jul 18 2012 |
| CMVC level :- p701-109-120718 |
| Build Type :- IKAP - (Production) |
| Effective UserID :- 5000 (mqm) |
| Real UserID :- 5000 (mqm) |
| Program Name :- amqrmppa |
| Addressing mode :- 64-bit |
| Process :- 32137 |
| Process(Thread) :- 879 |
| Thread :- 69 |
| ThreadingModel :- PosixThreads |
| QueueManager :- SOMEQMGR |
| UserApp :- FALSE |
| ConnId(1) IPCC :- 174070 |
| Last HQC :- 1.0.0-354264 |
| Last HSHMEMB :- 0.0.0-0 |
| Major Errorcode :- STOP |
| Minor Errorcode :- OK |
| Probe Type :- HALT6109 |
| Probe Severity :- 1 |
| Probe Description :- AMQ6109: An internal WebSphere MQ error has occurred. |
| FDCSequenceNumber :- 0 |
| Arith1 :- 11 (0xb) |
| Comment1 :- SIGSEGV: address not mapped(0x811000) |
| |
+-----------------------------------------------------------------------------+

O/S Call Stack for current thread
/opt/mqm/lib64/libmqmcs_r.so(xcsPrintStackForCurrentThread+0xa0)[0x7ffadda36280]
/opt/mqm/lib64/libmqmcs_r.so(signalHandlerInternal+0x5c)[0x7ffadda4c76c]
/opt/mqm/lib64/libmqmcs_r.so(PrepareDumpAreas+0xd2)[0x7ffadda4ac82]
/opt/mqm/lib64/libmqmcs_r.so(xcsFFSTFn+0x20d9)[0x7ffadda4f149]
/opt/mqm/lib64/libmqmcs_r.so(xehExceptionHandler+0x625)[0x7ffadda494a5]
/lib64/libpthread.so.0(+0xf5d0)[0x7ffadd6885d0]
/lib64/libc.so.6(memcpy+0x15b)[0x7ffadd198b8b]
/opt/mqm/lib64/libmqmr_r.so(rstSendAsyncMessage+0xa43)[0x7ffade202663]
/opt/mqm/lib64/libmqmr_r.so(rstConsumer+0xe33)[0x7ffade206623]
/opt/mqm/lib64/libmqz_r.so(zstUserCallback+0x72a)[0x7ffadde1959a]
/opt/mqm/lib64/libmqz_r.so(zstAsyncConsume+0xb90)[0x7ffadde1ad00]
/opt/mqm/lib64/libmqz_r.so(zstAsyncConsumeThread+0x73e)[0x7ffadde1e5fe]
/opt/mqm/lib64/libmqmcs_r.so(+0x10ced6)[0x7ffaddaa3ed6]
/lib64/libpthread.so.0(+0x75f0)[0x7ffadd6805f0]
/lib64/libc.so.6(clone+0x6d)[0x7ffadd1eb84d]
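
When several FDC files need comparing, the header box can be reduced to name/value pairs. This is a minimal sketch that assumes the header rows look like the ones quoted above ("| name :- value |"); the function name parse_fdc_header is hypothetical and this is not an IBM tool. The sample rows are copied from the FDC in this post.

```python
from typing import Dict


def parse_fdc_header(text: str) -> Dict[str, str]:
    """Collect 'name :- value' pairs from an FDC header box."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        # Drop the box borders; rows without ':-' (titles, rules) are skipped.
        body = line.strip().strip("|").strip()
        name, sep, value = body.partition(":-")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


SAMPLE = """\
| Probe Id :- XC130003 |
| Program Name :- amqrmppa |
| Process :- 32137 |
| Arith1 :- 11 (0xb) |
| Comment1 :- SIGSEGV: address not mapped(0x811000) |
"""

if __name__ == "__main__":
    header = parse_fdc_header(SAMPLE)
    for key in ("Probe Id", "Program Name", "Process", "Comment1"):
        print(f"{key}: {header[key]}")
```

Probe Id, Program Name and Comment1 are usually the fields worth quoting when searching for a matching APAR or opening a PMR.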
exerk
Posted: Mon Sep 10, 2012 2:19 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

First things first: only the header of the FDC is worth posting, the rest is only meaningful to Hursley support. So, I've deleted the irrelevant bit from your original post.

Secondly: it would be sensible to have the client installation at the same FixPack level as the server installation.

Thirdly: the GET failed on object FTS.DATA, so what is QL.A.DATA used for?

And, lastly: was just the server upgraded to the later WMQ version, or both server and client? If the client also, was a.exe recompiled against the new libraries?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
parker
Posted: Mon Sep 10, 2012 1:00 pm

Newbie

Joined: 09 Sep 2012
Posts: 5

Thanks for your reply, and sorry for the mess.

Thirdly: the GET failed on object FTS.DATA, so what is QL.A.DATA used for?
Ans: My mistake. FTS.DATA is QL.A.DATA; I missed changing it.

And, lastly: was just the server upgraded to the later WMQ version, or both server and client? If the client also, was a.exe recompiled against the new libraries?
Ans: Only the server was upgraded. The thing is, this issue happens randomly. When we run the target application manually, it works. It even works when we set the trigger type to EVERY and retrigger it.

I compared some of the channel attributes against another queue manager that does the same work on MQ V6.
I suspect it might be due to the SHARECNV attribute introduced in MQ V7.

Any help in resolving the issue would be appreciated.
mvic
Posted: Mon Sep 10, 2012 1:19 pm

Jedi

Joined: 09 Mar 2004
Posts: 2080

You have a memory exception (SIGSEGV) in your channel thread, but this is not likely to be your fault unless (unlikely?) you have a bad piece of code - eg. a user-written "exit" - in your amqrmppa process that is overwriting memory somehow. Did IBM review the above FDC file from your 7.0.1.9? What did they say? It looks like this is repeatable so you will probably be asked to capture a recreate inside an MQ trace and send it to IBM support.
parker
Posted: Tue Sep 11, 2012 5:18 pm

Newbie

Joined: 09 Sep 2012
Posts: 5

Yes, IBM checked the FDCs from 7.0.1.9, but the stackit data was not useful for their investigation. They have asked us to capture a trace, and I am in the process of sending the trace data.
bruce2359
Posted: Tue Sep 11, 2012 5:54 pm

Poobah

Joined: 05 Jan 2008
Posts: 9469
Location: US: west coast, almost. Otherwise, enroute.

Thanks for the update, Parker.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
parker
Posted: Thu Sep 20, 2012 6:36 pm

Newbie

Joined: 09 Sep 2012
Posts: 5

Fix provided in the PMR: Looking at the initial FDC, it is possible that part of one of WMQ's data structures has been overwritten by a transmission buffer. We were advised to turn *off* all channel shared conversations, i.e. use SHARECNV(0) for all channels.

All of this processing on multiple threads occurred because, by default, V7 uses asynchronous delivery to the client. We were advised to set SHARECNV(0); the only side effect is that SVRCONN channel multiplexing reverts to the same mechanism as WMQ V6.

We set SHARECNV(0) on the SVRCONN channel(s), and this fixed the issue. Hope this helps someone!
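
For readers who want to apply the same workaround to several channels, here is a small sketch that generates the MQSC text. The function name disable_shared_conversations is hypothetical, and the channel name is the placeholder from the error log earlier in this thread. The script only prints commands and does not change anything itself.

```python
from typing import Iterable, List


def disable_shared_conversations(channels: Iterable[str]) -> List[str]:
    """Build one ALTER CHANNEL command per SVRCONN channel, setting SHARECNV(0)."""
    return [
        f"ALTER CHANNEL('{name}') CHLTYPE(SVRCONN) SHARECNV(0)"
        for name in channels
    ]


if __name__ == "__main__":
    # Placeholder channel name taken from the error log quoted in this thread.
    for command in disable_shared_conversations(["SO-CALLED-SVRCONN-CHL"]):
        print(command)
```

The printed lines can be reviewed and then fed to runmqsc for the queue manager. As far as I know, the new SHARECNV value only applies to channel instances started after the change, so connected clients would need to reconnect before the behaviour changes.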
fjb_saper
Posted: Thu Sep 20, 2012 7:27 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

Just for testing, would setting SHARECNV(1) have the same effect? Full duplex, but still only 1 conversation per socket.
_________________
MQ & Broker admin
Andyh
Posted: Thu Sep 20, 2012 10:58 pm

Master

Joined: 29 Jul 2010
Posts: 239

Using SHARECNV(1) would NOT have the same effect.
The MCA would still implement a client MQGET using async consume when SHARECNV(1) is used.
fjb_saper
Posted: Fri Sep 21, 2012 9:59 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20756
Location: LI,NY

Andyh wrote:
Using SHARECNV(1) would NOT have the same effect.
The MCA would still implement a client MQGET using async consume when SHARECNV(1) is used.


So you're telling us that what he ran into was not a threading problem but an async consume problem... interesting...
_________________
MQ & Broker admin