
MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » excessive sd + preferred way to restart DataFlowEngine

excessive sd + preferred way to restart DataFlowEngine
jcv
Posted: Thu Apr 24, 2008 5:30 am    Post subject: excessive sd + preferred way to restart DataFlowEngine

Chevalier

Joined: 07 May 2007
Posts: 411
Location: Zagreb

Hello!

$ uname
AIX
$ ulimit -a|grep desc
nofiles(descriptors) 2000
$ mqsiservice
BIPv600 hr HR
ucnv Console CCSID 912 dft ucnv CCSID 912
ICUW ibm-912_P100-1995 ICUA ibm-912_P100-1995

BIP8071I: Successful command completion.

lsof reports an excessive number of socket descriptors for the DataFlowEngine process that runs message flows issuing HTTP requests through the HTTPRequest node. In fact, the limit of 2000 has been reached:

$ lsof|grep DataFlow|grep 335944|wc -l
2001
$

Entries look like this:

DataFlowE 335944 mqm 1999u IPv4 0xf1000d0002d1f288 0t0 TCP host1:*->host2:XXXX
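
To see which remote endpoint accounts for most of the descriptors, the lsof output can be grouped by its NAME column. The following is a minimal sketch, not a tested diagnostic: the function name count_sockets_by_name is hypothetical, and it assumes the default lsof column layout shown in the entry above (PID in column 2, TYPE in column 5, NAME in column 9).

```shell
#!/bin/sh
# Hypothetical helper: count IPv4/IPv6 socket entries per NAME value for one PID.
# Reads lsof-formatted lines on stdin, so it also works on saved lsof output.
count_sockets_by_name() {
    pid="$1"
    awk -v pid="$pid" '
        # Keep only socket entries that belong to the requested process.
        $2 == pid && ($5 == "IPv4" || $5 == "IPv6") { n[$9]++ }
        END { for (k in n) print n[k], k }
    ' | sort -rn
}

# Example invocation (assumes lsof is installed and 335944 is the DataFlowEngine PID):
# lsof -p 335944 | count_sockets_by_name 335944
```

A single NAME value with a count close to the descriptor limit would point at one target host and port as the source of the leak.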

netstat does not show any network connections between those two hosts, in any state. Is this a known issue? We are going to apply some patches there. Could there be an error in the message flow?
It's hard for me to trace right now under what circumstances this leak happens, but it obviously does not happen every time the HTTPRequest node processes a request, because in that case the limit would be reached within a few minutes.

Besides that, when I restarted the message flows using the toolkit commands (corresponding to mqsistopmsgflow and mqsistartmsgflow), I resolved part of the problem: the execution group was able to process further messages. However, that command does not restart the DataFlowEngine process itself, hence no descriptors were freed. I know all processes would be restarted if I stopped the whole broker, but I don't want to do that. What's left? Killing the process at the OS level to free the descriptors? Am I missing some mqsi command?
jcv
Posted: Thu Apr 24, 2008 5:36 am

Chevalier

Joined: 07 May 2007
Posts: 411
Location: Zagreb

In fact, the trace and debug options now seem to me the only proper way to solve the problem...
jcv
Posted: Thu Apr 24, 2008 6:51 am

Chevalier

Joined: 07 May 2007
Posts: 411
Location: Zagreb

The symptom of the problem was that the message flow was backing out input MQ messages to the backout queue, and the following was logged:

$ tail -10 /tmp/syslog.out
Apr 22 18:52:07 host1 user:err|error last message repeated 4 times
Apr 22 21:12:42 host1 user:warn|warning WebSphere Broker v6003[335944]: syslog: fopen on /dev/null failed, errno 24
Apr 24 10:49:59 host1 user:info WebSphere Broker v6003[335944]: syslog: fopen on /dev/null failed, errno 24

That is when the descriptor limit was exhausted. Now I'll restart the DataFlowEngine and try to catch when it leaks socket descriptors next time.
It looks like it may be an application problem as well as a DataFlowEngine problem.
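
errno 24 is EMFILE ("Too many open files"), which matches the exhausted limit. To catch the leak before the limit is hit, the descriptor count can be compared against the limit at intervals. This is a rough sketch under stated assumptions: check_fd_usage is a hypothetical helper, the 80% default threshold is arbitrary, and the caller supplies both the count and the limit.

```shell
#!/bin/sh
# Hypothetical helper: check_fd_usage USED LIMIT [WARN_PCT]
# Prints OK or WARN; returns 1 when USED is at or above WARN_PCT percent of LIMIT.
check_fd_usage() {
    used="$1"
    limit="$2"
    warn_pct="${3:-80}"            # assumed default threshold, adjust as needed
    pct=$(( used * 100 / limit ))  # integer percentage, rounded down
    if [ "$pct" -ge "$warn_pct" ]; then
        echo "WARN: $used of $limit descriptors in use (${pct}%)"
        return 1
    fi
    echo "OK: $used of $limit descriptors in use (${pct}%)"
}

# Example invocation (assumes lsof is installed; the count is approximate because
# lsof output also includes a header line and non-descriptor entries):
# check_fd_usage "$(lsof -p 335944 | wc -l)" 2000
```

Run from cron or a loop with a timestamp, this would show roughly when the count starts to climb. The limit is passed explicitly because `ulimit -n` in an interactive shell may differ from the limit the broker process was started with.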
jcv
Posted: Fri Jun 06, 2008 1:19 am

Chevalier

Joined: 07 May 2007
Posts: 411
Location: Zagreb

The socket descriptor leak in the HTTPRequest node was confirmed by L3 support.
When a request is issued to the wrong port, the socket remains open. I noticed it in the test environment during a stress test, while the web service was temporarily stopped. The only remedy is to restart the broker, since the mqm user runs out of descriptors.
In the production environment we have web service availability detection implemented in the application layer, hence the bug was not exposed there. A fix is expected.
sandeepdaggupati
Posted: Mon Feb 09, 2009 10:30 am

Novice

Joined: 29 Aug 2008
Posts: 11

What was the fix for this problem? I am also facing the same issue.