Author |
Message
|
zaldyman3 |
Posted: Tue Oct 29, 2013 7:16 pm Post subject: Windows 2012 MQ 7.5, 7.5.0.1 or 7.5.0.2 configuration |
|
|
Novice
Joined: 09 Jan 2013 Posts: 10
|
Hi folks,
Just wondering if anyone has encountered a channel that goes into retry after every reboot on a Windows 2012 server?
Qmgr: MQTK7001
MQ hub: MQTK0001 (Multi-instance Qmgr)
Sender Channel definition:
dis CHANNEL(MQTK7001.MQTK0001) all
2 : dis CHANNEL(MQTK7001.MQTK0001) all
AMQ8414: Display Channel details.
CHANNEL(MQTK7001.MQTK0001) CHLTYPE(SDR)
ALTDATE(2013-10-17) ALTTIME(11.24.23)
BATCHHB(0) BATCHINT(0)
BATCHLIM(5000) BATCHSZ(50)
COMPHDR(NONE) COMPMSG(NONE)
CONNAME(kutmqm003(1414),kutmqm004(1414))
CONVERT(NO) DESCR( )
DISCINT(6000) HBINT(300)
KAINT(AUTO) LOCLADDR( )
LONGRTY(999999999) LONGTMR(1200)
MAXMSGL(4194304) MCANAME( )
MCATYPE(PROCESS) MCAUSER( )
MODENAME( ) MONCHL(QMGR)
MSGDATA( ) MSGEXIT( )
NPMSPEED(FAST) PASSWORD( )
PROPCTL(COMPAT) RCVDATA( )
RCVEXIT( ) RESETSEQ(1)
SCYDATA( ) SCYEXIT( )
SENDDATA( ) SENDEXIT( )
SEQWRAP(999999999) SHORTRTY(10)
SHORTTMR(60) SSLCIPH( )
SSLPEER( ) STATCHL(QMGR)
TPNAME( ) TRPTYPE(TCP)
USEDLQ(YES) USERID( )
XMITQ(MQTK0001)
Multi-Instance Qmgr
$ dspmqx
QMNAME(MQTK0001) STANDBY
(Permitted)
INSTANCE(kutmqm003) MODE(Active)
INSTANCE(kutmqm004) MODE(Standby)
No FFDC. The only channel error is for the passive server kutmqm004, which is OK as it will only accept connections once the Qmgr fails over to that server.
Qmgr error log:
------------------------------------------------
AMQ9202: Remote host 'kutmqm004' not available, retry later.
EXPLANATION:
The attempt to allocate a conversation using TCP/IP to host
'kutmqm004' for channel MQTK7001.MQTK0001 was not successful.
However the error may be a transitory one and it may be possible
to successfully allocate a TCP/IP conversation later.
In some cases the remote host cannot be determined and so is
shown as '????'.
ACTION:
Try the connection again later. If the failure persists, record
the error values and contact your systems administrator. The return
code from TCP/IP is 11001 (X'0'). The reason for the failure may be
that this host cannot reach the destination host. It may also be
possible that the listening program at host 'kutmqm004' was not
running. If this is the case, perform the relevant operations to
start the TCP/IP listening program, and try again.
------------------------------------------------------------------------
AMQ9999: Channel 'MQTK7001.MQTK0001' to host 'kutmqm004(1414)' ended
abnormally.
EXPLANATION:
The channel program running under process ID 4600(164) for channel
'MQTK7001.MQTK0001' ended abnormally. The host name is
'kutmqm004(1414)'; in some cases the host name cannot be determined
and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error
logs to determine the cause of the failure. Note that this message
can be excluded completely or suppressed by tuning the
"ExcludeMessage" or "SuppressMessage" attributes under the
"QMErrorLog" stanza in qm.ini. Further information can be found in
the System Administration Guide
------------------------------------------------
AMQ9202: Remote host 'kutmqm004' not available, retry later.
EXPLANATION:
The attempt to allocate a conversation using TCP/IP to host
'kutmqm004' for channel MQTK7001.MQTK0001 was not successful.
However the error may be a transitory one and it may be possible
to successfully allocate a TCP/IP conversation later.
In some cases the remote host cannot be determined and so is
shown as '????'.
ACTION:
Try the connection again later. If the failure persists, record
the error values and contact your systems administrator. The return
code from TCP/IP is 11001 (X'0'). The reason for the failure may be
that this host cannot reach the destination host. It may also be
possible that the listening program at host 'kutmqm004' was not
running. If this is the case, perform the relevant operations to
start the TCP/IP listening program, and try again.
------------------------------------------------------------------------
AMQ9999: Channel 'MQTK7001.MQTK0001' to host 'kutmqm004(1414)' ended
abnormally.
EXPLANATION:
The channel program running under process ID 4600(164) for channel
'MQTK7001.MQTK0001' ended abnormally. The host name is
'kutmqm004(1414)'; in some cases the host name cannot be determined
and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error
logs to determine the cause of the failure. Note that this message
can be excluded completely or suppressed by tuning the
"ExcludeMessage" or "SuppressMessage" attributes under the
"QMErrorLog" stanza in qm.ini. Further information can be found in
the System Administration Guide.
------------------------------------------------
Any feedback is appreciated...
Cheers,
Z |
|
|
|
zaldyman3 |
Posted: Tue Oct 29, 2013 7:18 pm Post subject: |
|
|
Novice
Joined: 09 Jan 2013 Posts: 10
|
In addition...
To be able to start the channel, I need to run
telnet kutmqm003 1414
from a command prompt; then the sender channel starts.
Without the telnet it will not start, even if I stop/reset/start the sender channel.
cheers,
Z |
|
|
|
exerk |
Posted: Wed Oct 30, 2013 1:32 am Post subject: |
|
|
Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
Windows TCP/IP error code 11001 is WSAHOST_NOT_FOUND (host not found, i.e. a DNS error), which should give you a clue, especially as the channel works from one server and not from the other.
And I'm a little confused: If you are having to telnet to your 'working' server for the SDR to start, from where are you initiating that telnet? I suspect you have deeper issues within your network set-up that need resolving. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
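To see which host in the CONNAME list is failing name resolution, something along these lines may help. It is only a rough sketch in Python: the helper names `parse_conname` and `check_resolution` are made up for illustration, and the script simply asks the operating system resolver through `socket.getaddrinfo`, so it says nothing about how MQ itself performs the lookup.

```python
import re
import socket


def parse_conname(conname):
    """Split an MQ CONNAME list such as 'hostA(1414),hostB(1414)' into (host, port) pairs."""
    pairs = []
    for entry in conname.split(","):
        match = re.fullmatch(r"\s*([^()\s]+)\s*(?:\((\d+)\))?\s*", entry)
        if match is None:
            raise ValueError(f"unrecognised CONNAME entry: {entry!r}")
        # 1414 is the default MQ listener port when none is given.
        pairs.append((match.group(1), int(match.group(2) or 1414)))
    return pairs


def check_resolution(conname):
    """Map each CONNAME host to its resolved IP addresses, or to an error description."""
    results = {}
    for host, port in parse_conname(conname):
        try:
            infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # On Windows, errno 11001 corresponds to WSAHOST_NOT_FOUND.
            results[host] = f"lookup failed: errno={exc.errno} ({exc.strerror})"
    return results


if __name__ == "__main__":
    for host, outcome in check_resolution("kutmqm003(1414),kutmqm004(1414)").items():
        print(host, "->", outcome)
```

Running it straight after a reboot and again once the channel is running would show whether the resolver result changes between the two states.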
|
|
|
JosephGramig |
Posted: Wed Oct 30, 2013 5:20 am Post subject: |
|
|
Grand Master
Joined: 09 Feb 2006 Posts: 1237 Location: Gold Coast of Florida, USA
|
Also, always use a fully qualified DNS name and not short names, otherwise you could experience DNS resolution issues. Not that this is your problem, but ensure you set the following two attributes in the CHANNELS stanza of qm.ini:
Code: |
CHANNELS:
AdoptNewMCACheck=ALL
AdoptNewMCA=ALL |
|
|
|
|
zaldyman3 |
Posted: Thu Nov 21, 2013 3:13 pm Post subject: |
|
|
Novice
Joined: 09 Jan 2013 Posts: 10
|
Hi exerk,
The telnet was run on the same server where the sender channel is not starting.
Hi JosephGramig,
I have tried the full DNS entry and also the IP address... they all have the same outcome after a reboot.
IBM has given me a script to verify that the DNS name is being resolved, and it is resolved when the channel is running.
BUT if the IBM script is run right after the reboot, I get the error that exerk posted...
Windows TCP/IP Error Code 11001 WSAHOST_NOT_FOUND -- Host not found. (DNS error.)
I have pushed this back to our server team and network support to investigate further. I will update once I get something from them or IBM.
Thanks again for the input guys. |
|
|
|
ladyrodgers |
Posted: Wed Feb 26, 2014 11:03 am Post subject: |
|
|
Newbie
Joined: 26 Feb 2014 Posts: 1
|
We are seeing the same thing with WebSphere 8.5 running on Windows 2012: host not found until we "ping" the server name, and then it resolves.
Did you find a resolution for this?
I've been working with IBM support as well as MS support, but nothing so far... |
|
|
|
gbaddeley |
Posted: Wed Feb 26, 2014 2:14 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
Looks like there is something wrong with the TCP / WinSock stack on that server. Is it not initializing properly after reboot, or can't it contact its DNS? _________________ Glenn |
|
|
|
zaldyman3 |
Posted: Mon Mar 03, 2014 4:47 pm Post subject: |
|
|
Novice
Joined: 09 Jan 2013 Posts: 10
|
From IBM: MQ makes a call to getaddrinfo(), which is the Windows API that actually resolves the hostname and retrieves the IP address. getaddrinfo() returns error code 11001, which means WSAHOST_NOT_FOUND.
They sent us a small program to run to check this. Our server team found that the failure was caused by a load balancer IP, so we have turned that off.
Now, every time we reboot the server and run the getaddrinfo program from IBM, the IP is resolved correctly.
BUT the channel still goes into retry after the reboot and still needs the telnet before it starts.
Another test the server guys did was to make the MQ service dependent on another service. The last service that the server starts in our environment is VMware, so now MQ is the last one to start on the server, making sure that TCP and Winsock are already running.
OUTCOME of this dependency test: after rebooting the server, the channel is running.
I will get some tracing again, repeat this and send it to IBM. Could it be that on Windows Server 2012 MQ starts so quickly that the rest of the comms is not yet up? Or does MQ now need a dependency before the service starts?
I will post again if I get something out of IBM from the new trace that I will send. |
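One way to test the start-order theory without touching service dependencies might be a small gate that waits until the remote host resolves before anything starts the channel. The sketch below is an assumption-laden illustration in Python, not anything supplied by IBM: `wait_for_dns` and its parameters are hypothetical names, and the resolver, sleep and clock are injectable only so the logic can be exercised without a network.

```python
import socket
import time


def wait_for_dns(host, port=1414, timeout=120.0, interval=5.0,
                 resolve=socket.getaddrinfo, sleep=time.sleep,
                 clock=time.monotonic):
    """Retry name resolution until it succeeds; return the number of attempts made.

    Raises TimeoutError if the host is still unresolvable once `timeout`
    seconds have passed.
    """
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            resolve(host, port)
            return attempts
        except socket.gaierror:
            if clock() >= deadline:
                raise TimeoutError(
                    f"{host} still unresolvable after {attempts} attempts")
            sleep(interval)


if __name__ == "__main__":
    tries = wait_for_dns("kutmqm003")
    print(f"kutmqm003 resolved after {tries} attempt(s)")
```

If the first attempt after a reboot fails and a later one succeeds with no manual telnet or ping in between, that would point towards timing; if every attempt fails until someone pings the host, the cause is more likely something cached in the resolver.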
|
|
|
pjdf5133 |
Posted: Thu Sep 18, 2014 4:00 am Post subject: Windows 2012 MQ 7.5, 7.5.0.1 or 7.5.0.2 configuration |
|
|
Newbie
Joined: 18 Sep 2014 Posts: 2
|
Did anyone ever get a resolution to this? This sounds very similar to an issue we are seeing. Basically, on Windows 2012 (non R2) we see this 11001 failure out of getaddrinfo(). If we ping the host, it resolves the issue for a few hours, then it starts failing again.
Network traces show that when it is in the failure state, it only performs NetBIOS name queries to try to resolve the host, completely bypassing DNS.
Once we ping the host, it starts doing DNS lookups again and works as expected.
Thx |
|
|
|
exerk |
Posted: Thu Sep 18, 2014 4:10 am Post subject: Re: Windows 2012 MQ 7.5, 7.5.0.1 or 7.5.0.2 configuration |
|
|
Jedi Council
Joined: 02 Nov 2006 Posts: 6339
|
pjdf5133 wrote: |
Did anyone ever get a resolution to this? This sounds very similar to an issue we are seeing. Basically, on Windows 2012 (non R2) we see this 11001 failure out of getaddrinfo(). If we ping the host, it resolves the issue for a few hours, then it starts failing again.
Network traces show that when it is in the failure state, it only performs NetBIOS name queries to try to resolve the host, completely bypassing DNS.
Once we ping the host, it starts doing DNS lookups again and works as expected.
Thx |
zaldyman3 wrote "...our server team found that it is caused by a load balancer ip so we have turn that off...", so worth talking with your network people to see if you have the same issue. _________________ It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys. |
|
|
|
pjdf5133 |
Posted: Mon Dec 22, 2014 3:01 pm Post subject: MQ DNS Windows 2012 Solved |
|
|
Newbie
Joined: 18 Sep 2014 Posts: 2
|
Our issue turned out to be failed DNS queries resulting in negative cache entries. How and why DNS is failing is still a mystery, as this seems to be affecting only MQ boxes. With the way IBM is calling getaddrinfo(), if a negative cache entry exists it will not perform an over-the-wire DNS query until the negative cache entry expires, and meanwhile the lookup falls through to NetBIOS. If they provided hints.ai_flags = AI_ADDRCONFIG it would perform an over-the-wire DNS query every time.
The only workable solutions for us were:
1. Use FQDNs. This is a big change across a lot of servers.
2. Add the registry value MaxNegativeCacheTtl with a value of 0 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters. This is the option we took.
Hope this helps someone else out. |
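For anyone who has to roll option 2 out to several servers, the registry value can be written as a .reg file instead of being typed into Regedit by hand. The Python below is only a sketch that builds the file text: the function name `negative_cache_reg` is made up for illustration, the key path and value name are the ones quoted above, and importing the file (and whatever service restart or reboot your environment needs for it to take effect) is left to you.

```python
def negative_cache_reg(ttl_seconds=0):
    """Build .reg file text that sets MaxNegativeCacheTtl to ttl_seconds (a DWORD)."""
    if not 0 <= ttl_seconds <= 0xFFFFFFFF:
        raise ValueError("ttl_seconds must fit in an unsigned 32-bit DWORD")
    key = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
           r"\Services\Dnscache\Parameters")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"MaxNegativeCacheTtl"=dword:{ttl_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(negative_cache_reg(0))
```

A value of 0 disables negative caching as described above; passing a small non-zero number instead would shorten the cache lifetime rather than remove it.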
|
|
|
jtfries |
Posted: Sat May 06, 2017 8:40 am Post subject: Re: MQ DNS Windows 2012 Solved |
|
|
Newbie
Joined: 04 Apr 2012 Posts: 7
|
Hi -- Sorry to raise such an old thread, but do you have any references which show that setting "hints.ai_flags = AI_ADDRCONFIG" will ignore a negative DNS cache entry and go out over the wire to the DNS server every time?
The documentation I have read for AI_ADDRCONFIG says that it keeps the lookup from looking up A records if no IPv4 addresses are configured on the system and from looking up AAAA records if no IPv6 addresses are configured, but I haven't seen any discussion about bypassing negative DNS cache entries.
I have seen some documentation that AI_ADDRCONFIG can mess up localhost-type lookups in certain situations, and one site claimed that Windows by default acts as if AI_ADDRCONFIG were always set (although in a way that does not break localhost lookups).
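A quick way to compare the two behaviours on an affected box might be to run the same lookup with and without the flag. The following Python is a sketch only: `resolve` and `compare_flags` are illustrative names, and Python's `socket.getaddrinfo` passes its `flags` argument through to the platform getaddrinfo(), so this shows what the operating system returns and proves nothing about negative cache handling by itself.

```python
import socket


def resolve(host, port=1414, flags=0):
    """Return (True, [ips]) on success or (False, error text) on a resolver failure."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
    except socket.gaierror as exc:
        return False, f"errno={exc.errno}: {exc.strerror}"
    return True, sorted({info[4][0] for info in infos})


def compare_flags(host, port=1414):
    """Run the same lookup with default hints and with AI_ADDRCONFIG set."""
    return {
        "default": resolve(host, port),
        "AI_ADDRCONFIG": resolve(host, port, socket.AI_ADDRCONFIG),
    }


if __name__ == "__main__":
    for label, outcome in compare_flags("kutmqm003").items():
        print(label, "->", outcome)
```

Pairing this with a network trace while the box is in the failing state would show whether the flagged call really goes out to the DNS server.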
I would love to be able to achieve the behavior you described. The default timeout for negative DNS cache entries is quite high and has been a source of confusion before. _________________ Justin Fries
IBM Corporation
RTP, North Carolina
The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. |
|
|
|
gbaddeley |
Posted: Sun May 07, 2017 4:42 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
We had a similar issue on Win 2012 with apparent DNS failures, where a channel remained in Retrying state instead of Running.
Microsoft suggested making the following registry change to resolve the issue.
Start > Run > Regedit
Navigate to: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters
Add a DWORD named MaxNegativeCacheTtl and give it a value of zero
i.e.
Name = MaxNegativeCacheTtl
Value = 0 _________________ Glenn |
|
|
|
|