Intercommunication

Channel control function

The channel control function provides facilities for you to define, monitor, and control channels. Commands are issued through panels, programs, or from a command line to the channel control function. The panel interface also displays channel status and channel definition data.

Note: For the channel control function on WebSphere MQ for UNIX and Windows, and MQSeries for OS/2 Warp, Compaq OpenVMS Alpha, and Compaq NonStop Kernel, you can use Programmable Command Formats or those WebSphere MQ commands (MQSC) and control commands that are detailed in Chapter 8, Monitoring and controlling channels on distributed platforms.

The commands fall into the following groups:

Channel administration
Channel control
Channel status monitoring

Channel administration commands deal with the definitions of the channels. They enable you to:

Create a channel definition
Copy a channel definition
Alter a channel definition
Delete a channel definition
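
As a minimal sketch, the four administration tasks above might look as follows in MQSC. The channel names, connection name, and transmission queue name are hypothetical; adapt the channel type and attributes to your own definitions.

```
* Create a sender channel (all names here are hypothetical)
DEFINE CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) TRPTYPE(TCP) +
       CONNAME('qm2.example.com(1414)') XMITQ(QM2)
* Copy the definition under a new name
DEFINE CHANNEL(QM1.TO.QM2.ALT) CHLTYPE(SDR) LIKE(QM1.TO.QM2)
* Alter one attribute of the original definition
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) DISCINT(300)
* Delete the copy
DELETE CHANNEL(QM1.TO.QM2.ALT)
```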

Channel control commands manage the operation of the channels. They enable you to:

Start a channel
Stop a channel
Re-synchronize with partner (in some implementations)
Reset message sequence numbers
Resolve an in-doubt batch of messages
Ping; send a test communication across the channel
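
On platforms that accept MQSC, most of the control tasks above correspond to commands like those in the following sketch. QM1.TO.QM2 is a hypothetical sender channel, and the sequence number is an arbitrary example. Resynchronization, which exists only in some implementations, is not shown.

```
* Start and stop the channel
START CHANNEL(QM1.TO.QM2)
STOP CHANNEL(QM1.TO.QM2)
* Send a test communication across the channel
* (issued from the sending end while the channel is not running)
PING CHANNEL(QM1.TO.QM2)
* Reset the message sequence number
RESET CHANNEL(QM1.TO.QM2) SEQNUM(1)
* Resolve an in-doubt batch of messages
RESOLVE CHANNEL(QM1.TO.QM2) ACTION(COMMIT)
```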

Channel monitoring displays the state of channels, for example:

Current channel settings
Whether the channel is active or inactive
Whether the channel terminated in a synchronized state
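
The following sketch shows how this information might be requested with MQSC, assuming a hypothetical channel named QM1.TO.QM2. Status is reported only for channels that have an entry in the channel status table.

```
* Current channel settings (the definition)
DISPLAY CHANNEL(QM1.TO.QM2) ALL
* Current state of the channel, for example RUNNING or RETRYING
DISPLAY CHSTATUS(QM1.TO.QM2) STATUS
* Saved status, including whether a batch is in doubt
DISPLAY CHSTATUS(QM1.TO.QM2) SAVED INDOUBT
```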

Preparing channels

Before trying to start a message channel or MQI channel, you must make sure that all the attributes of the local and remote channel definitions are correct and compatible. Chapter 6, Channel attributes describes the channel definitions and attributes.

Although you set up explicit channel definitions, the channel negotiations carried out when a channel starts may override one or other of the values defined. This behavior is normal and transparent; it allows otherwise incompatible definitions to work together.

Auto-definition of receiver and server-connection channels

In WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, and Windows systems, and MQSeries V5.1 for OS/2 Warp, if there is no appropriate channel definition, then for a receiver or server-connection channel that has auto-definition enabled, a definition is created automatically. The definition is created using:

The appropriate model channel definition, SYSTEM.AUTO.RECEIVER or SYSTEM.AUTO.SVRCONN. The model channel definitions for auto-definition are the same as the system defaults, SYSTEM.DEF.RECEIVER and SYSTEM.DEF.SVRCONN, except for the description field, which is "Auto-defined by" followed by 49 blanks. The systems administrator can choose to change any part of the supplied model channel definitions.
Information from the partner system. The partner's values are used for the channel name and the sequence number wrap value.
A channel exit program, which you can use to alter the values created by the auto-definition. See Channel auto-definition exit program.

The description is then checked to determine whether it has been altered by an auto-definition exit or because the model definition has been changed. If the first 44 characters are still "Auto-defined by" followed by 29 blanks, the queue manager name is added. If the final 20 characters are still all blanks the local time and date are added.

Once the definition has been created and stored the channel start proceeds as though the definition had always existed. The batch size, transmission size, and message size are negotiated with the partner.
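
A minimal MQSC sketch of enabling auto-definition follows. It assumes that the queue manager attribute CHAD controls auto-definition on your platform; the BATCHSZ value is only an illustration of changing a supplied model definition.

```
* Enable auto-definition of receiver and server-connection channels
ALTER QMGR CHAD(ENABLED)
* Optionally change the model that auto-defined receivers are copied from
ALTER CHANNEL(SYSTEM.AUTO.RECEIVER) CHLTYPE(RCVR) BATCHSZ(25)
* Confirm the queue manager setting
DISPLAY QMGR CHAD
```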

Defining other objects

Before a message channel can be started, both ends must be defined (or enabled for auto-definition) at their respective queue managers. The transmission queue it is to serve must be defined to the queue manager at the sending end, and the communication link must be defined and available. In addition, it may be necessary for you to prepare other WebSphere MQ objects, such as remote queue definitions, queue manager alias definitions, and reply-to queue alias definitions, so as to implement the scenarios described in Chapter 2, Making your applications communicate.
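
As an illustration, the objects at the sending end might be prepared with MQSC along the following lines. The queue names and the remote queue manager name QM2 are hypothetical.

```
* Transmission queue that the sending channel serves
DEFINE QLOCAL(QM2) USAGE(XMITQ)
* Remote queue definition that routes messages through that queue
DEFINE QREMOTE(PAYROLL.REMOTE) RNAME(PAYROLL) +
       RQMNAME(QM2) XMITQ(QM2)
```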

For information about MQI channels, see the WebSphere MQ Clients book.

Multiple message channels per transmission queue

It is possible to define more than one channel per transmission queue, but only one of these channels can be active at any one time. Consider doing this to provide alternative routes between queue managers, both for traffic balancing and for corrective action after a link failure.

Starting a channel

A channel can be caused to start transmitting messages in one of four ways. It can be:

Started by an operator (not receiver, cluster-receiver or server-connection channels).
Triggered from the transmission queue (sender, and fully-qualified server channels only). You will need to prepare the necessary objects for triggering channels.
Started from an application program (not receiver, cluster-receiver or server-connection channels).
Started remotely from the network by a sender, cluster-sender, requester, server, or client-connection channel. Receiver, cluster-receiver, and possibly server and requester channel transmissions are started this way; so are server-connection channels. The channels themselves must already be started (that is, enabled).

Note: A channel that is 'started' is not necessarily transmitting messages; rather, it is 'enabled' to start transmitting when one of the four events described above occurs. Channels are enabled and disabled using the START and STOP operator commands.
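
For the triggered case, the transmission queue carries the triggering attributes. The sketch below assumes that a transmission queue named QM2 and a sender channel named QM1.TO.QM2 already exist (both names are hypothetical) and that the channel initiator is monitoring SYSTEM.CHANNEL.INITQ.

```
* Trigger the channel when the first message arrives on the queue
ALTER QLOCAL(QM2) TRIGGER TRIGTYPE(FIRST) +
      INITQ(SYSTEM.CHANNEL.INITQ) TRIGDATA(QM1.TO.QM2)
```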

Channel states

Figure 29 shows the hierarchy of all possible channel states, and Figure 30 shows the links between them. These apply to all types of message channel. On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, and Windows systems, and MQSeries V5.1 for OS/2 Warp, these states apply also to server-connection channels.

Figure 29. Channel states

The diagram shows the hierarchy of channel states. At the top level, a channel is inactive or current. A current channel can be stopped, starting, retrying, or active. An active channel can be initializing, binding, requesting, running, paused, or stopping.

Current and active

The channel is "current" if it is in any state other than inactive. A current channel is "active" unless it is in RETRYING, STOPPED, or STARTING state.

Figure 30. Flows between channel states

The diagram shows the flows between channel states. A stopped channel can be started, and becomes inactive. A start command, trigger, remote initiation, or a channel initiator places the channel in the initializing state. The channel moves into starting state, and then binding state, while it establishes session and initial data exchange. If the status is OK, the channel state becomes running. The channel can be placed into a paused state while waiting for message-retry interval, or a stopping state after an error, a STOP request, or if a disconnect interval expires. The channel could then move into a retrying state, or back to the stopped state.

Notes:

When a channel is in one of the six states highlighted in Figure 30 (INITIALIZING, BINDING, REQUESTING, RUNNING, PAUSED, or STOPPING), it is consuming resource and a process or thread is running; the channel is active. (INITIALIZING occurs on z/OS and on WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, and Windows systems, and MQSeries V5.1 for Compaq Tru64 UNIX, and OS/2 Warp. PAUSED does not occur on z/OS.)
When a channel is in STOPPED state, the session may be active because the next state is not yet known.

Specifying the maximum number of current channels

You can specify the maximum number of channels that can be current at one time. This is the number of channels that have entries in the channel status table, including channels that are retrying and channels that are disabled (that is, stopped). Specify this in the channel initiator parameter module for z/OS, the queue manager initialization file for iSeries, the queue manager configuration file for OS/2, Compaq NonStop Kernel, and UNIX systems, or the registry for Windows. For more information about the values you set using the initialization or the configuration file see Appendix C, Configuration file stanzas for distributed queuing. For more information about specifying the maximum number of channels, see the WebSphere MQ System Administration Guide for WebSphere MQ for AIX, HP-UX, Linux, Solaris, and Windows systems, and MQSeries V5.1 for Compaq Tru64 UNIX, and OS/2 Warp, the WebSphere MQ for iSeries V5.3 System Administration book for WebSphere MQ for iSeries, or the WebSphere MQ for z/OS Concepts and Planning Guide for information relating to your platform.

Notes:

On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS, and MQSeries V5.1 for Compaq Tru64 UNIX, and OS/2 Warp, server-connection channels are included in this number.
A channel must be current before it can become active. If a channel is started, but cannot become current, the start fails.
If you are using CICS for distributed queuing on z/OS, you cannot specify the maximum number of channels.

Specifying the maximum number of active channels

You can also specify the maximum number of active channels (except on WebSphere MQ for z/OS using CICS). You can do this to prevent your system being overloaded by a large number of starting channels. If you use this method, you should set the disconnect interval attribute to a low value to allow waiting channels to start as soon as other channels terminate.

Each time a channel that is retrying attempts to establish connection with its partner, it must become an active channel. If the attempt fails, it remains a current channel that is not active, until it is time for the next attempt. The number of times that a channel will retry, and how often, is determined by the retry count and retry interval channel attributes. There are short and long values for both these attributes. See Chapter 6, Channel attributes for more information.
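
The short and long retry attributes can be set together on the channel definition. In this sketch the channel name is hypothetical and the values are examples only: ten attempts at 60-second intervals, followed by 100 attempts at 1200-second intervals.

```
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) +
      SHORTRTY(10) SHORTTMR(60) +
      LONGRTY(100) LONGTMR(1200)
```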

When a channel has to become an active channel (because a START command has been issued, or because it has been triggered, or because it is time for another retry attempt), but is unable to do so because the number of active channels is already at the maximum value, the channel waits until one of the active slots is freed by another channel instance ceasing to be active. If, however, a channel is starting because it is being initiated remotely, and there are no active slots available for it at that time, the remote initiation is rejected.

Whenever a channel, other than a requester channel, is attempting to become active, it goes into the STARTING state. This is true even if there is an active slot immediately available, although in this case it will only be in STARTING state for a very short time. However, if the channel has to wait for an active slot, it is in STARTING state while it is waiting.

Requester channels do not go into STARTING state. If a requester channel cannot start because the number of active channels is already at the limit, the channel ends abnormally.

Whenever a channel, other than a requester channel, is unable to get an active slot, and so waits for one, a message is written to the log or the z/OS console, and an event is generated. When a slot is subsequently freed and the channel is able to acquire it, another message and event are generated. None of these messages or events is generated if the channel is able to acquire a slot straight away.

If a STOP CHANNEL command is issued while the channel is waiting to become active, the channel goes to STOPPED state. A Channel-Stopped event is raised as usual.

On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS, and MQSeries V5.1 for OS/2 Warp, server-connection channels are included in the maximum number of active channels.

For more information about specifying the maximum number of active channels, see the WebSphere MQ System Administration Guide for WebSphere MQ for AIX, iSeries, HP-UX, Solaris, and Windows systems, and MQSeries for Compaq Tru64 UNIX, and OS/2 Warp, the WebSphere MQ for iSeries V5.3 System Administration book for WebSphere MQ for iSeries, or the WebSphere MQ for z/OS Concepts and Planning Guide for the z/OS platform.

Channel errors

Errors on channels cause the channel to stop further transmissions. If the channel is a sender or server, it goes to RETRYING state because it is possible that the problem may clear itself. If it cannot go to RETRYING state, the channel goes to STOPPED state. For sending channels, the associated transmission queue is set to GET(DISABLED) and triggering is turned off. (A STOP command with STATUS(STOPPED) takes the side that issued it to STOPPED state; only expiry of the disconnect interval or a STOP command with STATUS(INACTIVE) will make it end normally and become inactive.) Channels that are in STOPPED state need operator intervention before they will restart (see Restarting stopped channels).

Note: For Compaq OpenVMS Alpha, OS/2 Warp, iSeries, UNIX systems, Compaq NonStop Kernel, and Windows systems, a channel initiator must be running for retry to be attempted. If the channel initiator is not available, the channel becomes inactive and must be manually restarted. If you are using a script to start the channel, ensure the channel initiator is running before you try to run the script. On platforms other than AIX, Compaq Tru64 UNIX, HP-UX, iSeries, OS/2 Warp, Solaris, and Windows systems, the channel initiator must be monitoring the initiation queue specified in the transmission queue that the channel is using.

Long retry count (LONGRTY) describes how retrying works. If the error clears, the channel restarts automatically, and the transmission queue is re-enabled. If the retry limit is reached without the error clearing, the channel goes to STOPPED state. A stopped channel must be restarted manually by the operator. If the error is still present, it does not retry again. When it does start successfully, the transmission queue is re-enabled.

On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS without CICS, and MQSeries V5.1 for OS/2 Warp, if the channel initiator or queue manager stops while a channel is in RETRYING or STOPPED status, the channel status is remembered when the channel initiator or queue manager is restarted.

On OS/2 Warp, Windows systems, iSeries, Compaq NonStop Kernel, and UNIX systems, if a channel is unable to put a message to the target queue because that queue is full or put inhibited, the channel can retry the operation a number of times (specified in the message-retry count attribute) at a given time interval (specified in the message-retry interval attribute). Alternatively, you can write your own message-retry exit that determines which circumstances cause a retry, and the number of attempts made. The channel goes to PAUSED state while waiting for the message-retry interval to finish.
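
A sketch of setting the message-retry attributes on the receiving end follows. The channel name is hypothetical and the values are examples: up to 10 retries, with the interval given in milliseconds.

```
* Retry a failed put up to 10 times, waiting 1000 ms between attempts
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(RCVR) MRRTY(10) MRTMR(1000)
```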

See Chapter 6, Channel attributes for information about the channel attributes, and Chapter 45, Channel-exit programs for information about the message-retry exit.

Checking that the other end of the channel is still available

In WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS without CICS, and MQSeries V5.1 for OS/2 Warp, you can use the heartbeat-interval channel attribute to specify that flows are to be passed from the sending MCA when there are no messages on the transmission queue. This is described in Heartbeat interval (HBINT).

In WebSphere MQ for z/OS, if you are using TCP as the transport protocol, you can also specify a value for the KeepAlive Interval channel attribute (KAINT). You can use this attribute to specify a time-out value for each channel. Give the KeepAlive interval a higher value than the heartbeat interval, and a smaller value than the disconnect interval. This is described in KeepAlive Interval (KAINT).
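
For example, on z/OS the three intervals might be set so that the heartbeat interval is the smallest and the disconnect interval the largest. The channel name and the values in this sketch are illustrative only.

```
* HBINT (300) < KAINT (360) < DISCINT (6000), all in seconds
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) +
      HBINT(300) KAINT(360) DISCINT(6000)
```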

In WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and MQSeries V5.1 for OS/2 Warp, if you are using TCP as your transport protocol, you can set keepalive=yes in the qm.ini file. If you specify this option, TCP periodically checks that the other end of the connection is still available, and if it is not, the channel is terminated.

In WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, and Windows systems, and MQSeries V5.1 for OS/2 Warp, if you are using TCP as your transport protocol, the receiving end of an inactive connection is also closed if no data is received for a period of time. This period of time is determined according to the HBINT (heartbeat interval) value.

The time-out value is set as follows:

For an initial number of flows, before any negotiation has taken place, the timeout is twice the HBINT value from the channel definition.
When the channels have negotiated a HBINT value, if HBINT is set to less than 60 seconds, the timeout is set to twice this value. If HBINT is set to 60 seconds or more, the timeout value is set to 60 seconds greater than the value of HBINT.

Notes:

If either of the above values is zero, there is no timeout.
For connections that do not support heartbeats, the HBINT value is negotiated to zero in the second step above and hence there is no timeout, so you must use TCP/IP KEEPALIVE.
For client connections, heartbeats are only flowed from the server when the client issues an MQGET call with wait; none are flowed during other MQI calls. Therefore, avoid setting the heartbeat interval too small for client channels. For example, if the heartbeat is set to ten seconds, an MQCMIT call will fail (with MQRC_CONNECTION_BROKEN) if it takes longer than twenty seconds to commit because no data will have been flowed during this time. This can happen with large units of work. However, it should not happen if appropriate values are chosen for the heartbeat interval because only MQGET with wait should take significant periods of time.
Aborting the connection after twice the heartbeat interval is valid because a data or heartbeat flow is expected at least every heartbeat interval. Setting the heartbeat interval too small, however, can cause problems, especially if you are using channel exits. For example, if the HBINT value is one second, and a send or receive exit is used, the receiving end will only wait for two seconds before aborting the channel. If the MCA is performing a task such as encrypting the message, this value might be too short.
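
The time-out rules above can be worked through for a few sample values. The sketch assumes that the HBINT value shown is also the value the two ends negotiate; the channel name is hypothetical.

```
* HBINT below 60 seconds: the timeout is twice the value.
* With HBINT(30) the receiving end waits 2 x 30 = 60 seconds.
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) HBINT(30)

* HBINT of 60 seconds or more: the timeout is the value plus 60.
* With HBINT(300) the receiving end waits 300 + 60 = 360 seconds.
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) HBINT(300)

* HBINT(0): no heartbeats are flowed and there is no timeout.
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) HBINT(0)
```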

If you have unreliable channels that are suffering from TCP errors, use of SO_KEEPALIVE will mean that your channels are more likely to recover.

You can specify time intervals to control the behavior of the SO_KEEPALIVE option. When you change the time interval, only TCP/IP channels started after the change are affected. The value that you choose for the time interval should be less than the value of the disconnect interval for the channel.

For more information about using the SO_KEEPALIVE option on z/OS, see WebSphere MQ for z/OS Concepts and Planning Guide . For other platforms, see the chapter about setting up communications for your platform in this manual.

Adopting an MCA

If a channel suffers a communications failure, the receiver channel could be left in a 'communications receive' state. When communications are re-established the sender channel attempts to reconnect. If the remote queue manager finds that the receiver channel is already running it does not allow another version of the same receiver channel to be started. Rectifying this problem requires user intervention or the use of system keepalive.

The Adopt MCA function solves the problem automatically. It enables WebSphere MQ to cancel a receiver channel and to start a new one in its place.

The function can be set up with various options. For more information see WebSphere MQ for z/OS System Setup Guide or the appropriate equivalent publication for your platforms.

Stopping and quiescing channels

Message channels are designed to be long-running connections between queue managers with orderly termination controlled only by the disconnect interval channel attribute. This mechanism works well unless the operator needs to terminate the channel before the disconnect time interval expires. This can occur in the following situations:

System quiesce
Resource conservation
Unilateral action at one end of a channel

In these cases, an operator command is provided to allow you to stop the channel. The command provided varies by platform, as follows:

For z/OS without CICS: The STOP CHANNEL MQSC command or the Stop a channel panel
For z/OS using CICS: The Stop option on the Message Channel List panel
For OS/2, Windows systems, Compaq OpenVMS Alpha, Compaq NonStop Kernel, and UNIX systems: The STOP CHANNEL MQSC or PCF command
For iSeries: ENDMQMCHL or the END option on the WRKMQMCHL panel
For VSE/ESA: The CLOSE command from the MQMMSC panel or MQCL transaction closes (rather than stops) the channel.

There are three options for stopping channels using these commands:

QUIESCE: The QUIESCE option attempts to end the current batch of messages before stopping the channel.
FORCE: The FORCE option attempts to stop the channel immediately and may require the channel to resynchronize when it restarts because the channel may be left in doubt.
TERMINATE: The TERMINATE option attempts to stop the channel immediately, and terminates the channel's thread or process.

Note that all of these options leave the channel in a STOPPED state, requiring operator intervention to restart it.
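
In MQSC, the option is selected with the MODE parameter of STOP CHANNEL, as in this sketch for a hypothetical channel named QM1.TO.QM2. Only one of the three commands would be issued in practice.

```
* End the current batch, then stop
STOP CHANNEL(QM1.TO.QM2) MODE(QUIESCE)
* Stop immediately; the channel may be left in doubt
STOP CHANNEL(QM1.TO.QM2) MODE(FORCE)
* Stop immediately and end the channel's thread or process
STOP CHANNEL(QM1.TO.QM2) MODE(TERMINATE)
```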

Stopping the channel at the sending end is quite effective but does require operator intervention to restart. At the receiving end of the channel, things are much more difficult because the MCA is waiting for data from the sending side, and there is no way to initiate an orderly termination of the channel from the receiving side; the stop command is pending until the MCA returns from its wait for data.

Consequently there are three recommended ways of using channels, depending upon the operational characteristics required:

If you want your channels to be long running, you should note that there can be orderly termination only from the sending end. When channels are interrupted, that is, stopped, operator intervention (a START CHANNEL command) is required in order to restart them.
If you want your channels to be active only when there are messages for them to transmit, you should set the disconnect interval to a fairly low value. Note that the default setting is quite high and so is not recommended for channels where this level of control is required. Because it is difficult to interrupt the receiving channel, the most economical option is to have the channel automatically disconnect and reconnect as the workload demands. For most channels, the appropriate setting of the disconnect interval can be established heuristically.
For WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS without CICS, and MQSeries V5.1 for OS/2 Warp, you can use the heartbeat-interval attribute to cause the sending MCA to send a heartbeat flow to the receiving MCA during periods in which it has no messages to send. This releases the receiving MCA from its wait state and gives it an opportunity to quiesce the channel without waiting for the disconnect interval to expire. Give the heartbeat interval a lower value than the value of the disconnect interval.
Notes:
1. You are advised to set the disconnect interval to a low value, or to use heartbeats, for server channels. This is to allow for the case where the requester channel ends abnormally (for example, because the channel was canceled) when there are no messages for the server channel to send. In this case, the server does not detect that the requester has ended (it will only do this the next time it tries to send a message to the requester). While the server is still running, it holds the transmission queue open for exclusive input in order to get any more messages that arrive on the queue. If an attempt is made to restart the channel from the requester, the start request receives an error because the server still has the transmission queue open for exclusive input. It is necessary to stop the server channel, and then restart the channel from the requester again.
2. On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS without CICS, and MQSeries V5.1 for Compaq Tru64 UNIX, and OS/2 Warp, server-connection channels can also be stopped like receiver channels.
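
As a sketch of the advice for server channels, the following command sets a low disconnect interval and an even lower heartbeat interval. The channel name and the values (in seconds) are hypothetical; suitable values depend on your workload.

```
* Disconnect after 120 idle seconds; flow a heartbeat every 30 seconds
ALTER CHANNEL(QM1.SERVER) CHLTYPE(SVR) DISCINT(120) HBINT(30)
```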

Restarting stopped channels

When a channel goes into STOPPED state (either because you have stopped the channel manually using one of the methods given in Stopping and quiescing channels, or because of a channel error) you have to restart the channel manually.

To do this, issue one of the following commands:

For WebSphere MQ for z/OS without CICS: The START CHANNEL MQSC command or the Start a channel panel
For WebSphere MQ for z/OS using CICS: The Start option on the Message Channel List panel
For WebSphere MQ for UNIX systems and Windows systems, and MQSeries for OS/2 Warp, Compaq OpenVMS Alpha, and Compaq NonStop Kernel: The START CHANNEL MQSC or PCF command
For WebSphere MQ for iSeries: The START command on the WRKMQMCHL panel, the STRMQMCHL command, or the START CHANNEL MQSC or PCF command
For MQSeries for VSE/ESA: The OPEN command from the MQMMSC panel or MQCL transaction opens (rather than restarts) the channel.

For sender or server channels, when the channel entered the STOPPED state, the associated transmission queue was set to GET(DISABLED) and triggering was set off. When the start request is received, these attributes are reset automatically. On WebSphere MQ for AIX, iSeries, HP-UX, Linux, Solaris, Windows systems, and z/OS without CICS, and MQSeries V5.1 for OS/2 Warp, if the channel initiator or queue manager stops while a channel is in RETRYING or STOPPED status, the channel status is remembered when the channel initiator or queue manager is restarted. On other platforms, if the channel initiator or queue manager is restarted the status is lost and you have to alter the queue attributes manually to re-enable triggering of the channel.

Note: If you are using CICS for distributed queuing on z/OS, these queue attributes are not reset automatically; you always have to alter them manually when you restart a channel.
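
Where MQSC is available and the queue attributes have to be altered manually, the sequence might resemble this sketch. The transmission queue QM2 and channel QM1.TO.QM2 are hypothetical names.

```
* Re-enable gets and triggering on the transmission queue
* (needed only where these attributes are not reset automatically)
ALTER QLOCAL(QM2) GET(ENABLED) TRIGGER
* Restart the stopped channel
START CHANNEL(QM1.TO.QM2)
```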

In-doubt channels

An in-doubt channel is a channel that is in doubt with a remote channel about which messages have been sent and received. Note the distinction between this and a queue manager being in doubt about which messages should be committed to a queue.

You can reduce the opportunity for a channel to be placed in doubt by using the Batch Heartbeat channel parameter (BATCHHB). When a value for this parameter is specified, a sender channel checks that the remote channel is still active before taking any further action. If no response is received, the receiver channel is considered to be no longer active. The messages can be rolled back and rerouted, and the sender channel is not put in doubt. This reduces the time when the channel could be placed in doubt to the period between the sender channel verifying that the receiver channel is still active, and verifying that the receiver channel has received the sent messages. See Chapter 6, Channel attributes for more information on the batch heartbeat parameter.
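
A batch heartbeat might be enabled on a sender channel as sketched below. The channel name is hypothetical, and the value, which is given in milliseconds, is only an example.

```
* Check that the receiver is still active before committing a batch
ALTER CHANNEL(QM1.TO.QM2) CHLTYPE(SDR) BATCHHB(1000)
```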

In-doubt channel problems are usually resolved automatically. Even when communication is lost, and a channel is placed in doubt with a message batch at the sender whose receipt status is unknown, the situation is resolved when communication is re-established. Sequence number and LUWID records are kept for this purpose. The channel is in doubt until LUWID information has been exchanged, and only one batch of messages can be in doubt for the channel.

You can, when necessary, resynchronize the channel manually. The term manual includes use of operators or programs that contain WebSphere MQ system management commands. The manual resynchronization process works as follows. This description uses MQSC commands, but you can also use the PCF equivalents.

Use the DISPLAY CHSTATUS command to find the last-committed logical unit of work ID (LUWID) for each side of the channel. Do this using the following commands:
- For the in-doubt side of the channel:
```
DISPLAY CHSTATUS(name) SAVED CURLUWID
```
  You can use the CONNAME and XMITQ parameters to further identify the channel.
- For the receiving side of the channel:
```
DISPLAY CHSTATUS(name) SAVED LSTLUWID
```
  You can use the CONNAME parameter to further identify the channel.
The commands are different because only the sending side of the channel can be in doubt. The receiving side is never in doubt.
On WebSphere MQ for iSeries, the DISPLAY CHSTATUS command can be executed from a file using the STRMQMMQSC command or the Work with MQM Channel Status CL command, WRKMQMCHST.
If the two LUWIDs are the same, the receiving side has committed the unit of work that the sender considers to be in doubt. The sending side can now remove the in-doubt messages from the transmission queue and re-enable it. This is done with the following channel RESOLVE command:
```
RESOLVE CHANNEL(name) ACTION(COMMIT)
```
If the two LUWIDs are different, the receiving side has not committed the unit of work that the sender considers to be in doubt. The sending side needs to retain the in-doubt messages on the transmission queue and re-send them. This is done with the following channel RESOLVE command:
```
RESOLVE CHANNEL(name) ACTION(BACKOUT)
```
On WebSphere MQ for iSeries, you can use the Resolve MQM Channel command, RSVMQMCHL.

Once this process is complete the channel is no longer in doubt. The transmission queue can now be used by another channel, if required.

Problem determination

There are two distinct aspects to problem determination:

Problems discovered when a command is being submitted
Problems discovered during operation of the channels
