This pattern provides a reusable way to run a multi-threaded transformation process on input data that arrives in sequence from a single-threaded source. The most common example of this requirement is reading input data from a file, but the same pattern can be applied to many other protocols.
When reading data from an input file record by record using a FileInput node, WebSphere Message Broker uses a single thread to process a single file. Configuring additional instances on the FileInput node allows many files to be processed simultaneously, but each individual file is still processed by one thread from beginning to end. Where a single file is very large, or where each record requires a complex transformation, this can lead to performance problems. This pattern decouples the process of reading the records from the logic required to transform each record. The advantage of this approach is that the transformation process can be made multi-threaded by placing the logic in a message flow that uses multiple instances. In many cases this design pattern provides a significant performance improvement over the equivalent single-threaded message flow containing both the FileInput node and the transformation logic.
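The decoupling described above can be illustrated outside the broker with a minimal Python sketch. It is an in-process analogy only, not Message Broker code: a single reader hands records to a queue, and a configurable number of worker threads transform them. The names `read_records`, `transform_worker`, and `run_pipeline` are hypothetical and chosen for illustration. In CPython, threads do not speed up CPU-bound work, so the sketch shows the structure of the pattern rather than its performance.

```python
import queue
import threading


def read_records(lines, work_queue, worker_count):
    """Single-threaded reader: enqueue each record with its record number."""
    for number, line in enumerate(lines, start=1):
        work_queue.put((number, line))
    # One stop marker per worker so every worker thread terminates.
    for _ in range(worker_count):
        work_queue.put(None)


def transform_worker(work_queue, results, lock, transform):
    """Worker: transform records until a stop marker is received."""
    while True:
        item = work_queue.get()
        if item is None:
            break
        number, record = item
        output = transform(record)
        with lock:
            results[number] = output


def run_pipeline(lines, transform, worker_count=4):
    """Read sequentially, transform in parallel, return results in record order."""
    work_queue = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(
            target=transform_worker,
            args=(work_queue, results, lock, transform),
        )
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    read_records(lines, work_queue, worker_count)
    for worker in workers:
        worker.join()
    # Record numbers restore the original order after parallel processing.
    return [results[number] for number in sorted(results)]
```

Calling `run_pipeline(["a", "b", "c"], str.upper, worker_count=2)` returns the transformed records in their original order, because each record carries its record number through the workers.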
The pattern generates three message flows: Splitter_SingleThread, Transformer_MultiThread, and Adder_SingleThread.
The first and last flows are deliberately single-threaded. The middle flow is intended to be multi-threaded, with a configurable number of threads set when the pattern instance is generated. The intention of the pattern is to improve the performance of the end-to-end process. The single-threaded components of the pattern are intentionally very simple. It is strongly advised not to add any logic to these message flows, as this could degrade the performance of the pattern. If a developer uses the default parameters, the pattern treats each record as BLOB data in the Splitter_SingleThread and Adder_SingleThread message flows, and the Splitter_SingleThread flow reads each line of an input file as a distinct record. However, when transformation logic is added to the Transformer_MultiThread message flow, a developer may wish to use alternative parser settings on that flow.
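The default division of work can be sketched in Python as follows. This is an analogy under stated assumptions, not broker code: the splitter treats each line as opaque bytes (comparable to BLOB handling), and only the transformer parses the record. The functions `split_records` and `transform_record`, and the comma-separated record layout, are hypothetical.

```python
import io


def split_records(stream):
    """Yield each line of a binary stream as opaque bytes, without parsing it."""
    for raw in stream:
        # Remove only the line terminator; the record content is left untouched.
        yield raw.rstrip(b"\r\n")


def transform_record(record: bytes) -> bytes:
    """Hypothetical transformation: parse a comma-separated record and
    upper-case its first field. Parsing happens only at this stage."""
    fields = record.decode("utf-8").split(",")
    fields[0] = fields[0].upper()
    return ",".join(fields).encode("utf-8")


if __name__ == "__main__":
    source = io.BytesIO(b"alpha,1\r\nbeta,2\n")
    for record in split_records(source):
        print(transform_record(record))
```

Keeping the splitter free of parsing mirrors the advice above: the single-threaded stages stay cheap, and the cost of interpreting each record falls on the stage that can run in parallel.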
Splitter_SingleThread adds an MQMD header to the front of each record that is read from the input file. The MQMD GroupId is given a value based on the name of the input file and the time the file was last modified. The record number of each record read from the file is assigned to the MsgSeqNumber field of the MQMD. Each message is flagged as a member of a WMQ group. Once all the records have been read from the input file, an End Of Data propagation is used to create an additional message, which is flagged as the last member of the WMQ group. In the final message flow, Adder_SingleThread, the arrival of this message is used in conjunction with the Finish File terminal of the FileOutput node to close the final output file.
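The grouping scheme above can be modelled with a short Python sketch. It does not use any MQ API, and several details are assumptions made for illustration: the GroupId is derived by hashing the file name and modification time and truncating to 24 bytes (the length of the MQMD GroupId field), the End Of Data message is given the next sequence number after the last record, and the adder waits until every sequence number is present before producing output. The source does not state how the pattern itself handles these points. All names (`GroupedMessage`, `make_group_id`, `split_file`, `assemble_output`) are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class GroupedMessage:
    group_id: bytes        # stands in for MQMD GroupId
    msg_seq_number: int    # stands in for MQMD MsgSeqNumber
    last_in_group: bool    # stands in for the "last message in group" flag
    payload: bytes


def make_group_id(file_name: str, last_modified: float) -> bytes:
    """Assumed derivation: hash file name and modification time to 24 bytes."""
    seed = f"{file_name}|{last_modified}".encode("utf-8")
    return hashlib.sha256(seed).digest()[:24]


def split_file(file_name, last_modified, records):
    """Tag each record with group metadata, then append an end-of-data marker."""
    group_id = make_group_id(file_name, last_modified)
    messages = [
        GroupedMessage(group_id, number, False, record)
        for number, record in enumerate(records, start=1)
    ]
    # Assumption: the end-of-data message takes the next sequence number.
    messages.append(GroupedMessage(group_id, len(records) + 1, True, b""))
    return messages


def assemble_output(messages):
    """Return the output bytes once the whole group has arrived, else None."""
    last = [m for m in messages if m.last_in_group]
    if not last:
        return None  # end-of-data marker not seen yet
    final_seq = last[0].msg_seq_number
    by_seq = {m.msg_seq_number: m for m in messages}
    if any(number not in by_seq for number in range(1, final_seq + 1)):
        return None  # some records are still in flight
    # Write records in sequence order; the marker itself carries no data.
    return b"".join(by_seq[n].payload + b"\n" for n in range(1, final_seq))
```

Because every message carries its sequence number, `assemble_output` produces the records in file order even if the multi-threaded stage delivers them out of order.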