RangaKovela
Posted: Sun Dec 13, 2015 9:39 pm    Post subject: Sequential Processing using fileInput node
Apprentice
Joined: 10 May 2011    Posts: 38
Hi Team,
Environment: IIB9 broker on Windows.
SFTP server is on Windows.
We have a requirement to process a batch of files generated by a backend system in sequential order (i.e. FIFO). A batch can have multiple files. All the files are placed in the IIB source directory, from which the FileInput node is polling using the move command.
I want to know whether the FileInput node is capable of picking up files in the order they were created by the backend system.
Thanks,
zpat
Posted: Sun Dec 13, 2015 11:04 pm
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
Use MQ, not files, if data integrity or sequence are important.
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.
RangaKovela
Posted: Sun Dec 13, 2015 11:35 pm
Apprentice
Joined: 10 May 2011    Posts: 38
Thanks for your response. The IIB source is a FileServer in this case. Can't change this now.
zpat
Posted: Mon Dec 14, 2015 3:47 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
That's just an excuse. Get the design right and these issues do not arise.
Vitor
Posted: Mon Dec 14, 2015 5:18 am
Grand High Poobah
Joined: 11 Nov 2005    Posts: 26093    Location: Texas, USA
RangaKovela wrote:
IIB source is FileServer in this case. Can't change this now.
Well, that's an awesome design decision by someone.
The FileInput node matches files based on what the underlying OS (in this case Windoze) tells it is available. This means:
a) the files could be picked up in any order
b) you can never scale the solution to run additional instances
c) you can't be sure (especially for larger files) that IIB will process the entire file, because you can't be sure the file was completely moved into the target directory before it showed up as available to the FileInput node
_________________
Honesty is the best policy.
Insanity is the best defence.
timber
Posted: Mon Dec 14, 2015 5:34 am
Grand Master
Joined: 25 Aug 2015    Posts: 1292
You could write a script that copies/moves the files into the IIB source directory one at a time, in the correct order. That might just about work if this is a daily batch job and performance is not important.
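A minimal sketch of such a mover script, assuming a staging directory feeding the IIB input directory; the directory names, the pacing delay, and ordering by last-modified time are illustrative assumptions, not details from this thread:

```python
import os
import shutil
import time

def move_in_order(staging_dir: str, iib_in_dir: str, delay_s: float = 1.0) -> list:
    """Move files from staging into the IIB input directory one at a time,
    oldest first, pausing between files so the poller sees them in order."""
    names = [n for n in os.listdir(staging_dir)
             if os.path.isfile(os.path.join(staging_dir, n))]
    # Sort by last-modified time: the oldest file is moved (and processed) first.
    names.sort(key=lambda n: os.path.getmtime(os.path.join(staging_dir, n)))
    moved = []
    for name in names:
        shutil.move(os.path.join(staging_dir, name),
                    os.path.join(iib_in_dir, name))
        moved.append(name)
        time.sleep(delay_s)  # crude pacing; real code would wait for pickup
    return moved
```

Note this still inherits every weakness of file transfer discussed below; it only addresses ordering.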
But I agree with zpat and Vitor; the design is not good.
zpat
Posted: Mon Dec 14, 2015 8:33 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
Because many of the new people in IT seem to be untrained and/or inexperienced, combined with a desire to get something that appears to work done quickly rather than get it right, I spend increasing amounts of my time on this subject.
What you need to understand about using files is this.
1. On Unix there is no locking done by most applications (including FTP and SFTP), so files may appear in a directory (and start to get processed) before they have been completely written. This applies to files both coming into and going out of the broker. Result: partial data and no warning of it.
2. There is nothing inherent to prevent a file being read more than once (unlike MQ, where messages are consumed as they are accessed). So inadvertent duplication can (and does) often occur when scripts are run more than once, or restarted after failure.
3. All file transfers are synchronous, so both ends must be active for the transfer to work. When the destination is down, the transfer fails and there is no automatic retry or queuing, which makes management a nightmare. Files can therefore be missed entirely and no warning is given.
4. Failure during file processing causes great uncertainty because there is no transactional control. At best the entire file is re-processed, leading to duplication of data, but it can easily be only partially processed, and no warning of this is apparent to the target application.
5. Files can be processed out of sequence and again no warning is given to the application. Also, since no locking exists, a file can be corrupted by multiple applications opening it for write at the same time.
The only way to avoid this (with files) is to use a numbering convention for the files and a header and trailer record in each file. Then the receiving application must check that it has processed each file in order, not missing any or processing any twice. It must also check that the file is complete and has a trailer record matching the header record.
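The receiving application's checks under such a convention could be sketched like this; the HDR/TRL record layout and field order are assumptions for illustration, not any IIB or MQ standard:

```python
def validate_batch_file(lines: list, expected_seq: int) -> list:
    """Check one batch file whose header carries a sequence number and
    record count, and whose trailer must repeat both. Returns the data
    records, or raises ValueError for any of the failure modes above."""
    if not lines or not lines[0].startswith("HDR|"):
        raise ValueError("missing header record")
    if not lines[-1].startswith("TRL|"):
        raise ValueError("missing trailer record - file may be truncated")
    _, h_seq, h_count = lines[0].split("|")
    _, t_seq, t_count = lines[-1].split("|")
    if int(h_seq) != expected_seq:
        raise ValueError("file out of sequence: got %s, expected %d"
                         % (h_seq, expected_seq))
    if (h_seq, h_count) != (t_seq, t_count):
        raise ValueError("trailer does not match header")
    data = lines[1:-1]
    if len(data) != int(h_count):
        raise ValueError("record count mismatch - file incomplete")
    return data
```

The caller would track `expected_seq` persistently and refuse to advance past a missing or duplicated file.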
In other words, the end application has to be extremely resistant to the unreliable delivery inherent in file usage. Since NO-ONE ever does this adequately, you are designing a system that WILL FAIL at some point. The impact of the loss or duplication of data can be immense. Companies can go out of business very quickly these days if they lose credibility.
Or you can just use something fit for purpose - MQ, which is very easy to code for (or just use JMS). This can even do XA two-phase commit, which gives you bulletproof delivery to a database. Nothing else does this.
Remember: using files is NEVER a requirement; it is only a possible (and very poor) solution when anything transactionally important is being sent.
smdavies99
Posted: Mon Dec 14, 2015 9:10 am
Jedi Council
Joined: 10 Feb 2003    Posts: 6076    Location: Somewhere over the Rainbow this side of Never-never land.
In addition to what my esteemed colleague has said so eloquently...
Files are often used because they cost nothing. Systems have filesystems and files; using them costs nothing.
Files are often chosen because the people doing the choosing really know nothing else. FTP is free, so that is what is used.
As has been said, once you start speeding things up (from about 1-2 files/second) the problems start. Until you have been bitten (badly) by them, you don't really understand how bad a choice files can be.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995
Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
zpat
Posted: Mon Dec 14, 2015 10:02 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
MQ client (or JMS) is also free.
But what is the cost of losing or duplicating production data?
PeterPotkay
Posted: Mon Dec 14, 2015 3:43 pm
Poobah
Joined: 15 May 2001    Posts: 7722
If the original data is in a file, and the data needs to get into WMB, what approach do you guys take with the architects? Do you just dig in your heels and say that unless the data arrives at WMB as MQ messages, don't bother with WMB?
Whether it's WMB converting the file into MQ messages, or some home-grown mess, SOMETHING needs to absorb the pain and risk of getting that data out of a file and into MQ messages. As unappetizing as it is to do it in WMB, is it the least worst choice?
_________________
Peter Potkay
Keep Calm and MQ On
Simbu
Posted: Tue Dec 15, 2015 2:51 am    Post subject: Re: Sequential Processing using fileInput node
Master
Joined: 17 Jun 2011    Posts: 289    Location: Tamil Nadu, India
RangaKovela wrote:
Hi Team,
Environment: IIB9 broker on Windows.
SFTP server is on Windows.
We have a requirement to process a batch of files generated by a backend system in sequential order (i.e. FIFO). A batch can have multiple files. All the files are placed in the IIB source directory, from which the FileInput node is polling using the move command.
I want to know whether the FileInput node is capable of picking up files in the order they were created by the backend system.
Thanks,
Hi. Worst case, sequence can be maintained by last updated timestamp.
The IBM documentation says:
Quote:
Files to be processed by the FileInput node are prioritized as per the last updated timestamp; that is, the oldest files are processed first.
But if there is any failure while processing any file, the sequence will be broken.
zpat
Posted: Tue Dec 15, 2015 3:17 am
Jedi Council
Joined: 19 May 2001    Posts: 5866    Location: UK
To avoid partial files being processed prematurely, use the put-and-rename approach in SFTP (or FTP).
I.e. send the file as filename pattern 1, then after the put, change (rename/move) the file name to filename pattern 2.
In the FileInput node, look only for filename pattern 2. Also increase the polling interval to a sensible value.
This way, partial files will at least not be processed by the broker.
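A local sketch of the pattern; the `.part` suffix and function names are assumptions, and with real SFTP the same two steps would be a `put` followed by a `rename`:

```python
import os
import shutil

def send_file(src_path: str, target_dir: str) -> str:
    """Write under a temporary name first, then rename, so the poller can
    never see a half-written file under the final name pattern."""
    name = os.path.basename(src_path)
    tmp = os.path.join(target_dir, name + ".part")   # filename pattern 1
    final = os.path.join(target_dir, name)           # filename pattern 2
    shutil.copyfile(src_path, tmp)                   # the 'put' step
    os.replace(tmp, final)                           # the 'rename' step
    return final

def poll_ready(target_dir: str) -> list:
    """The FileInput node's filter should match only the final names."""
    return sorted(n for n in os.listdir(target_dir)
                  if not n.endswith(".part"))
```

On a single filesystem the rename is atomic, which is exactly why the poller cannot catch the file half-written under its final name.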
---------------
However, you really need to get the data issued as messages in the first place; long experience tells me that not the slightest effort is ever made to do this in certain locations, due to "cultural" issues.
---------------
In terms of timestamps, take care: some file transfer tools preserve the original file's date/time, and some set the date/time of the transfer. Usually this can be controlled with an option.
Also, if using WinSCP you must disable sending files into temporary filenames (see Preferences - Endurance).
---------------
If you work for a financial institution and are using files to carry transactional data, please let us know the company (so we can avoid using it).
RangaKovela
Posted: Tue Dec 15, 2015 10:18 pm
Apprentice
Joined: 10 May 2011    Posts: 38
Thank you all for your responses.
We did some more research on this and found this on the IBM site: http://www-01.ibm.com/support/docview.wss?uid=swg1IC91632
We are moving files from the backend directory to the IIB directory (the directory from which the FileInput node picks up the files) using a batch script on an hourly basis.
We have designed two flows here:
Flow 1 - picks up files using the FileInput node and posts them to MQ.
Flow 2 - processes the MQ messages received from Flow 1 and posts them to a cloud-based web application using HTTPS.
We have implemented a backout mechanism in Flow 2 to handle intermittent connectivity failures and ensure sequential processing.
We are planning to disable the main queue using PCF messages from Flow 2 if the connectivity issues persist after 5 retries. An e-mail notification/text message will be sent out to the support team. The support team will have to retry the messages in the backout queue and then re-enable the main queue once the backend system is up and running. This will ensure sequential processing. Any suggestions?
timber
Posted: Wed Dec 16, 2015 1:35 am
Grand Master
Joined: 25 Aug 2015    Posts: 1292
Quote:
We are planning to disable the main queue using PCF messages from Flow 2 if the connectivity issues persist after 5 retries. An e-mail notification/text message will be sent out to the support team. The support team will have to retry the messages in the backout queue and then re-enable the main queue once the backend system is up and running. This will ensure sequential processing. Any suggestions?
I am suspicious of solutions that involve sending PCF messages from a message flow. I understand why people do it, but I am fairly sure that IBM never intended it to be done. I reckon there is always a better solution. So how about this...
Can you set the retry count to 5 and remove the backout queue? That way, after 5 retries the message will sit on the input queue and block the remaining messages (it will act like a 'poison message'). That sounds like the behaviour you need, and it would be a lot easier than using PCF messages.
RangaKovela
Posted: Wed Dec 16, 2015 1:47 am
Apprentice
Joined: 10 May 2011    Posts: 38
If a backout queue is not specified, the MQInput node would move the messages to the DLQ assigned to the queue manager.