MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Other ideas for counting lines of large file

Other ideas for counting lines of large file
cwazpitt3
PostPosted: Wed Mar 21, 2012 4:22 am    Post subject: Other ideas for counting lines of large file Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

Hey gurus,

I have a requirement (which I don't really think should be handled by MB, but I digress) to perform auditing ONLY on large (~75MB) inbound files and then route over to an outbound folder only if the auditing passes. I have a working solution that looks like this:

FileInput (whole file) --> ESQL that calls an external Java function (takes the BLOB, counts the lines, and passes back the count and the last line, i.e. the trailer, for auditing) --> FileOutput (whole file)

This solution works, but it eats up a fair amount of memory while processing, due to the large file being read and written (I assume). The longest delay seems to be writing the outbound file (which is just a copyEntireMessage pass-through). Most of the memory seems to be released after processing is complete, so maybe this solution is OK, but I was wondering if anyone had any other clever ideas for solving this problem.

I had a couple of other ideas, such as different Java processing that is passed just the filename (not the BLOB), or a shell script that runs wc -l [filename]. However, when the FileInput node picks up the file, it moves it to the mqsitransitin folder and renames it to a pattern that I don't think I can obtain at runtime (it seems to be GUID-ogfilename). So I cannot actually access the original file to run any of the aforementioned techniques against it, and I have no other way of triggering the flow.
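A minimal sketch of the "pass the filename, not the BLOB" idea, assuming the flow could somehow learn the path of the file (the post above notes that the rename under mqsitransitin makes exactly this hard). The names LineAudit and countLines, and the ISO-8859-1 charset, are illustrative assumptions and not part of the original flow. The file is streamed through a buffered reader, so only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineAudit {

    /** The two values the audit needs: the line count and the trailer line. */
    public static final class Result {
        public final long lineCount;
        public final String lastLine; // null when the file is empty

        Result(long lineCount, String lastLine) {
            this.lineCount = lineCount;
            this.lastLine = lastLine;
        }
    }

    /**
     * Streams the file once, counting lines and remembering the last one.
     * ISO-8859-1 is an assumption: it maps every byte to a character, so
     * decoding never fails on unexpected bytes.
     */
    public static Result countLines(Path file) throws IOException {
        long count = 0;
        String last = null;
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            String line;
            // readLine() accepts \n, \r, or \r\n as a line terminator.
            while ((line = reader.readLine()) != null) {
                count++;
                last = line;
            }
        }
        return new Result(count, last);
    }

    public static void main(String[] args) throws IOException {
        Result r = countLines(Paths.get(args[0]));
        System.out.println(r.lineCount + " lines, trailer: " + r.lastLine);
    }
}
```

Like wc -l, this does not count an extra empty line when the file ends with a line terminator.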

Any other thoughts or ideas? Thanks in advance!
Vitor
PostPosted: Wed Mar 21, 2012 4:45 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

The memory's being chewed because you're using BLOB, hence the entire file's being loaded. As the Java's counting lines, these files must have a line-based structure (I imagine <CR> delimited).

Hence you can build a message set that reads lines with no account of their structure, read and copy the file a line at a time (counting as you go) until you reach the last line (end of file on input), use this trailer and the count for whatever the audit is, finish the file if it's correct or rollback if it's not.
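The read, copy, count, then commit-or-rollback pattern described above can be sketched in plain Java. This is only an illustration of the pattern, not the message-set approach inside the broker. AuditedCopy, copyIfAuditPasses, and the audit predicate are hypothetical names, and the real audit rule is not stated in the thread. The copy goes to a temporary file first, so a failed audit leaves nothing in the output location.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.BiPredicate;

public class AuditedCopy {

    /**
     * Copies 'in' to 'out' one line at a time, counting as it goes.
     * The audit predicate receives the line count and the last line
     * (the trailer); the output only appears if the audit passes.
     */
    public static boolean copyIfAuditPasses(Path in, Path out,
            BiPredicate<Long, String> audit) throws IOException {
        Charset cs = StandardCharsets.ISO_8859_1; // assumption: byte-safe charset
        // Temp file in the target folder so the final move stays on one filesystem.
        Path tmp = Files.createTempFile(out.toAbsolutePath().getParent(), "audit-", ".tmp");
        boolean committed = false;
        try {
            long count = 0;
            String last = null;
            try (BufferedReader r = Files.newBufferedReader(in, cs);
                 BufferedWriter w = Files.newBufferedWriter(tmp, cs)) {
                String line;
                while ((line = r.readLine()) != null) {
                    w.write(line);
                    w.write('\n'); // note: line ends are normalised to \n
                    count++;
                    last = line;
                }
            }
            if (audit.test(count, last)) {
                // "Finish the file": publish the copy under its real name.
                Files.move(tmp, out, StandardCopyOption.REPLACE_EXISTING);
                committed = true;
            }
            return committed;
        } finally {
            // "Rollback": discard the partial or rejected copy.
            if (!committed) {
                Files.deleteIfExists(tmp);
            }
        }
    }
}
```

A call might look like copyIfAuditPasses(in, out, (n, t) -> t != null && t.endsWith(String.valueOf(n))), where the trailer rule is invented purely for illustration.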

Other, more elegant solutions are undoubtedly available but I've only had 1 so far this morning.
_________________
Honesty is the best policy.
Insanity is the best defence.
cwazpitt3
PostPosted: Wed Mar 21, 2012 4:48 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

Vitor wrote:
The memory's being chewed because you're using BLOB, hence the entire file's being loaded. As the Java's counting lines, these files must have a line-based structure (I imagine <CR> delimited).

Hence you can build a message set that reads lines with no account of their structure, read and copy the file a line at a time (counting as you go) until you reach the last line (end of file on input), use this trailer and the count for whatever the audit is, finish the file if it's correct or rollback if it's not.

Other, more elegant solutions are undoubtedly available but I've only had 1 so far this morning.


I'm on my second cup, so life is a little better here.

I have already tried the approach you suggested. It was VERY slow...like it took about 12 mins to get through 100k of the 250k lines. While this is better on memory consumption, for such a silly requirement I hate to eat CPU and spend all that time. It just bugs me.
Vitor
PostPosted: Wed Mar 21, 2012 5:01 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

cwazpitt3 wrote:
I have already tried the approach you suggested. It was VERY slow...like it took about 12 mins to get through 100k of the 250k lines.


Really? Where did the user trace say the time was going? Unless you're running this on a server you have to wind with a key every few hours I'd expect better than that. Works out to around 140 TPS where each transaction is "oh look, a record" which isn't that sparkly. How long does the Java take?
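The rate quoted above follows from the figures in the earlier post (100k lines in about 12 minutes). A small sketch of the arithmetic, with TraceRate and its method names chosen only for illustration:

```java
public class TraceRate {

    /** Records processed per second, given a record count and elapsed minutes. */
    public static double recordsPerSecond(long records, double minutes) {
        return records / (minutes * 60.0);
    }

    /** Minutes needed to process 'records' at a given rate per second. */
    public static double minutesFor(long records, double recordsPerSecond) {
        return records / recordsPerSecond / 60.0;
    }

    public static void main(String[] args) {
        // 100k lines in 12 minutes, as reported in the thread.
        double rate = recordsPerSecond(100_000, 12.0);
        System.out.printf("rate: %.1f records/s%n", rate);
        // Extrapolate the same rate to the full 250k-line file.
        System.out.printf("250k lines: %.1f minutes%n", minutesFor(250_000, rate));
    }
}
```

100,000 records over 720 seconds is roughly 139 per second, which the post rounds to 140; at that rate the full 250k-line file would take about 30 minutes.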

I do agree that it's a silly requirement for WMB; if there was some kind of transformational value add maybe but still, soon someone will send a significantly larger file and you'll be able to justify more memory.

(Don't give me the "the files are always 75Mb or so and not expected to grow significantly"; I've heard that one. They either grow and the users act all surprised, or there's a problem upstream and they combine 1-n files together)
_________________
Honesty is the best policy.
Insanity is the best defence.
Esa
PostPosted: Wed Mar 21, 2012 5:06 am    Post subject: Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

cwazpitt3 wrote:

I have already tried the approach you suggested. It was VERY slow...like it took about 12 mins to get through 100k of the 250k lines. While this is better at memory consumption, for such a silly requirement, I hate to eat CPU and spend all that time. Just bugs me


It will certainly be very slow if you don't apply techniques presented in the large message processing sample...
cwazpitt3
PostPosted: Wed Mar 21, 2012 5:09 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

Vitor wrote:

Really? Where did the user trace say the time was going? Unless you're running this on a server you have to wind with a key every few hours I'd expect better than that. Works out to around 140 TPS where each transaction is "oh look, a record" which isn't that sparkly. How long does the Java take?


I never took a user trace. This is a DEV server, but it shouldn't be terrible. I could run again. Do you have any good docs or anything on User Trace? I feel like I don't really understand its full capabilities and therefore don't utilize it enough. As for the Java, it takes about 6 minutes total to do the process I outlined...not bad IMO.

Vitor wrote:

I do agree that it's a silly requirement for WMB; if there was some kind of transformational value add maybe but still, soon someone will send a significantly larger file and you'll be able to justify more memory.


Nope, no transformation just audit...stupid!

Vitor wrote:

(Don't give me the "the files are always 75Mb or so and not expected to grow significantly"; I've heard that one. They either grow and the users act all surprised, or there's a problem upstream and they combine 1-n files together)


I completely agree, and I have warned that once we get to 100MB, it's a deal breaker if we're running it as a whole file, am I right? I don't ever trust file size estimates.
mqjeff
PostPosted: Wed Mar 21, 2012 5:14 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

cwazpitt3 wrote:
Vitor wrote:

Really? Where did the user trace say the time was going? Unless you're running this on a server you have to wind with a key every few hours I'd expect better than that. Works out to around 140 TPS where each transaction is "oh look, a record" which isn't that sparkly. How long does the Java take?


I never took a user trace. This is a DEV server, but it shouldn't be terrible. I could run again. Do you have any good docs or anything on User Trace? I feel like I don't really understand its full capabilities and therefore don't utilize it enough.

http://www-01.ibm.com/support/docview.wss?&uid=swg21177321
cwazpitt3
PostPosted: Wed Mar 21, 2012 5:15 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

Esa wrote:

It will certainly be very slow if you don't apply techniques presented in the large message processing sample...


@Esa, what sample are you referring to? The batch processing?
cwazpitt3
PostPosted: Wed Mar 21, 2012 5:17 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

mqjeff wrote:
cwazpitt3 wrote:
Vitor wrote:

Really? Where did the user trace say the time was going? Unless you're running this on a server you have to wind with a key every few hours I'd expect better than that. Works out to around 140 TPS where each transaction is "oh look, a record" which isn't that sparkly. How long does the Java take?


I never took a user trace. This is a DEV server, but it shouldn't be terrible. I could run again. Do you have any good docs or anything on User Trace? I feel like I don't really understand its full capabilities and therefore don't utilize it enough.

http://www-01.ibm.com/support/docview.wss?&uid=swg21177321


Thanks @mqjeff. Now I have to see if I can actually run this command, since where I am we have a shared dev environment and I cannot run mqsi commands on my own. What a pain!
mqjeff
PostPosted: Wed Mar 21, 2012 5:20 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

cwazpitt3 wrote:
Thanks @mqjeff. Now I have to see if I can actually run this command, since where I am we have a shared dev environment and I cannot run mqsi commands on my own. What a pain!

Do you have access to the Broker Explorer?
It will enable trace for you, even if it won't run mqsireadlog/mqsiformatlog.

If you spend enough time asking your dev server admin to run these commands, like five or six times a day, I'm sure they will find a way to allow you to run them on your own...
cwazpitt3
PostPosted: Wed Mar 21, 2012 5:22 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

mqjeff wrote:
If you spend enough time asking your dev server admin to run these commands, like five or six times a day, I'm sure they will find a way to allow you to run them on your own...


No, we don't have access to Broker Explorer. I completely agree with your solution of bugging the admin...that's my plan.
Esa
PostPosted: Wed Mar 21, 2012 5:24 am    Post subject: Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

cwazpitt3 wrote:
Esa wrote:

It will certainly be very slow if you don't apply techniques presented in the large message processing sample...


@Esa, what sample are you referring to? The batch processing?


The sample called Large Messaging.
cwazpitt3
PostPosted: Wed Mar 21, 2012 5:28 am    Post subject: Reply with quote

Acolyte

Joined: 31 Aug 2011
Posts: 61

cwazpitt3 wrote:
mqjeff wrote:
If you spend enough time asking your dev server admin to run these commands, like five or six times a day, I'm sure they will find a way to allow you to run them on your own...


No, we don't have access to Broker Explorer. I completely agree with your solution of bugging the admin...that's my plan.


I do have the ability to add trace nodes with user trace setting and they have exposed a way to format the logs so I can access them. This is different than the overall flow trace though, right? If I used this approach, any suggestions where I might put the trace and what I might put in it to see where the time is being spent? Too much trace can slow it down, right?
mqjeff
PostPosted: Wed Mar 21, 2012 5:34 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

cwazpitt3 wrote:
cwazpitt3 wrote:
mqjeff wrote:
If you spend enough time asking your dev server admin to run these commands, like five or six times a day, I'm sure they will find a way to allow you to run them on your own...


No, we don't have access to Broker Explorer. I completely agree with your solution of bugging the admin...that's my plan.


I do have the ability to add trace nodes with user trace setting and they have exposed a way to format the logs so I can access them. This is different than the overall flow trace though, right? If I used this approach, any suggestions where I might put the trace and what I might put in it to see where the time is being spent? Too much trace can slow it down, right?


No? I don't think it's different? Trace nodes with 'user trace' output write into the same logs that mqsireadlog accesses and mqsiformatlog formats, and the request to read those logs at the user trace level (rather than service trace level) will report all of the rest of the flow level information as well as the data you have specifically added to the user trace with your trace nodes.

So I think you're good to use this to see the full flow execution.
marko.pitkanen
PostPosted: Wed Mar 21, 2012 6:08 am    Post subject: Reply with quote

Chevalier

Joined: 23 Jul 2008
Posts: 440
Location: Jamsa, Finland

Hi cwazpitt3,

Just to check, did you set up your FileInput node to read the file line by line?

Code:
Records and Elements
  Record detection = Delimited
  Delimiter = DOS or UNIX Line End


--
Marko