Author |
Message
|
goffinf |
Posted: Thu Jan 26, 2012 2:25 am Post subject: High Memory usage for large messages |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
MB v6.1.0.9
I am seeing memory usage grow until it is completely exhausted when processing emails with moderately large attachments. I am reasonably confident from studying the email server logs that the emails are being retrieved (via IMAP) one at a time (by their UID), so I now want to turn my attention to whether I am doing something dumb in the JavaCompute node code when generating the output message (it's possible I am wrong about the JavaMail stuff, but I just want to rule out any MB API problems first).
Scenario:
Retrieve emails from the server and simply route them as unparsed BLOBs on to an MQ queue. Another flow will pick them up and put the attachments into a Content Management application, but that's outside the scope of this flow. So really simple.
The emails I'm using in this test have one or more attachments of up to 5MB, so not huge.
Since this is v6.1 there isn't an EmailInput node, so I'm using a JCN for the JavaMail stuff and to create the out messages (the JavaMail code is based on IA9R but modified to deal with secure connections and binary attachments).
Flow layout is :-
Timer (poll every 10 secs) -> JCN -> MQHeader -> MQOutput
Functionally everything works fine, and thru-put wise, it's fairly fast.
However, if emails build up in the INBOX even to a modest level (say 100), memory usage climbs until it's all gone.
Some example tests :-
I deploy the flow and put it in STOP state. Then add a number of emails to the INBOX being polled :-
10 emails of around 5MB, EG memory starts at around 125MB
Start the flow, memory grows to 221 MB, takes about 6 secs to complete
20 emails, 5MB of attachments :-
Memory start : 125, memory at completion: 314
100 emails, 5 MB of attachments :-
Memory start: 125, memory at completion 1GB+ (and still going - had to stop this one pronto !)
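For what it's worth, working those figures through gives a rough per-email cost. The snippet below is just arithmetic on the numbers reported above; the class and method names are made up for illustration and nothing here touches the broker.

```java
// Rough arithmetic on the memory figures reported above (all values in MB).
// Class and method names are illustrative only.
final class GrowthEstimate {

    /** Average execution group memory growth per email, in MB. */
    static double perEmailMb(double startMb, double endMb, int emails) {
        return (endMb - startMb) / emails;
    }

    public static void main(String[] args) {
        // 10 emails: 125 MB -> 221 MB
        System.out.printf("10 emails: %.2f MB per email%n", perEmailMb(125, 221, 10));
        // 20 emails: 125 MB -> 314 MB
        System.out.printf("20 emails: %.2f MB per email%n", perEmailMb(125, 314, 20));
    }
}
```

Both runs come out at a little under 10MB of growth per 5MB email. At that rate, 100 emails would need somewhere around 950MB on top of the 125MB baseline, which is in line with the third test running past 1GB.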
After many hours playing around with the JavaMail stuff, I can't make any impression on the memory build up. So I want to turn my attention to the non-JavaMail code to see if anything is amiss there. That's where I need your help.
Here are the bones of it. It basically loops through a folder of emails, propagating each one. I would be very happy for someone to identify a fundamental flaw, so all comments are most welcome to track this one down.
In the code below, helper is a class where many of the JavaMail methods live (connect, disconnect, etc.). I haven't shown any of that because I want to focus on the MB stuff (if this thread leads in that direction afterwards, that's fine) :-
Code: |
public void evaluate(MbMessageAssembly inAssembly) throws MbException {
    // NOTE: the forum's word filter replaced the original parameter name with
    // 'contact admin'; 'inAssembly' is an assumed stand-in so the code compiles.
    MbOutputTerminal out = getOutputTerminal("out");
    MbOutputTerminal alt = getOutputTerminal("alternate");
    MbMessage inMessage = inAssembly.getMessage();
    MbMessage outMessage = null;
    MbMessageAssembly outAssembly = null;
    synchronized (helper) {
        if (!helper.isConnected() || !helper.isOpen()) {
            helper.connect(host, user, password, folder, readOnly, secure);
        }
        Long[] messageIds = helper.getMessageIds();
        if (null == messageIds || 0 == messageIds.length) {
            // no email polled, pass the input through
            alt.propagate(inAssembly);
        } else {
            Message incomingEmail;
            // loop through all email UIDs, getting their content and propagating
            for (int i = 0; i < messageIds.length; ++i) {
                incomingEmail = helper.getMessageByUid(messageIds[i]);
                try {
                    // create new message
                    outMessage = new MbMessage();
                    outAssembly = new MbMessageAssembly(inAssembly, outMessage);
                    copyMessageHeaders(inMessage, outMessage);
                    ByteArrayOutputStream bos = new ByteArrayOutputStream();
                    incomingEmail.writeTo(bos);
                    // Create new BLOB message
                    try {
                        MbElement outRoot = outMessage.getRootElement();
                        outRoot.createElementAsLastChildFromBitstream(
                                bos.toByteArray(), MbBLOB.PARSER_NAME,
                                null, null, null, 0, 0, 0);
                        bos.close();
                        // finalize
                        outMessage.finalizeMessage(MbMessage.FINALIZE_NONE);
                        // Propagate to next node in the flow
                        out.propagate(outAssembly);
                    } catch (IOException e) {
                        // deal with exception (code elided)
                    }
                } catch (MbException e) {
                    // deal with exception (code elided)
                } catch (IOException e) {
                    // deal with exception (code elided)
                } catch (MessagingException e) {
                    // deal with exception (code elided)
                } finally {
                    // clear the outMessage (guarded in case it was never created)
                    if (outMessage != null) {
                        outMessage.clearMessage();
                    }
                    // free memory held by the folder cache for headers
                    ((IMAPMessage) incomingEmail).invalidateHeaders();
                }
            }
        }
        // Close the folder to expunge the received mails.
        helper.disconnect();
    }
}
|
goffinf |
Posted: Thu Jan 26, 2012 2:33 am Post subject: |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
I noted on preview that the code looks like this :-
public void evaluate(MbMessageAssembly contact admin)
...
MbMessage inMessage = contact admin.getMessage()
That ISN'T what I pasted in (don't know why it got changed), this is
public void evaluate(MbMessageAssembly contact admin)
...
MbMessage inMessage = contact admin.getMessage(); |
|
kimbert |
Posted: Thu Jan 26, 2012 2:49 am Post subject: |
|
|
 Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Your dot persists in being absent. Never mind - we'll pretend that it's there, and we promise not to flame you for missing out a dot. We're not like that on this forum.  |
|
adubya |
Posted: Thu Jan 26, 2012 2:53 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
The missing dot was throwing me I must admit  |
|
Esa |
Posted: Thu Jan 26, 2012 3:37 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
goffinf wrote: |
I noted on preview that the code looks like this :-
public void evaluate(MbMessageAssembly contact admin)
...
MbMessage inMessage = contact admin.getMessage()
That ISN'T what I pasted in (don't know why it got changed), this is
public void evaluate(MbMessageAssembly contact admin)
...
MbMessage inMessage = contact admin.getMessage(); |
I have noticed the same thing. If your post text contains for example 'transform' slightly misspelled, it becomes 'contact admin' |
|
Esa |
Posted: Thu Jan 26, 2012 4:01 am Post subject: |
|
|
 Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
|
goffinf,
try creating the MQMD in your java code instead of using MQHeader node. The MQHeader memory leak is probably fixed on the level you are running, but it's still possible that it may leak some other way. Besides, it issues a deep copy of the message. |
|
goffinf |
Posted: Thu Jan 26, 2012 4:39 am Post subject: |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
kimbert wrote: |
Your dot persists in being absent. Never mind - we'll pretend that it's there, and we promise not to flame you for missing out a dot. We're not like that on this forum.  |
Thanks.
Actually even in my follow-up post 'contact admin' was changed to 'contact admin', and now I notice that it's stuck that in various other places as well, and in other places replaced it with nothing.
Sorry, this makes the code a bit more difficult to follow. If I could stop it happening I would.
P.S. When I viewed this message it happened again (grrrr). Never mind, I'm sure you guys can skip over that issue'tte |
|
goffinf |
Posted: Thu Jan 26, 2012 4:50 am Post subject: |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
Esa wrote: |
try creating the MQMD in your java code instead of using MQHeader node. The MQHeader memory leak is probably fixed on the level you are running, but it's still possible that it may leak some other way. Besides, it issues a deep copy of the message. |
OK, that's worth a look-see.
As an aside, I also have an MB v8 version of this flow which uses the supplied EmailInput node. For now I'm going to assume that the MQHeader 'leak' you refer to doesn't happen in this version. At some point (hopefully later on this year) we may move to v8. Current experience for this flow is not that good so far though: it's very much slower (each email takes around 6 seconds to emit to the Q) and the flow also consumes increasing amounts of memory. It's early days in my experimentation, so hopefully that will change !
For now I really need the v6.1 version to work. |
|
mqjeff |
Posted: Thu Jan 26, 2012 5:01 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
I would make sure that you use clearMessage on your alt propagate path as well.
I would consider redesigning your code to do all of the IMAP work in a separate thread, and then have your JCN only access objects from the thread class and use those to populate the MbMessage. It's not clear from the code you've posted if that's what's being done, or if the helper methods are actually incurring the costs of calling out to IMAP each time you call them.
I would consider reusing the same outMessage object in each iteration of message loop, instead of creating a new one.
I would consider using java profiling tools to determine if you can see leaks in your flow.
I would consider rewriting your JCN to populate the Environment tree instead of a new MbMessage element, and then using another node to map from Environment into a real message. |
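One possible shape for the separate-thread suggestion above is sketched here. It is a sketch under assumptions, not the helper from the original post: `MailSource` is a hypothetical stand-in for the IMAP calls, and the emails are assumed to be handed over as byte arrays. The bounded queue means the worker can only run a fixed number of emails ahead of the node.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch only: MailSource stands in for the IMAP helper, which is not shown in the thread.
final class MailWorker implements Runnable {

    /** Hypothetical stand-in for the JavaMail helper; returns null when the inbox is empty. */
    interface MailSource {
        byte[] nextEmail() throws Exception;
    }

    private final MailSource source;
    private final BlockingQueue<byte[]> queue;

    MailWorker(MailSource source, int capacity) {
        this.source = source;
        this.queue = new ArrayBlockingQueue<byte[]>(capacity);
    }

    /** Fetches emails one at a time and hands them over; blocks while the queue is full. */
    @Override
    public void run() {
        try {
            byte[] email;
            while ((email = source.nextEmail()) != null) {
                queue.put(email);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop quietly when interrupted
        } catch (Exception e) {
            // real code would log the failure and decide whether to reconnect
        }
    }

    /** Called from the node; returns null if nothing arrives within the timeout. */
    byte[] poll(long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

With a capacity of 1, at most one fetched email waits in the queue while the previous one is being propagated (the worker may hold one more while it is blocked in put). The node's evaluate() would then call poll() rather than talking to IMAP directly.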
|
goffinf |
Posted: Thu Jan 26, 2012 5:43 am Post subject: |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
mqjeff wrote: |
I would make sure that you use clearMessage on your alt propagate path as well.
I would consider redesigning your code to do all of the IMAP work in a separate thread, and then have your JCN only access objects from the thread class and use those to populate the MbMessage. It's not clear from the code you've posted if that's what's being done, or if the helper methods are actually incurring the costs of calling out to IMAP each time you call them.
I would consider reusing the same outMessage object in each iteration of message loop, instead of creating a new one.
I would consider using java profiling tools to determine if you can see leaks in your flow.
I would consider rewriting your JCN to populate the Environment tree instead of a new MbMessage element, and then using another node to map from Environment into a real message. |
Yep all good suggestions.
I'm less bothered about thru-put (1 message / sec is OK for the volumes we have). Increasing memory usage is the big issue, hence I'm trying hard not to create any objects that would make another copy of the email stream (I believe JavaMail already does that in its folder object, which appears to be confirmed by the fact that each 5MB message consumes about 10MB of memory).
Does that change anything re: use a separate worker thread (the current helper code doesn't do that) ??
I think I tried reusing the outMessage and outAssembly but got an exception. Probably I did something else wrong. I'll go back and try that again.
Profiling. I've tried that in the past and didn't have any success in attaching the agent (I think I tried jip last time). Have you got any suggestions or examples of what/how to do this successfully? (I'm not a Java expert, as might be clear from my example code.)
How does creating the output in the Environment help here ?? |
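On the point about extra copies of the email stream: one copy that is easy to overlook is the one made by `ByteArrayOutputStream.toByteArray()`, which returns a new array rather than the stream's internal buffer. The small standalone check below shows that behaviour with a zero-filled array standing in for an email; the class name is illustrative only.

```java
import java.io.ByteArrayOutputStream;

// Shows that ByteArrayOutputStream.toByteArray() hands back a copy of the
// internal buffer, so the payload exists at least twice at that moment.
final class BufferCopyDemo {

    /** Returns true if changing the returned array leaves the stream's own data untouched. */
    static boolean toByteArrayReturnsCopy(int size) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        bos.write(new byte[size], 0, size);   // zero-filled stand-in for an email
        byte[] first = bos.toByteArray();
        first[0] = 42;                        // modify the returned array only
        byte[] second = bos.toByteArray();    // take a fresh copy from the stream
        return first != second && second[0] == 0;
    }

    public static void main(String[] args) {
        System.out.println("copy made: " + toByteArrayReturnsCopy(5 * 1024 * 1024));
    }
}
```

So while `bos` is still reachable, the internal buffer (which can be larger than the data, since it grows in steps) and the returned array both exist. Whether that accounts for the growth seen here is not established by this check; it only shows where one extra copy comes from.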
|
mqjeff |
Posted: Thu Jan 26, 2012 5:50 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Using Environment (and/or LocalEnvironment) would allow you not to create a new message at all. You would just add elements to an existing message tree. You could then use an ESQL compute node, which is guaranteed to handle memory correctly, to populate that data into a new output message structure.
The point of using a separate worker thread is to control when you create instances of the email message in your code, and ensure that you only query existing instances when you need to get the data.
I can't say that I've personally used Java profiling tools against broker. I remember seeing a thread or two about people trying to set runtime options to allow it. |
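If attaching a profiler proves awkward, a cruder option is to log JVM heap usage from inside the node, for example before and after each propagate. The sketch below uses only the standard `java.lang.management` API; the class and method names are illustrative. One caveat, stated as an assumption: this only reports the Java heap, so any memory the broker allocates natively (outside the JVM) for message trees would not show up in these numbers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Minimal heap probe; names are illustrative, not part of any broker API.
final class HeapProbe {

    /** Current JVM heap usage in bytes, as reported by the MemoryMXBean. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Formats a one-line report suitable for a log or trace statement. */
    static String report(String label) {
        double usedMb = usedHeapBytes() / (1024.0 * 1024.0);
        return String.format("%s: heap used %.1f MB", label, usedMb);
    }

    public static void main(String[] args) {
        System.out.println(report("before"));
        byte[] payload = new byte[5 * 1024 * 1024]; // stand-in for one 5MB email
        System.out.println(report("after allocating " + payload.length + " bytes"));
    }
}
```

If the heap figure stays flat while the execution group's overall memory keeps climbing, that would point away from the Java objects and towards memory held outside the JVM.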
|
goffinf |
Posted: Thu Jan 26, 2012 9:17 am Post subject: |
|
|
Chevalier
Joined: 05 Nov 2005 Posts: 401
|
mqjeff wrote: |
Using Environment (and/or Localenvironment) would allow you not to create a new message at all. You would just add elements to an existing message tree. You could then use an ESQL compute node, which is guaranteed to handle memory correctly, to populate that data into a new output message structure.
|
Well sir, you are *so* my favourite person tonight. Using the Environment works like a charm. Hardly any build-up of memory at all (I ran 60 5MB messages thru and the EG gained 4MB - previously this would have chewed up 600MB !) and the speed was pretty good (0.6 sec/5MB msg). Might even get a bit more speed if I create the MQMD in ESQL and remove the MQHeader node as per Esa's suggestion.
I'm quitting while I'm ahead tonight.
Thanks for your suggestions. I love learning stuff.
Fraser. |
|