MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Handling large HTTP payloads

Handling large HTTP payloads
sleepyjamie
PostPosted: Fri Nov 13, 2015 7:03 am    Post subject: Handling large HTTP payloads Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

I have a case where a REST API is returning a large payload. This is causing the HTTP Request node to take a really long time to return.

I am wondering if IIB has the ability to provide an input stream such that I can read the bytes from payload and propagate based on a delimiter of my choice?
mqjeff
PostPosted: Fri Nov 13, 2015 8:19 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

I forget if the JSON parser is a streaming parser or not.
_________________
chmod -R ugo-wx /
sleepyjamie
PostPosted: Fri Nov 13, 2015 8:38 am    Post subject: Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

Yeah, I've searched through the documentation with no luck.
mqjeff
PostPosted: Fri Nov 13, 2015 8:45 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 17447

The way to tell is to examine the logic in the large message processing sample: see how it propagates parts of InputRoot, then deletes them and/or clears OutputRoot before sending the next part.

If that performs better than what you're trying now, it's probably a streaming parser.

Perhaps some parser guru will be along to express an opinion.
_________________
chmod -R ugo-wx /
timber
PostPosted: Fri Nov 13, 2015 8:47 am    Post subject: Reply with quote

Grand Master

Joined: 25 Aug 2015
Posts: 1280

I don't remember either, but regardless of the answer, JSON is certainly an on-demand parser. So you should be able to do this:
- set the Domain on the HTTPRequest node to BLOB
- parse one 'record' at a time, and generate the output for it
- propagate the resulting message tree and do whatever you need to do with it in the rest of the message flow
- delete the message tree and parse the next record

Details are in this well-thumbed article:
http://www.ibm.com/developerworks/websphere/library/techarticles/0505_storey/0505_storey.html
(it was written in a previous era, so ignore references to now-deprecated domains. The techniques still work.)

This way, you still need to read in the entire input BLOB (which may be large), but at least you're not building the message tree for all of the records at the same time. Memory usage will be a *lot* lower.
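Not IIB code, but the record-at-a-time idea can be sketched in Python using the standard library's `json.JSONDecoder.raw_decode`, which parses one top-level value and reports where it stopped. The function name `iter_records` is mine, and the sketch assumes a simple top-level JSON array:

```python
import json

def iter_records(blob: bytes, encoding: str = "utf-8"):
    """Yield one top-level JSON object at a time from a serialized JSON array,
    so a tree is only ever built for the current record."""
    text = blob.decode(encoding)
    decoder = json.JSONDecoder()
    idx = text.index("[") + 1              # skip the opening bracket
    while True:
        # skip whitespace and commas between records
        while idx < len(text) and text[idx] in " \t\r\n,":
            idx += 1
        if idx >= len(text) or text[idx] == "]":
            break                          # end of array
        record, idx = decoder.raw_decode(text, idx)
        yield record                       # caller processes, then record is freed

payload = b'[{"a": "A1", "b": "B1"}, {"a": "A2", "b": "B2"}]'
for rec in iter_records(payload):
    print(rec["a"])                        # one record per iteration
```

The whole BLOB is still read into memory (as timber notes), but only one record's tree exists at a time.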
Vitor
PostPosted: Fri Nov 13, 2015 9:05 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

mqjeff wrote:
Perhaps some parser guru will be along to express an opinion.


All Hail The Parser King!
_________________
Honesty is the best policy.
Insanity is the best defence.
sleepyjamie
PostPosted: Fri Nov 13, 2015 9:57 am    Post subject: Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

Looks like this approach still isn't feasible.

The reason is that large BLOB payloads cause the IIB Toolkit to hang when I attempt to cast the BLOB to a string. So having a pointer/reference to an input stream would be ideal.
Vitor
PostPosted: Fri Nov 13, 2015 10:03 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

sleepyjamie wrote:
The reason is that large BLOB payloads cause the IIB Toolkit to hang when I attempt to cast the BLOB to a string.



a) Have you tried this on a server that's got a bit more power than the one running the Toolkit?
b) I'm surprised to hear that this payload is nothing but a string, with no structure that can be managed with DFDL, especially as you talked earlier about using a "delimiter of your choice". What exactly is this data?
_________________
Honesty is the best policy.
Insanity is the best defence.
smdavies99
PostPosted: Fri Nov 13, 2015 10:07 am    Post subject: Reply with quote

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

sleepyjamie wrote:
Looks like this approach still isn't feasible.

The reason is that large BLOB payloads cause the IIB Toolkit to hang when I attempt to cast the BLOB to a string. So having a pointer/reference to an input stream would be ideal.


I guess that you must be trying to use the debugger in order to get the TK to hang.
Why not revert to 'old-style' debugging, i.e. use user trace?
In fact, I will go as far as saying that this is the best way to handle large payloads.

I say old style because us 'old farts' who have been using this product for more than a decade didn't have the luxury (sic) of the debugger back in them old days.
It was usertrace or nothing.

I have never got on with the debugger, so I use usertrace for all my debugging, even with V10. YMMV though.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
sleepyjamie
PostPosted: Fri Nov 13, 2015 10:27 am    Post subject: Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

The response data is JSON. There is no need to parse it into a message tree, because my flow simply does a quick publish of the data to a queue. So doing the transformation is just wasted CPU. For example:

[{"a": "A1", "b": "B1"}, {"a": "A2", "b": "B2"}]

This would be published to MQ as two messages:

<data><a>A1</a><b>B1</b></data>

<data><a>A2</a><b>B2</b></data>


Using DFDL is a bit overkill for this. It's easier to write some simple code that iterates over the character array and propagates whenever I encounter a character sequence. At any given time I only need a few characters in the input buffer for reading.
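For what it's worth, the per-record transform described above can be illustrated with a short Python sketch. The name `json_array_to_xml_messages` is made up, and note this version parses the whole array up front, unlike a streaming approach:

```python
import json
from xml.sax.saxutils import escape

def json_array_to_xml_messages(payload: str):
    """Turn each object in a JSON array into one small XML message,
    mirroring the publish-per-record idea (illustration only)."""
    messages = []
    for obj in json.loads(payload):        # full-array parse, for brevity
        body = "".join(f"<{k}>{escape(str(v))}</{k}>" for k, v in obj.items())
        messages.append(f"<data>{body}</data>")
    return messages

msgs = json_array_to_xml_messages('[{"a": "A1", "b": "B1"}, {"a": "A2", "b": "B2"}]')
# each element of msgs would be published to MQ as its own message
```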

I'm running a Core i7 (8 cores) with 16GB of RAM; I don't see why the toolkit would be hanging. I'll try increasing the memory, but I think it's an issue with the toolkit. Usertrace is an option; however, IMO the toolkit should be robust enough to handle large payloads. I have no issues running the debugger under the regular Eclipse IDE using Java for the same JSON response.
Vitor
PostPosted: Fri Nov 13, 2015 10:36 am    Post subject: Reply with quote

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

sleepyjamie wrote:
The response data is JSON. There is no need to parse it into a message tree, because my flow simply does a quick publish of the data to a queue. So doing the transformation is just wasted CPU.


Ok, now I'm confused:

sleepyjamie wrote:
For example:

[{"a": "A1", "b": "B1"}, {"a": "A2", "b": "B2"}]

This would be published to MQ as two messages

<data><a>A1</a><b>B1</b></data>

<data><a>A2</a><b>B2</b></data>


This looks to my untrained eye like a JSON message is being transformed into XML.


sleepyjamie wrote:
Using DFDL is a bit overkill for this. It's easier to write some simple code that iterates over the character array and propagates whenever I encounter a character sequence. At any given time I only need a few characters in the input buffer for reading.


Easier than using the JSON parser to identify the end of the first JSON array (using the on-demand parsing and memory saving techniques previously mentioned)?

sleepyjamie wrote:
I'm running a Core i7 (8 cores) with 16GB of RAM; I don't see why the toolkit would be hanging. I'll try increasing the memory, but I think it's an issue with the toolkit.


Because it's trying to resolve the entire input stream to display it. Welcome to the IIB debugger, and one reason why my worthy associate prefers the user trace. As do I.

sleepyjamie wrote:
So doing the transformation is just wasted CPU


I also question that assertion. It's almost certainly true in a pure Java world, but you're not in Kansas any more, Toto, and I'll bet my entire annual bonus that the JSON parser can find its way round a JSON message faster and more efficiently than IIB can find its place in a string, because string handling in IIB is resource expensive and inefficient, especially with large strings.

sleepyjamie wrote:
I have no issues running the same debugger under regular Eclipse IDE using Java.


This is the key point. You can stream HTTP traffic into Java and do exactly what you're describing, looking at a few characters at a time. IIB doesn't stream a BLOB (because a BLOB is by definition one singular item).

Humor us. Try using the JSON domain.
_________________
Honesty is the best policy.
Insanity is the best defence.
timber
PostPosted: Fri Nov 13, 2015 10:42 am    Post subject: Reply with quote

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Quote:
There is no need to parse it into node tree because my flow simply does a quick publish of the data to a queue.
That would have been useful information in your first post.
Quote:
I don't see why the toolkit would be hanging
Nor me. The toolkit, for whatever reason, appears not to like data this large. Without seeing the stack trace it's hard to say any more than that. Have you tried using user trace and Trace nodes?
Quote:
Using DFDL is a bit overkill for this.
Possibly true. It depends on where else commas can appear in the JSON data, and on how regular the delimiters are (is it always a single comma, with no spaces on either side?).
You don't need to CAST to CHARACTER to scan the BLOB. Just CAST the string that you are looking for to a BLOB (making sure to use the correct CCSID), then scan for that byte sequence. But don't forget the edge case where the data contains commas that are not delimiters.
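timber's suggestion (encode the search string to bytes and scan the raw payload, rather than converting the whole payload to a string) looks roughly like this in Python. UTF-8 here stands in for whatever the actual CCSID is, and the '}, {' delimiter is just an example:

```python
# Scan the raw bytes for a delimiter instead of casting the payload to a string.
payload = '[{"a": "A1"}, {"a": "A2"}]'.encode("utf-8")
delimiter = "}, {".encode("utf-8")   # encode the *search string*, not the payload

positions = []
start = 0
while True:
    hit = payload.find(delimiter, start)
    if hit == -1:
        break
    positions.append(hit)            # byte offset of a candidate record boundary
    start = hit + 1

# Caveat from the post above: a comma inside a quoted string value would
# false-match a naive delimiter, so the delimiter/edge cases need thought.
print(positions)
```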
sleepyjamie
PostPosted: Fri Nov 13, 2015 10:49 am    Post subject: Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

Vitor wrote:
Humor us. Try using the JSON domain.

I used the JSON domain originally, and that's when I found the crashing in the IIB TK; I then thought it might be too resource intensive, so I was going to switch to BLOB. If BLOB comes in as the full binary response then I might be out of luck.

It's strange to me that a primitive feature such as a reference to the HTTP response as an input stream is unavailable. I wish IIB input nodes, parsers and stream handlers were separate pieces of logic in the product. That would allow you to write custom input stream handling logic for the HTTP Request node. From a development architecture point of view this would be more flexible.

I'll keep trying. Thanks!
stoney
PostPosted: Fri Nov 13, 2015 11:22 am    Post subject: Reply with quote

Centurion

Joined: 03 Apr 2013
Posts: 140

How big a JSON message are we talking about here - KBs, MBs?

Quote:
I'm running a corei7 (8 cores) 16GB of RAM, I don't see why the toolkit would be hanging. I'll try increasing the memory but I think its an issue with the toolkit.


You might already be aware, but the toolkit (like all Java applications) is limited by the maximum Java heap size setting - I think this defaults to 1GB in all recent toolkit levels.
You can increase it by editing the -Xmx setting in <install root>/tools/eclipse.ini.
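For reference, the heap setting stoney mentions goes after the -vmargs line in eclipse.ini; something like this (the 512m/2048m values are only examples, not recommendations):

```ini
-vmargs
-Xms512m
-Xmx2048m
```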
sleepyjamie
PostPosted: Fri Nov 13, 2015 11:27 am    Post subject: Reply with quote

Centurion

Joined: 29 Apr 2015
Posts: 135

stoney wrote:
How big a JSON message are we talking about here - KBs, MBs?

You can increase it by editing the -Xmx setting in <install root>/tools/eclipse.ini.

The payload is in the MBs. I'm using the 32-bit version, so the toolkit memory is limited. I'll try and see if I can get the 64-bit version.

I think a better approach is to ask the REST API dev to implement a paging endpoint.

Cheers.

jamie
Copyright © MQSeries.net. All rights reserved.