kimbert
Posted: Tue Nov 27, 2012 5:09 am Post subject:
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
Yes, if this lookup table is reference data then it should be loaded rarely. But the OP is claiming that it is the lookups that are taking the time - not the loading of the data into the table. |

mqjeff
Posted: Tue Nov 27, 2012 5:18 am Post subject:
Grand Master
Joined: 25 Jun 2008 Posts: 17447
kimbert wrote: |
Why would querying a large lookup table stored in a SHARED ROW be faster than querying a large lookup table stored in the Environment tree? Or am I missing the point? |
The performance impact seems to come more from making repeated selects, and less from how the data being selected from is stored...
The general way to optimize this kind of scenario is to cache the results of the select in a faster-to-look-up structure. There's no indication here whether that can actually be done in this specific case. That is, nothing says that a given select is actually executed more than once, or that the select criteria can easily be assembled into a cache lookup key.
Other things to consider are:
- significantly reducing the size of the lookup table by precalculating portions of it based on known factors
- reorganizing the structure of the lookup to optimize it based on the search criteria (alter the normalization or alter the lookup key hierarchy)
- reorganizing the structure of the message processing such that more operations are performed in parallel
- offload the lookup processing to a real database using a stored procedure
- as kimbert says, move the lookup table into a Java ArrayList and use optimized searching on that
- spread the workload over more instances of the message flow and more EGs and more brokers
- independently develop and implement a quantum computer to perform non-deterministic processing of the transformation in O(1)...
- other less feasible ideas...
But I don't think that any of these ideas can be reasonably examined through this forum without ravilegolas providing a *lot* more information that is probably confidential to his current site.
But I will say that ravilegolas appears to be taking all of the right steps: determining that performance is a concern, where the concern lies, and what kinds of steps can be taken to address it.
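To make the first idea concrete, here's a very rough JavaCompute-style sketch of caching select results keyed by the select criteria. All of the names and the key format below are invented for illustration - whether a composite key like this even makes sense depends on what the actual selects look like:
Code: |
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LookupCache {
    // Survives across messages within the execution group's JVM.
    private static final Map<String, String> CACHE = new ConcurrentHashMap<String, String>();

    public static String lookup(String code1, String code2) {
        // Composite key assembled from the select criteria.
        String key = code1 + "|" + code2;
        // Only perform the expensive select on a cache miss.
        String value = CACHE.get(key);
        if (value == null) {
            value = expensiveSelect(code1, code2);
            CACHE.put(key, value);
        }
        return value;
    }

    private static String expensiveSelect(String code1, String code2) {
        // Placeholder for the real SELECT against the lookup table.
        return "expansion-of-" + code1 + "-" + code2;
    }
} |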

Esa
Posted: Tue Nov 27, 2012 6:14 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
mqjeff wrote: |
as kimbert says, move the lookup table into a Java ArrayList and use optimized searching on that |
If the flow is issuing several lookups with different sets of criteria, I would suggest creating several Java hashtables with different sets of keys, along these lines:
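Purely as an illustration - all of the column and map names here are invented - one map per lookup pattern, all built once from the same rows:
Code: |
import java.util.Hashtable;
import java.util.Map;

public class LookupTables {
    // One table per lookup pattern, all loaded from the same ~800 rows.
    static final Map<String, String> expansionByCode1 = new Hashtable<String, String>();
    static final Map<String, String> expansionByCode2 = new Hashtable<String, String>();
    static final Map<String, String[]> rowByBothCodes = new Hashtable<String, String[]>();

    static void addRow(String code1, String code2, String exp1, String exp2) {
        expansionByCode1.put(code1, exp1);
        expansionByCode2.put(code2, exp2);
        rowByBothCodes.put(code1 + "_" + code2, new String[] { exp1, exp2 });
    }
} |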

ravilegolas
Posted: Tue Nov 27, 2012 11:36 am Post subject:
Novice
Joined: 04 Sep 2012 Posts: 15
The lookup table that my flow refers to contains around 800 rows. It has 4 columns: 2 columns hold codes and 2 hold their expansions. For now, what I have proposed is to move the lookup table into the target environment, which is a data warehouse. Whenever they fetch values from the table, they can refer to the lookup table and populate the expansions of the codes accordingly. Is there any difference in using a SELECT query on an Environment variable as compared to a ROW variable? My flow was selecting values from the lookup table more than 20,000 times for each message, which made the flow very slow. BTW, thanks everyone for your suggestions, I will try to implement one of them in my code. |

lancelotlinc
Posted: Tue Nov 27, 2012 11:50 am Post subject:
Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
ravilegolas wrote: |
Is there any difference in using a SELECT query on an Environment variable as compared to a ROW variable? |
SHARED ROW is the least preferred method of sharing information between message flows. It is the most problematic, both in terms of implementation and performance.
Using a SINGLETON is the cheapest, fastest alternative: as kimbert suggested, use a Java ArrayList or Hashtable.
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Hashtable.html
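A minimal singleton sketch of that idea - the class, method, and column names below are placeholders, and the actual load logic is omitted:
Code: |
import java.util.Hashtable;

public final class CodeTable {
    // Loaded once per JVM (execution group) and then shared by all threads.
    private static final Hashtable<String, String> TABLE = new Hashtable<String, String>();
    private static volatile boolean loaded = false;

    private CodeTable() { }

    public static synchronized void loadIfNeeded() {
        if (!loaded) {
            // Populate TABLE from the lookup source (database, file, ...) here.
            loaded = true;
        }
    }

    public static String expansionFor(String code) {
        if (!loaded) {
            loadIfNeeded();
        }
        return TABLE.get(code);
    }
} |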
If you have budget, the best solution is solidDB or eXtremeScale.
_________________
http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |

mqjeff
Posted: Tue Nov 27, 2012 11:52 am Post subject:
Grand Master
Joined: 25 Jun 2008 Posts: 17447
lancelotlinc wrote: |
ravilegolas wrote: |
Is there any difference in using a SELECT query on an Environment variable as compared to a ROW variable? |
SHARED ROW is the least preferred method of sharing information between message flows. It is the most problematic, both in terms of implementation and performance.
Using a SINGLETON is the cheapest, fastest alternative: as kimbert suggested, use a Java ArrayList or Hashtable.
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Hashtable.html
If you have budget, the best solution is solidDB or eXtremeScale. |
You don't need budget for eXtremeScale if you have 8.0.0.1.... |

lancelotlinc
Posted: Tue Nov 27, 2012 12:20 pm Post subject:
Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
mqjeff wrote: |
lancelotlinc wrote: |
ravilegolas wrote: |
Is there any difference in using a SELECT query on an Environment variable as compared to a ROW variable? |
SHARED ROW is the least preferred method of sharing information between message flows. It is the most problematic, both in terms of implementation and performance.
Using a SINGLETON is the cheapest, fastest alternative: as kimbert suggested, use a Java ArrayList or Hashtable.
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Hashtable.html
If you have budget, the best solution is solidDB or eXtremeScale. |
You don't need budget for eXtremeScale if you have 8.0.0.1.... |
That seems true for the Development environment only; as I read the license agreement, a paid license is needed for production. Am I right? The Liberty license is only good for developer use.
Also, WMB 8.0.0.1 is not required if you interface with the eXtremeScale API through a JCN, correct?
_________________
http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |

kimbert
Posted: Tue Nov 27, 2012 12:46 pm Post subject:
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
Quote: |
My flow was selecting values from the lookup table more than 20,000 times for each message, which made the flow very slow. |
I'm not surprised that the flow was very slow. I agree that you should choose your caching technology carefully (eXtremeScale, solidDB, SHARED ROW, etc.). However, you should be even more careful about choosing your data structures and search algorithms. If you choose wrongly, you will end up with performance that is orders of magnitude below what is achievable - regardless of which type of cache you use.
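As a purely artificial illustration (names and numbers invented): with 20,000 lookups per message against an 800-row table, a linear scan does on the order of 16 million comparisons per message, while a hashed index does roughly 20,000 constant-time gets. Roughly:
Code: |
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupStyles {
    // Linear scan: ~800 comparisons for every one of the 20,000 lookups.
    static String linearLookup(List<String[]> rows, String code) {
        for (String[] row : rows) {
            if (row[0].equals(code)) {
                return row[1];
            }
        }
        return null;
    }

    // Hashed index: effectively one probe per lookup, regardless of table size.
    static String hashedLookup(Map<String, String> index, String code) {
        return index.get(code);
    }
} |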

mqjeff
Posted: Tue Nov 27, 2012 12:46 pm Post subject:
Grand Master
Joined: 25 Jun 2008 Posts: 17447
All license questions are only properly answered by an IBM Software Sales Representative.
It is my understanding that all use of the Global Cache feature of WMB is covered by the license for WMB itself, and does not require a separate license for eXtremeScale for any configuration or in any type of environment.
Yes, if you use a JavaCompute node, you can access eXtremeScale using its Java APIs from any version of Broker that supports the JCN - but then you *do* need a license for eXtremeScale, because that JCN has to connect to *something*. |

lancelotlinc
Posted: Tue Nov 27, 2012 1:11 pm Post subject:
Jedi Knight
Joined: 22 Mar 2010 Posts: 4941 Location: Bloomington, IL USA
mqjeff wrote: |
All license questions are only properly answered by an IBM Software Sales Representative.
It is my understanding that all use of the Global Cache feature of WMB is covered by the license for WMB itself, and does not require a separate license for eXtremeScale for any configuration or in any type of environment.
Yes, if you use a JavaCompute node, you can access eXtremeScale using its Java APIs from any version of Broker that supports the JCN - but then you *do* need a license for eXtremeScale, because that JCN has to connect to *something*. |
IBM is offering a trial license for pilot applications at no charge:
http://www.ibm.com/developerworks/downloads/ws/wsdg/
_________________
http://leanpub.com/IIB_Tips_and_Tricks
Save $20: Coupon Code: MQSERIES_READER |

mqsiuser
Posted: Tue Nov 27, 2012 10:13 pm Post subject:
Yatiri
Joined: 15 Apr 2008 Posts: 637 Location: Germany
lancelotlinc wrote: |
SHARED ROW is the least preferred method of sharing information between message flows. It is the most problematic, both in terms of implementation and performance. |
Is that really true? Where are the problems? It is so easy to use. And it is native ESQL (so it must be great). You just "DECLARE myRow SHARED ROW;" and then happily assign tree data to it, just like you do with InputRoot and OutputRoot (I hope those two are still fine!?)... What should the difference be between a SHARED ROW and InputRoot/OutputRoot (except for the lifetime)?
It is good to distinguish (and to point out the importance of) the following:
kimbert wrote: |
I agree that you should choose your caching technology carefully (eXtremeScale, solidDB, SHARED ROW, etc.). |
And on projects you can really (totally) wreck the broker's performance by not implementing this properly:
kimbert wrote: |
However, you should be even more careful about choosing your data structures and search algorithms. If you choose wrongly, you will end up with performance that is orders of magnitude below what is achievable. |
You can use a hash map/array/list. That is theoretically O(1) (though with some concerns/considerations about the hashing function).
You could implement a balanced tree over your 800 entries, based on the (2) keys. Then getting a single element is O(log n), which is about 10 steps in your case.
Or you just do (like I see on many projects) something dead easy:
1. Use a shared row:
Code: |
DECLARE cache SHARED ROW;
SET cache.Row[] = PASSTHRU('SELECT key1, key2, value1, value2 FROM myTable'); |
2. Better, use curly braces (with the key(s) inside) so that each row can be addressed directly by its keys:
Code: |
SET Environment.lookupData.Row[] = PASSTHRU('SELECT key1, key2, value1, value2 FROM myTable');
DECLARE cache SHARED ROW;
-- within a loop over Environment.lookupData, rearrange the data:
DECLARE rLookupDataCurRow REFERENCE TO Environment.lookupData.Row[1];
WHILE LASTMOVE(rLookupDataCurRow) DO
    SET cache.{rLookupDataCurRow.key1 || '_' || rLookupDataCurRow.key2} = rLookupDataCurRow;
    MOVE rLookupDataCurRow NEXTSIBLING;
END WHILE; |
Because then you can look up the values (by their keys) in a convenient way, as follows:
Code: |
SET Environment.value1_value2 = cache.{key1 || '_' || key2}; |
That is O(n) - a linear scan over the children - but the broker is really good at it.
Now there is a break-even point for the size of n:
If n is bigger than the break-even point, then (usually) O(1) or O(log n) is better.
If n is smaller than the break-even point, then (usually) O(n) is better.
Let's say the break-even point is 1000 ...
Let's say the complexity (of your data structure and search algorithm) also counts.
_________________
Just use REFERENCEs |

Esa
Posted: Wed Nov 28, 2012 1:10 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
ravilegolas wrote: |
The lookup table that my flow refers to contains around 800 rows.
...
My flow was selecting values from the lookup table more than 20,000 times for each message, which made the flow very slow |
Have you considered rewriting your code instead of caching the lookup table? Instead of hitting the lookup table once or twice for each iteration of a while loop you could perhaps replace the loop with a combined query. Something like this:
Code: |
SET outputRef.Objects[] = SELECT A.ID, A.NAME, B.VALUE1, C.VALUE2
                          FROM InputRef.Item[] AS A, LookupRef.Row[] AS B, LookupRef.Row[] AS C
                          WHERE A.CODE1 = B.KEY1 AND A.CODE2 = C.KEY2; |
I think this is the use case ESQL was originally developed for...
My example code is only a sketch: you will have to adjust the element names and list selectors to match your actual trees, for example. I wrote it from scratch just to give you the idea.
I guess if you do it like this, it won't make an awful lot of difference whether you query the lookup table from a database or from an Environment or ROW tree, or whether the ROW is cached or not. I mean compared to the case where you hit the lookup table 20,000 times.
If the problem is that the SELECT query puts the elements in the wrong order, you can always make an ESQL wrapper for some java code that sorts the elements with a Collection afterwards.
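Untested and purely illustrative (ignoring the broker API plumbing, and the Row class here is made up), but the Java side could be as simple as:
Code: |
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RowSorter {
    // Plain holder for one row copied out of the message tree.
    public static class Row {
        public String key;
        // ... other fields copied from the tree
    }

    // Sort the extracted rows by key before writing them back to the output tree.
    public static void sortByKey(List<Row> rows) {
        Collections.sort(rows, new Comparator<Row>() {
            public int compare(Row a, Row b) {
                return a.key.compareTo(b.key);
            }
        });
    }
} |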

mqsiuser
Posted: Wed Nov 28, 2012 3:31 am Post subject:
Yatiri
Joined: 15 Apr 2008 Posts: 637 Location: Germany
Esa wrote: |
make an ESQL wrapper for some java code that sorts the elements with a Collection afterwards. |
Do you have performance measures for that? I am interested.
_________________
Just use REFERENCEs |

Esa
Posted: Wed Nov 28, 2012 4:52 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
mqsiuser wrote: |
Esa wrote: |
make an ESQL wrapper for some java code that sorts the elements with a Collection afterwards. |
Do you have performance measures for that ? I am interested. |
Unfortunately not. I have never needed to implement that. |