ASG
IBM
Zystems
Cressida
Icon
Netflexity
 
  MQSeries.net
Search  Search       Tech Exchange      Education      Certifications      Library      Info Center      SupportPacs      LinkedIn  Search  Search                                                                   FAQ  FAQ   Usergroups  Usergroups
 
Register  ::  Log in Log in to check your private messages
 
RSS Feed - WebSphere MQ Support RSS Feed - Message Broker Support

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » SELECT FROM WHERE

Post new topic  Reply to topic Goto page 1, 2  Next
 SELECT FROM WHERE « View previous topic :: View next topic » 
Author Message
longng
PostPosted: Thu Mar 14, 2013 7:59 pm    Post subject: SELECT FROM WHERE Reply with quote

Apprentice

Joined: 22 Feb 2013
Posts: 42

While we are doing analysis on performance implications of SELECT in our particular scenarios, we've come across a strange phenomenon!

It starts out innocent enough with

We have a set of input data that we would want to copy over to a new location (Environment Variables) without any duplications

Code:

<?xml version="1.0" encoding="UTF-8"?>
<InIds>
   <id>AA</id>
   <id>BB</id>
   <id>BB</id>
   <id>CC</id>
   <id>CC</id>
   <id>AA</id>
   <id>BB</id>
</InIds>

For simplicity, the above data structure is the input message set that's referenced by the InIDRef within the FOR loop in the code below:
Code:

      CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3.id;

      DECLARE EnvOutIDRef REFERENCE TO Environment.Variables.Layer1.Layer2.Layer3.id;

      FOR InIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF (FindID(EnvInIDRef)) THEN
            -- It does not exist
            SET EnvOutIDRef = CAST (InIDRef AS CHARACTER);
            CREATE NEXTSIBLING OF EnvOutIDRef AS EnvOutIDRef REPEAT;
         END IF;
      END FOR;

The FindID function used above is shown below:
Code:

   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN

      SET rowId = THE (SELECT A.id
      FROM Environment.Variables.Layer1.Layer2.Layer3[] AS A
      WHERE A.id = Id);
      IF rowId.id IS NULL THEN
         SET Found = FALSE;
         END IF;
      ELSE
         SET Found = TRUE;
      END IF;
      RETURN Found;
   END;


To our surprise, we still see duplications in the output (Environment.Variables.Layer1.Layer2.Layer3...). Further analysis confirms that the output does not contain duplicated element containing 'AA' but duplicated/triplicated elements containing 'CC' and 'BB', respectively'!' Playing around a bit more we find out that there will not be a duplicated element in the output if its value matched the first entry! It seems to indicate that the WHERE clause would only be effective for the first element?

What takes, what gives?
Back to top
View user's profile Send private message
Esa
PostPosted: Thu Mar 14, 2013 11:40 pm    Post subject: Re: SELECT FROM WHERE Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

Testing if a field is NULL is unreliable, take a look at EXISTS function. The argument it takes must be an array. Remember that an array with only one element is still an array.
Back to top
View user's profile Send private message
Esa
PostPosted: Fri Mar 15, 2013 12:17 am    Post subject: Re: SELECT FROM WHERE Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

longng wrote:

Code:

   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN

      SET rowId = THE (SELECT A.id
      FROM Environment.Variables.Layer1.Layer2.Layer3[] AS A
      WHERE A.id = Id);
      IF rowId.id IS NULL THEN
         SET Found = FALSE;
         END IF;
      ELSE
         SET Found = TRUE;
      END IF;
      RETURN Found;
   END;



In this function you are assigning a value to a variable called Found that you have not declared. If the code compiles in spite of that, the reason must be that you have declared it insome enclosing scope so that it is visible to your function.

That kind of carelessness in scoping is the root of many strange phenomenon!

But probably not in this case. Just take a look at the nice example of EXISTS in the InfoCenter an write a more safe test, something like this:

Code:

   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN

      RETURN EXISTS(SELECT A.id
      FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
      WHERE A.id = Id);
       END;


Yes, the cause of the problem is that you are selecting from ...Layer3[] when you should select from Layer3.id[]...

The solution matches nicely with your topic title "SELECT FROM WHERE"

I hope this helps.
Back to top
View user's profile Send private message
adubya
PostPosted: Fri Mar 15, 2013 12:55 am    Post subject: Re: SELECT FROM WHERE Reply with quote

Partisan

Joined: 25 Aug 2011
Posts: 377
Location: GU12, UK

Esa wrote:


Yes, the cause of the problem is that you are selecting from ...Layer3[] when you should select from Layer3.id[]...


If the OP selects from Layer3.id[] then there must be child "id" elements underneath the Layer3.id structure for the subsequent code to work. Not sure if that's what the OP's datastructure looks like.

The OP's FindID function as posted above wouldn't compile anyway due to the termination of the IF statement before the ELSE is specified.
Back to top
View user's profile Send private message Send e-mail
Esa
PostPosted: Fri Mar 15, 2013 1:36 am    Post subject: Re: SELECT FROM WHERE Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

adubya wrote:
Esa wrote:


Yes, the cause of the problem is that you are selecting from ...Layer3[] when you should select from Layer3.id[]...


If the OP selects from Layer3.id[] then there must be child "id" elements underneath the Layer3.id structure for the subsequent code to work. Not sure if that's what the OP's datastructure looks like.

The OP's FindID function as posted above wouldn't compile anyway due to the termination of the IF statement before the ELSE is specified.


You are right. Talk about carelessness...

It seems OP's code is hand written in the post, not copy/pasted from Toolkit. Maybe there are even more typos? There is something that may make the externally defined variable Found to get stuck with FALSE....

Corrected suggestion:
Code:

   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN

      RETURN EXISTS(SELECT A.id
      FROM Environment.Variables.Layer1.Layer2.Layer3[] AS A
      WHERE A.id = Id);
    END;
Back to top
View user's profile Send private message
kimbert
PostPosted: Fri Mar 15, 2013 2:45 am    Post subject: Reply with quote

Jedi Council

Joined: 29 Jul 2003
Posts: 5542
Location: Southampton

Is the FindID() function likely to be a performance bottleneck?

If it is, then you would be well advised to consider the following:
- do duplicates typically arrive together ( or close together ) in the input? If so, then you could significantly reduce the cost of your FindID() by checking for the last ID ( or last two or three IDs ) before doing the full search.
- If the number of IDs can get large then you could maintain the list in the Environment as a sorted list, and use a binary search in your FindID() function.
Back to top
View user's profile Send private message
longng
PostPosted: Fri Mar 15, 2013 7:45 am    Post subject: Reply with quote

Apprentice

Joined: 22 Feb 2013
Posts: 42

Thank you! You all are so sharp as to be able to spot all those typos, which I inadvertently introduced while 'translating' from the actual existing code as to remove propriety information and/or superfluous details!

Here are some additional information:

If the code is run in debugging mode, I can see the list 'A' being gradually populated with entries and the 'Found' flag is mostly FALSE except for the case of the second 'AA' entry where Found is TRUE.
In the trace, annotations seem to imply the test (WHERE) would only be applied to the first entry as though the logic doesn't know there are more entries after the first one!

@Kimbert: thanks for the suggestions, I intend to share more details (in other thread) about performance (as perceived by me) in other thread (http://www.mqseries.net/phpBB2/viewtopic.php?t=63726) later.

For completeness, I have corrected the typos and list them here again just in case anyone would like to confirm/refute my observation. It should only take about 10 minutes to construct a simple message flow (MQInput->Compute->MQOutput) with the details below:
Input data
Code:

<?xml version="1.0" encoding="UTF-8"?>
<InIds>
   <id>AA</id>
   <id>BB</id>
   <id>BB</id>
   <id>CC</id>
   <id>CC</id>
   <id>AA</id>
   <id>BB</id>
</InIds>


Main Function

Code:

CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
   CALL CopyEntireMessage();

     CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3.id;

      DECLARE EnvOutIDRef REFERENCE TO Environment.Variables.Layer1.Layer2.Layer3.id;

      FOR InIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF (FindID(EnvInIDRef)) THEN
            -- It does not exist
            SET EnvOutIDRef = CAST (InIDRef AS CHARACTER);
            CREATE NEXTSIBLING OF EnvOutIDRef AS EnvOutIDRef REPEAT;
         END IF;
      END FOR;
      RETURN TRUE;
END;

FindID function
Code:

The FindID function
   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN
      DECLARE found BOOLEAN FALSE;

      SET rowId = THE (SELECT A.id
      FROM Environment.Variables.Layer1.Layer2.Layer3[] AS A
      WHERE A.id = Id);
      IF rowId.id IS NULL THEN
         SET Found = FALSE;
      ELSE
         SET Found = TRUE;
      END IF;
      RETURN Found;
   END;
Back to top
View user's profile Send private message
mgk
PostPosted: Fri Mar 15, 2013 1:13 pm    Post subject: Reply with quote

Padawan

Joined: 31 Jul 2003
Posts: 1642

The new code as posted will still not deploy as a couple of variables are not defined and the sense of the IF is wrong. However, the reason it will fail to do what you want (at least in the example posted here) is that there is only ever one element called "layer3" so the WHERE clause will only ever test the first "id" element, which is why it will only match when the first element is repeated later on. If you change the code to something like this you should have better luck:

Code:
SET rowId = THE (SELECT A
FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
WHERE A = Id);


Note also that your FOR loop at present will always create an "extra" element with a "NULL" value. This may or may not be a problem for you...

Kind regards,
_________________
MGK
The postings I make on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.
Back to top
View user's profile Send private message
longng
PostPosted: Fri Mar 15, 2013 1:52 pm    Post subject: Reply with quote

Apprentice

Joined: 22 Feb 2013
Posts: 42

mgk wrote:
The new code as posted will still not deploy as a couple of variables are not defined and the sense of the IF is wrong. However, the reason it will fail to do what you want (at least in the example posted here) is that there is only ever one element called "layer3" so the WHERE clause will only ever test the first "id" element, which is why it will only match when the first element is repeated later on. If you change the code to something like this you should have better luck:

Code:
SET rowId = THE (SELECT A
FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
WHERE A = Id);


Note also that your FOR loop at present will always create an "extra" element with a "NULL" value. This may or may not be a problem for you...

Kind regards,



That's it!!!!! Thank you very much mgk!


and yes, I am guilty as charged of other typos while trying to 'translate' the actual code to the one I post here. For the extraneous element you mention, the actual code does have logic to clean it up after the FOR loop terminates.


Here's the whole code with correction suggested by mgk:

Code:
   CREATE PROCEDURE BuildIDStructure ( )
   BEGIN
      CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3.id;

      DECLARE EnvOutIDRef REFERENCE TO Environment.Variables.Layer1.Layer2.Layer3.id;

      FOR EnvInIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF (EnvInIDRef) IS FALSE) THEN
            -- It does not exist
            SET EnvOutIDRef = CAST (EnvInIDRef AS CHARACTER);
            CREATE NEXTSIBLING OF EnvOutIDRef AS EnvOutIDRef REPEAT;
         END IF;
      END FOR;
[b]      -- Remove the last unused field
      IF LASTMOVE(EnvOutIDRef) THEN
         DELETE FIELD EnvOutIDRef;
      END IF;
[/b]   END;
   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN
      DECLARE rowId ROW NULL;
      DECLARE Found BOOLEAN FALSE;
      SET rowId = THE (SELECT A
      FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
      WHERE A = Id);

      IF rowId IS NULL THEN
         SET Found = FALSE;
      ELSE
         SET Found = TRUE;
      END IF;
      RETURN Found;
   END;
Back to top
View user's profile Send private message
Esa
PostPosted: Fri Mar 15, 2013 11:38 pm    Post subject: Reply with quote

Grand Master

Joined: 22 May 2008
Posts: 1387
Location: Finland

More typos?

longng wrote:


Code:

         IF (EnvInIDRef) IS FALSE) THEN

Back to top
View user's profile Send private message
longng
PostPosted: Sat Mar 16, 2013 7:37 am    Post subject: Reply with quote

Apprentice

Joined: 22 Feb 2013
Posts: 42

Esa wrote:
More typos?

longng wrote:


Code:

         IF (EnvInIDRef) IS FALSE) THEN



Yes Esa! It's mine

Hopefully, there's no more
Code:
   CREATE PROCEDURE BuildIDStructure ( )
   BEGIN
      CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3.id;

      DECLARE EnvOutIDRef REFERENCE TO Environment.Variables.Layer1.Layer2.Layer3.id;

      FOR EnvInIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF (FindID(EnvInIDRef) IS FALSE) THEN
            -- It does not exist
            SET EnvOutIDRef = CAST (EnvInIDRef AS CHARACTER);
            CREATE NEXTSIBLING OF EnvOutIDRef AS EnvOutIDRef REPEAT;
         END IF;
      END FOR;
[b]      -- Remove the last unused field
      IF LASTMOVE(EnvOutIDRef) THEN
         DELETE FIELD EnvOutIDRef;
      END IF;
[/b]   END;
   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN
      DECLARE rowId ROW NULL;
      DECLARE Found BOOLEAN FALSE;
      SET rowId = THE (SELECT A
      FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
      WHERE A = Id);

      IF rowId IS NULL THEN
         SET Found = FALSE;
      ELSE
         SET Found = TRUE;
      END IF;
      RETURN Found;
   END;
Back to top
View user's profile Send private message
adubya
PostPosted: Sat Mar 16, 2013 7:53 am    Post subject: Reply with quote

Partisan

Joined: 25 Aug 2011
Posts: 377
Location: GU12, UK

I'd probably go with

Code:
   CREATE PROCEDURE BuildIDStructure ( )
   BEGIN
    
     DECLARE EnvOutIDRef REFERENCE TO Environment.Variables;
           
      CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3 AS EnvOutIDRef;

      FOR EnvInIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF  NOT FindID(EnvInIDRef) THEN
            -- It does not exist
            CREATE LASTCHILD OF EnvOutIDRef NAME 'id' VALUE  CAST ( EnvInIDRef AS CHARACTER );
         END IF;
      END FOR;
   END;
   
   
   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN
 
      RETURN EXISTS ( SELECT A
                     FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
                    WHERE A = Id );

   END;
Back to top
View user's profile Send private message Send e-mail
longng
PostPosted: Sat Mar 16, 2013 8:19 am    Post subject: Reply with quote

Apprentice

Joined: 22 Feb 2013
Posts: 42

adubya wrote:
I'd probably go with

Code:
   CREATE PROCEDURE BuildIDStructure ( )
   BEGIN
    
     DECLARE EnvOutIDRef REFERENCE TO Environment.Variables;
           
      CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3 AS EnvOutIDRef;

      FOR EnvInIDRef AS InputRoot.XMLNSC.InIds.id[] DO
         IF  NOT FindID(EnvInIDRef) THEN
            -- It does not exist
            CREATE LASTCHILD OF EnvOutIDRef NAME 'id' VALUE  CAST ( EnvInIDRef AS CHARACTER );
         END IF;
      END FOR;
   END;
   
   
   CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
   BEGIN
 
      RETURN EXISTS ( SELECT A
                     FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
                    WHERE A = Id );

   END;


@adubya: That's definitely much better
Back to top
View user's profile Send private message
rekarm01
PostPosted: Sat Mar 16, 2013 1:40 pm    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 1415

adubya wrote:
I'd probably go with

Code:
...
CREATE FUNCTION FindID (IN Id CHARACTER) RETURNS BOOLEAN
BEGIN
    RETURN EXISTS(SELECT A
                    FROM Environment.Variables.Layer1.Layer2.Layer3.id[] AS A
                    WHERE A = Id);
END;

Although when it's boiled down to one statement, it doesn't really need to be a separate function anymore. Moving the SELECT call into the procedure also allows for the direct use of EnvOutIDRef:

Code:
CREATE PROCEDURE BuildIDStructure()
BEGIN
    DECLARE EnvOutIDRef REFERENCE TO Environment.Variables;
    CREATE FIELD Environment.Variables.Layer1.Layer2.Layer3 AS EnvOutIDRef;

    FOR EnvInIDRef AS InputRoot.XMLNSC.InIds.id[] DO
        IF NOT EXISTS(SELECT A FROM EnvOutIDRef.id[] AS A WHERE A = EnvInIDRef) THEN
        -- It does not exist
            CREATE LASTCHILD OF EnvOutIDRef NAME 'id' VALUE CAST(EnvInIDRef AS CHARACTER);
        END IF;
    END FOR;
END;

[Edit: Fixed the 'WHERE A = Id' clause, per adubya's hint ... however, the code remains untested.]


Last edited by rekarm01 on Sun Mar 17, 2013 3:06 am; edited 1 time in total
Back to top
View user's profile Send private message
adubya
PostPosted: Sun Mar 17, 2013 12:36 am    Post subject: Reply with quote

Partisan

Joined: 25 Aug 2011
Posts: 377
Location: GU12, UK

Nearly there, you have a copy/paste error though
Back to top
View user's profile Send private message Send e-mail
Display posts from previous:   
Post new topic  Reply to topic Goto page 1, 2  Next Page 1 of 2

MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » SELECT FROM WHERE
Jump to:  



You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Protected by Anti-Spam ACP
 
 


Theme by Dustin Baccetti
Powered by phpBB © 2001, 2002 phpBB Group

Copyright © MQSeries.net. All rights reserved.