arunkumo |
Posted: Tue Jan 21, 2020 5:37 pm Post subject: Unique Records |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
Hi,
I want to count the unique records received from a file. The file contains a number of records, some of them repeated, like below:
Header file 21012010
Bose,sbi,21022010
sati,icic,21012020
sam,axis,21012010
pandey,ocb,20012020
Bose,sbi,21022010
sati,icic,21012020
Trailer file 6
If I use CARDINALITY, it displays the count as 6, but I want it to display 4. No need to bother about the header and trailer.
I want to avoid counting duplicate records. Is there any way to do this in ESQL? |
|
abhi_thri |
Posted: Wed Jan 22, 2020 2:01 am Post subject: |
|
|
 Knight
Joined: 17 Jul 2017 Posts: 516 Location: UK
|
hi...One way to do it is to iterate over the records and add each one to a new array, but only after checking whether the element already exists in the new array; by the end of the processing the new array will contain only the unique records.
Another way, in case any of the three columns is supposed to be unique (eg: the name), is to use the approach suggested by mqjeff at the thread below,
http://www.mqseries.net/phpBB2/viewtopic.php?p=404056&sid=a8c6da254bff879e37f49a32e8681d5e
Quote: |
Or you could create a tree in the XMLNSC domain in Environment or Local Environment, and use the x/xtypes/xtype/xtypeName as the top level element of the tree... something like
set NEXT CHILD OF OutputLocalEnvironment.XMLNSC.DistinctNames.{InputRoot.XMLNSC.x.xtypes.xtype.xtypeName} = curRef
for some reference curRef looping over the input tree. |
|
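For illustration only, the first suggestion above (copy each record into a new array only if it is not already there) can be sketched in plain Java outside the broker; the class name and the sample record strings (taken from the example file in the first post) are placeholders, not IIB API:

```java
import java.util.ArrayList;
import java.util.List;

public class UniqueRecords {
    // Copy each record into a new list only if it is not already
    // present; the size of the new list is the unique count.
    static int countUnique(List<String> records) {
        List<String> unique = new ArrayList<>();
        for (String rec : records) {
            if (!unique.contains(rec)) { // linear scan of the new list
                unique.add(rec);
            }
        }
        return unique.size();
    }

    public static void main(String[] args) {
        List<String> records = List.of(
            "Bose,sbi,21022010",
            "sati,icic,21012020",
            "sam,axis,21012010",
            "pandey,ocb,20012020",
            "Bose,sbi,21022010",
            "sati,icic,21012020");
        System.out.println(countUnique(records)); // prints 4
    }
}
```

Note that `contains` rescans the new list for every record, so this is O(n²) in the number of records; it is fine for small files but slows down for very large ones.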
|
arunkumo |
Posted: Wed Jan 22, 2020 1:50 pm Post subject: |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
Thank you Abhi. I have tried this option but it didn't work out. |
abhi_thri |
Posted: Thu Jan 23, 2020 1:39 am Post subject: |
|
|
 Knight
Joined: 17 Jul 2017 Posts: 516 Location: UK
|
arunkumo wrote: |
Thank you Abhi. I have tried this option but it didn't work out. |
hi...can you please be more specific: did it work for a few records, or not at all?
If you are dealing with 1000s of records in the same file with numerous duplicate entries, I would try getting the source application to deal with the duplication. What is the guarantee that they don't send duplicate entries across two different files? |
|
nick12 |
Posted: Fri Jan 24, 2020 6:43 pm Post subject: |
|
|
Novice
Joined: 14 Jan 2020 Posts: 18
|
I would recommend trying a Java or .NET Compute node and using a data structure purpose-built for this kind of thing. |
|
arunkumo |
Posted: Sat Jan 25, 2020 8:38 am Post subject: |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
Thank you Abhi and Nick.
It didn't work with the ESQL Compute node. I have achieved this using an XSL Transformation node for now, because if I wanted to use Java, I supposed it would need comparisons and two looping techniques; with 10,000 records I assumed processing might be slower. So I went with the XSL node, and just 3 lines of code sorted out the duplicate records.
I have written the XSL code in a .xsl file and given the location of the .xsl file in the XSL node; before that I set the encoding and character set id in a Compute node. |
|
timber |
Posted: Mon Jan 27, 2020 8:24 am Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
Quote: |
It didn't work with the ESQL Compute node |
The technique suggested by abhi_thri/mqjeff does work. I think you mean that you could not work out how to do it.
Quote: |
if I wanted to use Java, I supposed it would need comparisons and two looping techniques; with 10,000 records I assumed processing might be slower. |
Who told you to use 2 looping techniques? Or any looping techniques at all? A HashMap or Dictionary would do the job quickly and simply without any looping at all.
Quote: |
I went with the XSL node, and just 3 lines of code sorted out the duplicate records |
Glad you got it working. Just be aware that you are parsing your input twice. The XSL engine parses it once, then IIB's XMLNSC parser parses it a second time. That is probably not optimal. |
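A minimal sketch of the HashMap/Dictionary idea above, as plain Java (a `LinkedHashSet` is used here; the class name and sample data are illustrative, not part of the IIB API):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctCount {
    // A Set rejects duplicates on insertion, so no explicit
    // comparison loop is needed; size() is the unique count.
    static int countDistinct(List<String> records) {
        Set<String> seen = new LinkedHashSet<>(records);
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(List.of(
            "Bose,sbi,21022010", "sati,icic,21012020",
            "sam,axis,21012010", "pandey,ocb,20012020",
            "Bose,sbi,21022010", "sati,icic,21012020"))); // prints 4
    }
}
```

Each insertion is a hash lookup rather than a scan of everything seen so far, so this stays fast even for 10,000 records.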
|
arunkumo |
Posted: Mon Jan 27, 2020 11:52 am Post subject: |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
timber wrote: |
Quote: |
It didn't work with the ESQL Compute node |
The technique suggested by abhi_thri/mqjeff does work. I think you mean that you could not work out how to do it. |
Can you help me with how to achieve this in ESQL? If that works, I will remove my XSL node. |
|
arunkumo |
Posted: Mon Jan 27, 2020 8:21 pm Post subject: |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
DECLARE tempGroup, temprow ROW;
DECLARE index INTEGER 1;
SET tempGroup.EmployeeInfo[1] = InputRoot.XMLNSC.EmployeelList.EmployeeInfo[1];
FOR loopField AS InputRoot.XMLNSC.EmployeelList.EmployeeInfo[] DO
    IF NOT EXISTS (SELECT Field FROM temprow.EmployeeInfo[] WHERE tempGroup.EmployeeInfo = loopField) THEN
        SET tempGroup.EmployeeInfo[index] = loopField;
        SET index = index + 1;
    END IF;
END FOR;
This code gives the same count as CARDINALITY, i.e. without removing the duplicates. |
|
timber |
Posted: Tue Jan 28, 2020 1:43 am Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
|
rekarm01 |
Posted: Tue Jan 28, 2020 2:38 am Post subject: Re: Unique Records |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
Please use [code] tags when posting code, to preserve spacing; this makes well-formatted code easier to read.
arunkumo wrote: |
Code: |
IF NOT EXISTS (SELECT Field FROM temprow.EmployeeInfo[] WHERE tempGroup.EmployeeInfo = loopField) THEN ... |
|
One problem here is confusing "temprow" with "tempGroup". Another problem is mixing up where the distinct key value is; is it in "EmployeeInfo", or "EmployeeInfo"."Field"? A third problem might be the spelling of "EmployeelList".
A SELECT function call in a FOR loop may not perform well for a large number of records. A simpler loop might exploit ESQL field references to copy element values to element names, and then rely on the SET statement to eliminate any duplicates, by overwriting existing elements, rather than creating new ones:
Code: |
DECLARE tempRow ROW;
DECLARE recCount INTEGER;
FOR elemRef AS InputRoot.XMLNSC.EmployeeList.EmployeeInfo[] DO
    SET tempRow.{elemRef} TYPE = Name;
END FOR;
SET recCount = CARDINALITY(tempRow.*[]); |
This example is not necessarily tested, optimal, or even useful, as-is, but may serve as a suitable starting point, for counting unique records. |
|
arunkumo |
Posted: Wed Jan 29, 2020 7:43 am Post subject: Re: Unique Records |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
Code: |
DECLARE tempRow ROW;
DECLARE recCount INTEGER;
FOR elemRef AS InputRoot.XMLNSC.EmployeeList.EmployeeInfo[] DO
    SET tempRow.{elemRef} TYPE = Name;
END FOR;
SET recCount = CARDINALITY(tempRow.*[]); |
I'm sorry, I didn't catch that. What are Name and TYPE here? |
|
gbaddeley |
Posted: Wed Jan 29, 2020 3:56 pm Post subject: |
|
|
 Jedi Knight
Joined: 25 Mar 2003 Posts: 2538 Location: Melbourne, Australia
|
How about sorting? The dups will then be together and can be skipped in a single iteration. _________________ Glenn |
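A rough sketch of this sort-then-scan approach in plain Java (the class name and sample data are illustrative only):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedDedup {
    // After sorting, duplicate records sit next to each other, so a
    // single pass comparing each record with its predecessor is
    // enough to count the unique ones.
    static int countUniqueSorted(List<String> records) {
        if (records.isEmpty()) return 0;
        List<String> sorted = new ArrayList<>(records);
        Collections.sort(sorted);
        int count = 1; // the first record is always unique
        for (int i = 1; i < sorted.size(); i++) {
            if (!sorted.get(i).equals(sorted.get(i - 1))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUniqueSorted(List.of(
            "Bose,sbi,21022010", "sati,icic,21012020",
            "sam,axis,21012010", "pandey,ocb,20012020",
            "Bose,sbi,21022010", "sati,icic,21012020"))); // prints 4
    }
}
```

The sort costs O(n log n), but the duplicate detection itself then needs only one linear pass and no extra lookup structure.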
|
rekarm01 |
Posted: Wed Jan 29, 2020 6:41 pm Post subject: Re: Unique Records |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 1415
|
arunkumo wrote: |
I'm sorry, I didn't catch that. What are Name and TYPE here? |
The Knowledge Center describes Name and TYPE in more detail.
In the given example, the elements in tempRow need unique names, but they don't need values, because CARDINALITY() doesn't count values; any of these other SET statements below should work equally well:
Code: |
FOR elemRef AS InputRoot.XMLNSC.EmployeeList.EmployeeInfo[] DO
    -- SET tempRow.{elemRef} TYPE = Name;
    -- SET tempRow.{elemRef} VALUE = NULL;
    SET tempRow.{elemRef} = '42';
END FOR; |
Again, this is only an (untested) example; feel free to modify it, as needed. |
|
arunkumo |
Posted: Fri Jan 31, 2020 4:44 pm Post subject: Re: Unique Records |
|
|
 Novice
Joined: 21 Sep 2014 Posts: 20
|
rekarm01 wrote: |
Again, this is only an (untested) example; feel free to modify it, as needed. |
Thank you.. it didn't work out, but thank you so much for your valuable thoughts. |
|