MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

SORT to remove specific duplicate record

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
Magesh_J
Intermediate


Joined: 21 Jun 2014
Posts: 259
Topics: 54

PostPosted: Wed Jul 29, 2020 3:07 pm    Post subject: SORT to remove specific duplicate record Reply with quote

Is this possible in sort.

case 1
Code:

AAABBBCCCDDDEEEFF        TEEE
AAABBBCCCDDDEEEFF        TAAA
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC

Case 2

Code:

AAABBBCCCDDDEEEFF        TAAA
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC
AAABBBCCCDDDEEEFF        KAAA



If the duplicate key exists(1,17) and if one of the duplicates has TEEE in pos(26,4), then retain only the TEEE record.
if the duplicate key exists(1,17) and both the records don't have TEEE in pos(26,4), then the sort should return RC16.
if no duplicate key then just copy the same value.

Output for case 1
Code:

AAABBBCCCDDDEEEFF        TEEE
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC

output for case 2 ==> should return 16, as the duplicate record dont have TEEE Record.

Thanks

Regards,
Magesh
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Wed Jul 29, 2020 4:50 pm    Post subject: Reply with quote

Magesh_J,

Couple of clarifications.

1. If I understand your question correctly, you want an RC=16 when any record does NOT have a 'TEEE' record associated to that key is that it? if so then it is quite easy.


2. How do you plan to handle Unique records? What if you have a key with value other than 'TEEE'?

Code:

EEEEEEEEEEEEEEEEE        WXYZ


3. How do you plan to handle more than 2 duplicates? something like this

Code:

FFFFFFFFFFFFFFFFF        DUP1
FFFFFFFFFFFFFFFFF        DUP2
FFFFFFFFFFFFFFFFF        DUP3
FFFFFFFFFFFFFFFFF        TEEE


4. How do you plan to handle more than 1 duplicates for TEEE? something like this
Code:

GGGGGGGGGGGGGGGGG        DUP1
GGGGGGGGGGGGGGGGG        DUP2
GGGGGGGGGGGGGGGGG        TEEE
GGGGGGGGGGGGGGGGG        TEEE

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Magesh_J
Intermediate


Joined: 21 Jun 2014
Posts: 259
Topics: 54

PostPosted: Wed Jul 29, 2020 5:08 pm    Post subject: Reply with quote

Hi Kolusu,

Thanks for looking.

If we have a unique record alone, then it is valid, it should be copied and RC should be 0, even if it doesn't have TEEE.

if we have a duplicate record, it may be 1 or 2 or many, then only the record with TEEE should be retained, rest should be removed.

if we have duplicates 2 or more and TEEE is missing in all the duplicates, then it should fail.

We will not have two TEEE for the same account.

The files were getting from two sources and it gets merged, the problem is we don't have access to that process, we have access after it gets merged, so most likely we will not get more than 2 records as duplicates, but I don't know, it may happen.

Thanks
Magesh
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Wed Jul 29, 2020 6:01 pm    Post subject: Reply with quote

Magesh,

In the sort step we will tag the 'TEEE' records with an indicator of 0 and other records with 1. Sorting on the KEY and the indicator would make TEEE record as the first record.

On OUTREC using when=group, you push the TEEE record and a sequence number onto the duplicate records.

Using OUTFIL we write 2 files.

KEYS will have unique records per key and the first record whatever it has

CHKRC will ONLY contain records that do NOT have a single TEEE but are duplicates on the key.

Since sort product cannot set an RC=16, we use IDCAMS to set the RC=16 if there is even 1 duplicate record that does not have a TEEE record.


Code:

//STEP0100 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD DISP=SHR,DSN=your input file
/*
//KEYS     DD DSN=Your output file,
//            DISP=(NEW,CATLG,DELETE)
//            SPACE=(CYL,(X,Y),RLSE) 
/*   
//CHKRC    DD DSN=Your CHKRC file,
//            DISP=(NEW,CATLG,DELETE)
//            SPACE=(CYL,(X,Y),RLSE) 
/*   
//SYSIN    DD *                                     
** TAG 'TEEE' RECORDS TO BE SORTED AS FIRST RECORD PER KEY
                                                   
  INREC OVERLAY=(81:26,4,CHANGE=(1,C'TEEE',C'0'),   
                        NOMATCH=(C'1'))             
                                                   
                                                   
  SORT FIELDS=(01,17,CH,A,         # KEY           
               81,01,CH,A)         # IND           
                                                   
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,17),       
                 PUSH=(81:26,4,85:SEQ=3))           
                                                   
  OUTFIL FNAMES=KEYS,                               
  INCLUDE=(85,3,ZD,EQ,1),                           
  BUILD=(01,80)                                     
                                                   
  OUTFIL FNAMES=CHKRC,                             
  INCLUDE=(85,3,ZD,GT,1,AND,81,4,CH,NE,C'TEEE'),   
  BUILD=(01,80)                                     
/*                                                 
//**********************************************************
//*  SET RC=16 IF CHKRC DATASET HAS ATLEAST 1 RECORD       *
//**********************************************************
//STEP0200 EXEC PGM=IDCAMS                                   
//SYSPRINT DD SYSOUT=*                                       
//INP      DD DISP=SHR,DSN=Your CHKRC File                   
//SYSIN    DD *                                             
  PRINT INFILE(INP) CHARACTER COUNT(1)                       
  IF LASTCC = 0 THEN DO                                     
     SET MAXCC=16                                           
     END                                                     
  ELSE IF LASTCC = 4 THEN DO                                 
       SET MAXCC=0                                           
  END                                                       
/*                                                           


You can also delete the KEYS file if CHKRC file has 1 record where we are setting the RC=16
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Magesh_J
Intermediate


Joined: 21 Jun 2014
Posts: 259
Topics: 54

PostPosted: Wed Jul 29, 2020 6:37 pm    Post subject: Reply with quote

I missed this trick, make 'TEEE' to zero, and sort the record to make the TEEE record first in order.

Kolusu wrote:

Code:
  INREC OVERLAY=(81:26,4,CHANGE=(1,C'TEEE',C'0'),   
                        NOMATCH=(C'1'))



Thank you very much, awesome idea !!!

Regards,
Magesh
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group