SORT to remove specific duplicate record
Select messages from
# through # FAQ
[/[Print]\]

MVSFORUMS.com -> Utilities

#1: SORT to remove specific duplicate record Author: Magesh_J PostPosted: Wed Jul 29, 2020 3:07 pm
    —
Is this possible in sort.

case 1
Code:

AAABBBCCCDDDEEEFF        TEEE
AAABBBCCCDDDEEEFF        TAAA
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC

Case 2

Code:

AAABBBCCCDDDEEEFF        TAAA
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC
AAABBBCCCDDDEEEFF        KAAA



If the duplicate key exists(1,17) and if one of the duplicates has TEEE in pos(26,4), then retain only the TEEE record.
if the duplicate key exists(1,17) and both the records don't have TEEE in pos(26,4), then the sort should return RC16.
if no duplicate key then just copy the same value.

Output for case 1
Code:

AAABBBCCCDDDEEEFF        TEEE
AAABBBCCCDDDEEE12        TEEE
AAABBBCCCDDDEEE13        BCCC

output for case 2 ==> should return 16, as the duplicate record dont have TEEE Record.

Thanks

Regards,
Magesh

#2:  Author: kolusuLocation: San Jose PostPosted: Wed Jul 29, 2020 4:50 pm
    —
Magesh_J,

Couple of clarifications.

1. If I understand your question correctly, you want an RC=16 when any record does NOT have a 'TEEE' record associated to that key is that it? if so then it is quite easy.


2. How do you plan to handle Unique records? What if you have a key with value other than 'TEEE'?

Code:

EEEEEEEEEEEEEEEEE        WXYZ


3. How do you plan to handle more than 2 duplicates? something like this

Code:

FFFFFFFFFFFFFFFFF        DUP1
FFFFFFFFFFFFFFFFF        DUP2
FFFFFFFFFFFFFFFFF        DUP3
FFFFFFFFFFFFFFFFF        TEEE


4. How do you plan to handle more than 1 duplicates for TEEE? something like this
Code:

GGGGGGGGGGGGGGGGG        DUP1
GGGGGGGGGGGGGGGGG        DUP2
GGGGGGGGGGGGGGGGG        TEEE
GGGGGGGGGGGGGGGGG        TEEE

#3:  Author: Magesh_J PostPosted: Wed Jul 29, 2020 5:08 pm
    —
Hi Kolusu,

Thanks for looking.

If we have a unique record alone, then it is valid, it should be copied and RC should be 0, even if it doesn't have TEEE.

if we have a duplicate record, it may be 1 or 2 or many, then only the record with TEEE should be retained, rest should be removed.

if we have duplicates 2 or more and TEEE is missing in all the duplicates, then it should fail.

We will not have two TEEE for the same account.

The files were getting from two sources and it gets merged, the problem is we don't have access to that process, we have access after it gets merged, so most likely we will not get more than 2 records as duplicates, but I don't know, it may happen.

Thanks
Magesh

#4:  Author: kolusuLocation: San Jose PostPosted: Wed Jul 29, 2020 6:01 pm
    —
Magesh,

In the sort step we will tag the 'TEEE' records with an indicator of 0 and other records with 1. Sorting on the KEY and the indicator would make TEEE record as the first record.

On OUTREC using when=group, you push the TEEE record and a sequence number onto the duplicate records.

Using OUTFIL we write 2 files.

KEYS will have unique records per key and the first record whatever it has

CHKRC will ONLY contain records that do NOT have a single TEEE but are duplicates on the key.

Since sort product cannot set an RC=16, we use IDCAMS to set the RC=16 if there is even 1 duplicate record that does not have a TEEE record.


Code:

//STEP0100 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD DISP=SHR,DSN=your input file
/*
//KEYS     DD DSN=Your output file,
//            DISP=(NEW,CATLG,DELETE)
//            SPACE=(CYL,(X,Y),RLSE) 
/*   
//CHKRC    DD DSN=Your CHKRC file,
//            DISP=(NEW,CATLG,DELETE)
//            SPACE=(CYL,(X,Y),RLSE) 
/*   
//SYSIN    DD *                                     
** TAG 'TEEE' RECORDS TO BE SORTED AS FIRST RECORD PER KEY
                                                   
  INREC OVERLAY=(81:26,4,CHANGE=(1,C'TEEE',C'0'),   
                        NOMATCH=(C'1'))             
                                                   
                                                   
  SORT FIELDS=(01,17,CH,A,         # KEY           
               81,01,CH,A)         # IND           
                                                   
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,17),       
                 PUSH=(81:26,4,85:SEQ=3))           
                                                   
  OUTFIL FNAMES=KEYS,                               
  INCLUDE=(85,3,ZD,EQ,1),                           
  BUILD=(01,80)                                     
                                                   
  OUTFIL FNAMES=CHKRC,                             
  INCLUDE=(85,3,ZD,GT,1,AND,81,4,CH,NE,C'TEEE'),   
  BUILD=(01,80)                                     
/*                                                 
//**********************************************************
//*  SET RC=16 IF CHKRC DATASET HAS ATLEAST 1 RECORD       *
//**********************************************************
//STEP0200 EXEC PGM=IDCAMS                                   
//SYSPRINT DD SYSOUT=*                                       
//INP      DD DISP=SHR,DSN=Your CHKRC File                   
//SYSIN    DD *                                             
  PRINT INFILE(INP) CHARACTER COUNT(1)                       
  IF LASTCC = 0 THEN DO                                     
     SET MAXCC=16                                           
     END                                                     
  ELSE IF LASTCC = 4 THEN DO                                 
       SET MAXCC=0                                           
  END                                                       
/*                                                           


You can also delete the KEYS file if CHKRC file has 1 record where we are setting the RC=16

#5:  Author: Magesh_J PostPosted: Wed Jul 29, 2020 6:37 pm
    —
I missed this trick, make 'TEEE' to zero, and sort the record to make the TEEE record first in order.

Kolusu wrote:

Code:
  INREC OVERLAY=(81:26,4,CHANGE=(1,C'TEEE',C'0'),   
                        NOMATCH=(C'1'))



Thank you very much, awesome idea !!!

Regards,
Magesh



MVSFORUMS.com -> Utilities


output generated using printer-friendly topic mod. All times are GMT - 5 Hours

Page 1 of 1

Powered by phpBB © 2001, 2005 phpBB Group