MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Break records after particular Count and column value change

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
Suchay
Beginner


Joined: 29 Jun 2017
Posts: 43
Topics: 9

PostPosted: Fri Jul 28, 2017 4:58 pm    Post subject: Break records after particular Count and column value change Reply with quote

We extract around 320000 records thru Db2 utility and split the records into 8 files when the count is greater than 40000 and the first column value change after reading 40000k records. We used seqnum currently to split on 40000, it works fine but we want to break on column value change and count should be greater than 40000 so it goes into 8 datasets

Please advise

Code:

col1 col2
001  10 
001  10 
001  10
001  10  Record 40000
....
001  10  records 41032 -- Write to file D0001A since after 40000 records the col2 changes in the next row
001  11  Record 41033 
001  11 
001  11
..
001  12 Record 56000
001  12 records 80000
..
001  12 records 82312 -   Write to file D0001B since after 40000 records the col2 changes in the next row
001  13   records 82313

Similarly, we have to break after 40000 records where the last record was written and when col2  changes

Currently, we have coded using SEQNUM, but the column changes are not handled

//SYSIN   DD *                                           
  OPTION COPY                                           
                                                         
  INREC  IFTHEN=(WHEN=(1,3,CH,EQ,C'001'),OVERLAY=(110:SEQNUM,8,ZD))     
           
 OUTFIL FNAMES=D0001A,                                               
        INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,LE,40000)             
 OUTFIL FNAMES=D0001B,                                               
        INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,40000,AND,         
                                     110,8,ZD,LE,80000)               
  OUTFIL FNAMES=D0001C,                                       
       INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,80000,AND, 
                                    110,8,ZD,LE,120000)     
OUTFIL FNAMES=D0001D,                                       
       INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,120000,AND,
                                    110,8,ZD,LE,160000)     
OUTFIL FNAMES=D0001E,                                       
       INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,160000,AND,
                                    110,8,ZD,LE,200000)     
OUTFIL FNAMES=D0001F,                                       
       INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,200000,AND,
                                    110,8,ZD,LE,240000)     
                                                             
OUTFIL FNAMES=D0001G,                                       
       INCLUDE=(1,3,CH,EQ,C'001',AND,110,8,ZD,GT,240000,AND,
                                    110,8,ZD,LE,280000)     
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12357
Topics: 75
Location: San Jose

PostPosted: Sat Jul 29, 2017 12:14 pm    Post subject: Reply with quote

Suchay,

If your intention is to split the file based on COL2 , why bother about about the limit of 40,000? Use WHEN=GROUP with KEYBEGIN on COL2 and add an id number at the end.

Try the following untested control cards ( I assumed that your input is already sorted on col2, if it is not the replace the COPY statement with SORT FIELDS=(6,2,CH,A)
Code:

//SYSIN    DD *                                             
  OPTION COPY   
  INCLUDE COND=(1,3,CH,EQ,C'001')                                               
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(6,2),PUSH=(110:ID=2)) 
                                                             
  OUTFIL FNAMES=D0001A,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,1)
  OUTFIL FNAMES=D0001B,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,2)
  OUTFIL FNAMES=D0001C,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,3)
  OUTFIL FNAMES=D0001D,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,4)
  OUTFIL FNAMES=D0001E,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,5)
  OUTFIL FNAMES=D0001F,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,6)
  OUTFIL FNAMES=D0001G,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,7)
  OUTFIL FNAMES=D0001H,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,8)
/*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Suchay
Beginner


Joined: 29 Jun 2017
Posts: 43
Topics: 9

PostPosted: Sat Jul 29, 2017 12:49 pm    Post subject: Reply with quote

Reason for spitting into 8 files: we provide these reports to users into spreadsheets, so we it would be cumbersome for them to look into the reports, so we break into 40,000 records and when the col2 changes after 40000 records we create separate files
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12357
Topics: 75
Location: San Jose

PostPosted: Sat Jul 29, 2017 3:53 pm    Post subject: Reply with quote

Suchay wrote:
Reason for spitting into 8 files: we provide these reports to users into spreadsheets, so we it would be cumbersome for them to look into the reports, so we break into 40,000 records and when the col2 changes after 40000 records we create separate files


Suchay,

The control cards i gave will split the file into 8 files. It is not lmited to 40000 records. Are you telling that you have more than 40000 records per col2 values? If that is the case just add SEQ=6 in the Oush statement after id and check that value on OUTFIL
Code:


  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(6,2),PUSH=(110:ID=,SEQ=6))


Code:


INCLUDE=(110,2,ZD,EQ,1,AND,112,6,ZD,LE,40000))

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Suchay
Beginner


Joined: 29 Jun 2017
Posts: 43
Topics: 9

PostPosted: Sat Jul 29, 2017 10:16 pm    Post subject: Reply with quote

Kolusu, Col2 is sorted and the catch is Col2 might have values between 40000. IF you check the first post after 40000 records we have 11 and 12 in the next 40000 and after 40000 record ,col2 value changes to 13, then we create a separate file.


col1 col2
001 10
001 10
001 10
001 10 Record 40000
....
001 10 records 41032 -- Write to file D0001A since after 40000 records the col2 changes in the next row
001 11 Record 41033
001 11
001 11
..
001 12 Record 56000
001 12 records 80000
..
001 12 records 82312 - Write to file D0001B since after 40000 records the col2 changes in the next row
001 13 records 82313


[\Quote]
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12357
Topics: 75
Location: San Jose

PostPosted: Mon Jul 31, 2017 11:18 am    Post subject: Reply with quote

Suchay wrote:

001 10 records 41032 -- Write to file D0001A since after 40000 records the col2 changes in the next row


Suchay,

Your requirements are conflicting. Shouldn't be written to D0001B since you are past 40000 record mark?

Lets clear the requirement.

Lets say your COL2 have the following number of records.

Code:

10 - 200,000 records
11- 8,000 records
12- 15,000 records
13- 97,000 records
14 - 5000 records
total number of records = 325,000


Now how do you plan to split these records?
Code:

D0001A -  00001   to  40,000 records of 10
D0001B - 40,001   to  80,000 records of 10
D0001C - 80,001   to 120,000 records of 10
D0001D - 120,001  to 160,000 records of 10
D0001E - 160,001  to 200,000 records of 10

D0001F -  8,000 of 11 + 15,000 of 12 + 0001  to 7,000 of 13
D0001G -  7,0001  to 47,000 records of 13
D0001H -  47,0001  to 87,000 records of 13


If you did split it like that then you have still 10,000 records of 13 and 5000 records of 14.
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Suchay
Beginner


Joined: 29 Jun 2017
Posts: 43
Topics: 9

PostPosted: Mon Jul 31, 2017 11:58 am    Post subject: Reply with quote

Kolusu,

Generally, we don't get more than 30000 records for a particular COl2 record.
We just try to keep the count close to 40000 in a file. Currently, we do this manually we check the col2 value
and break the files whenever the col2 value changes after each 40000 record.


Code:

10 - 20,000 records
11- 8,000 records
12- 15,000 records
13- 17,000 records
14 - 15,000 records
15 - 9500 records
16 - 23,000 records
17 -18,500 records.



Let us try to keep it simple with 3 files with above data and output should like below. is it possible



Code:

D0001A -  00001   to  43,000 records of 10  ,11, 12
D0001B - 43,001   to  84,500 records of 13, 14, 15
D0001C - 84,501   to 126,000 records of 16, 17
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12357
Topics: 75
Location: San Jose

PostPosted: Mon Jul 31, 2017 12:32 pm    Post subject: Reply with quote

Suchay wrote:
Let us try to keep it simple with 3 files with above data and output should like below. is it possible


Suchay,


You need to design solutions that would work for any kind of scenario. If you requirements are clear it is a simple task.

All you had to say is "I have to split an input file into chunks of 40000 records but if there are more records for the last key beyond 40000 records, I want them included in the same file even though the number of records included would be greater than 40000."

You spent 2 days to get this simple requirement right. You could have a written a COBOL program by now to get the desired results. We can ONLY read your text and interpret it and if your requirements aren't clear you would just be wasting my time as well as your time.

Any way use the following control cards which I think will give you the desired results
Code:

//SYSIN    DD *                                                     
  OPTION COPY                                                       
  INREC IFTHEN=(WHEN=GROUP,RECORDS=40000,PUSH=(110:ID=2,113:SEQ=6)),
        IFTHEN=(WHEN=GROUP,KEYBEGIN=(6,2),PUSH=(120:SEQ=6)),       
        IFTHEN=(WHEN=GROUP,                                         
               BEGIN=(113,6,ZD,EQ,40000,AND,120,6,ZD,GT,1),         
                 END=(120,6,ZD,EQ,1),                               
                PUSH=(128:110,2)),                                 
        IFTHEN=(WHEN=(120,6,ZD,NE,1,AND,128,2,CH,GT,C' '),         
       OVERLAY=(110:128,2))                                         
                                                                   
  OUTFIL FNAMES=D0001A,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,1)       
  OUTFIL FNAMES=D0001B,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,2)       
  OUTFIL FNAMES=D0001C,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,3)       
  OUTFIL FNAMES=D0001D,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,4)       
  OUTFIL FNAMES=D0001E,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,5)       
  OUTFIL FNAMES=D0001F,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,6)       
  OUTFIL FNAMES=D0001G,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,7)       
  OUTFIL FNAMES=D0001H,BUILD=(1,109),INCLUDE=(110,2,ZD,EQ,8)       
/*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Magesh_J
Intermediate


Joined: 21 Jun 2014
Posts: 259
Topics: 54

PostPosted: Mon Jul 31, 2017 5:42 pm    Post subject: Reply with quote

kolusu wrote:

Code:

        IFTHEN=(WHEN=GROUP,                                         
               BEGIN=(113,6,ZD,EQ,40000,AND,120,6,ZD,GT,1),         
                 END=(120,6,ZD,EQ,1),                               
                PUSH=(128:110,2)),                                 
        IFTHEN=(WHEN=(120,6,ZD,NE,1,AND,128,2,CH,GT,C' '),         
       OVERLAY=(110:128,2))     



Impressive thinking !!!...

Thanks
Magesh
Back to top
View user's profile Send private message
Magesh_J
Intermediate


Joined: 21 Jun 2014
Posts: 259
Topics: 54

PostPosted: Mon Jul 31, 2017 5:47 pm    Post subject: Reply with quote

Also we may have as follows, so we may not miss any records.
Code:

OUTFIL FNAMES=D0001H,BUILD=(1,109),INCLUDE=(110,2,ZD,GE,8)


Regards,
Magesh
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12357
Topics: 75
Location: San Jose

PostPosted: Tue Aug 01, 2017 10:27 am    Post subject: Reply with quote

Magesh_J wrote:
Also we may have as follows, so we may not miss any records.
Code:

OUTFIL FNAMES=D0001H,BUILD=(1,109),INCLUDE=(110,2,ZD,GE,8)


Regards,
Magesh


Magesh,

You can use it, however I prefer SAVE instead of INCLUDE.
Code:

OUTFIL FNAMES=REMAIN,BUILD=(1,109),SAVE


I am creating a 2 Byte ID for every 40000 records. So the maximum is 99 values for 40000*99 = 3,960,000 records and it restarts from 00. With a greater than condition you will skip the 00. But with SAVE you will not.

OP needs to change the ID to 8 byte value to make sure that he doesn't miss any records. I will leave the exercise to him given that he already has a template solution.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Suchay
Beginner


Joined: 29 Jun 2017
Posts: 43
Topics: 9

PostPosted: Tue Aug 01, 2017 4:23 pm    Post subject: Reply with quote

Thanks Kolusu sort card worked fine
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group