MVSFORUMS.com
A Community of and for MVS Professionals
 

Split file by a specific count across groups

 
kkittinger
Beginner


Joined: 17 Oct 2006
Posts: 3
Topics: 1

Posted: Tue May 25, 2010 10:15 am    Post subject: Split file by a specific count across groups

Hello,

I am looking for a way to split a large file (4.5 million records) into smaller chunks of about 100,000 records each, while keeping specific groups together.

The file contains student information, and I need to keep all the students that belong to a campus in the same file, even if the split point falls in the middle of that campus.

Is there a way to do this with sort control statements (DFSORT or SyncSort), or is it easier to just write a separate routine?

Thanks,
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12378
Topics: 75
Location: San Jose

Posted: Tue May 25, 2010 10:23 am

kkittinger,

Show us a sample of input and desired output with split taken into consideration for a group. Also what is the LRECL and RECFM of the input and output files?

How many output files do you plan to create?
_________________
Kolusu
www.linkedin.com/in/kolusu
kkittinger
Beginner


Joined: 17 Oct 2006
Posts: 3
Topics: 1

Posted: Tue May 25, 2010 10:43 am

LRECL = 100
RECFM = FB

Neither the number of records in each file nor the number of files has been determined yet. I just know that I will need to split the data into manageable chunks and keep all records with the same first 9 digits (the key) in the same file.

[ Key Info ]
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
257999001 student information ...................................
257999001 student information ...................................
257999002 student information ...................................
257999003 student information ...................................
257999004 student information ...................................


So if splitting on a record count of 10, the output would be:

File 1: (13 records since the group/key needs to be kept together)
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999001 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................
256999002 student information ...................................

File 2:
257999001 student information ...................................
257999001 student information ...................................
257999002 student information ...................................
257999003 student information ...................................
257999004 student information ...................................


Thanks for the quick response.
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12378
Topics: 75
Location: San Jose

Posted: Tue May 25, 2010 10:55 am

kkittinger,

How about splitting based on the groups instead of going by record count? Each output file could hold a maximum of 10 groups or so.
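
For illustration only, here is an untested sketch of what a group-based split like the one suggested above might look like. It is not something posted in this thread. It assumes the input is already in key order, the key is in positions 1-9, the records are FB with LRECL=100, and the data set name HLQ.STUDENT.INPUT and the DD name REST are hypothetical placeholders. Each key gets a group number, and the group number decides the output file; REST uses SAVE to catch any record not selected by the other OUTFIL statements.

```jcl
//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* HLQ.STUDENT.INPUT IS A HYPOTHETICAL DATA SET NAME (FB, LRECL=100)
//SORTIN   DD DISP=SHR,DSN=HLQ.STUDENT.INPUT
//OUT1     DD SYSOUT=*
//OUT2     DD SYSOUT=*
//REST     DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=COPY
* 101-108: sequence number, restarts at 1 when the key (1,9) changes
* 109-116: group number, incremented at the first record of each key
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(101:SEQNUM,8,ZD,RESTART=(1,9))),
  IFTHEN=(WHEN=GROUP,BEGIN=(101,8,ZD,EQ,1),PUSH=(109:ID=8))
* Groups 1-10 go to OUT1, groups 11-20 to OUT2, the rest to REST
  OUTFIL FNAMES=OUT1,INCLUDE=(109,8,ZD,LE,10),BUILD=(1,100)
  OUTFIL FNAMES=OUT2,
    INCLUDE=(109,8,ZD,GT,10,AND,109,8,ZD,LE,20),BUILD=(1,100)
  OUTFIL FNAMES=REST,SAVE,BUILD=(1,100)
/*
```

The trade-off is that file sizes then depend entirely on how large each group is, so ten large campuses and ten small ones produce very different files.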
kkittinger
Beginner


Joined: 17 Oct 2006
Posts: 3
Topics: 1

Posted: Tue May 25, 2010 11:07 am

There will be approximately 11,000 groups, since this covers all the schools in the state of Texas. True, some of the groups will be small, but in the Houston and Dallas areas they get kind of big.

This data is being used to populate our server-side databases, and I am told they want it in manageable chunks so that, in case of errors, they only need to reload the affected piece.

Record counts are just what I am used to doing. :)
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12378
Topics: 75
Location: San Jose

Posted: Tue May 25, 2010 11:45 am

kkittinger,

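Try this DFSORT JCL. Here I am splitting the records into chunks of 10; you can change that to any number you want. The parameter RECORDS=n specifies the maximum number of records in a group, and n can be 1 to 2000000000. A key that starts in one chunk stays entirely in that chunk's output file, so a file can hold more than n records.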

Code:

//STEP0100 EXEC PGM=SORT                                               
//SYSOUT   DD SYSOUT=*                                                 
//SORTIN   DD *                                                       
256999001 STUDENT INFORMATION ...................................     
256999001 STUDENT INFORMATION ...................................     
256999001 STUDENT INFORMATION ...................................     
256999001 STUDENT INFORMATION ...................................     
256999001 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
256999002 STUDENT INFORMATION ...................................     
257999001 STUDENT INFORMATION ...................................     
257999001 STUDENT INFORMATION ...................................     
257999002 STUDENT INFORMATION ...................................     
257999003 STUDENT INFORMATION ...................................     
257999004 STUDENT INFORMATION ...................................     
//OUT1     DD SYSOUT=*                                                 
//OUT2     DD SYSOUT=*                                                 
//SYSIN    DD *                                                       
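* 101-108: sequence number, restarts at 1 for each new key (1,9)
* 109-116: chunk ID, incremented every 10 input records
* 117-124: chunk ID of the first record of each key, pushed to every
*          record of that key so a key is never split across files
  SORT FIELDS=COPY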
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(101:SEQNUM,8,ZD,RESTART=(1,9))),   
  IFTHEN=(WHEN=GROUP,RECORDS=10,PUSH=(109:ID=8)),                     
  IFTHEN=(WHEN=GROUP,BEGIN=(101,8,ZD,EQ,1),PUSH=(117:109,8))           
  OUTFIL FNAMES=OUT1,INCLUDE=(117,8,ZD,EQ,1),BUILD=(1,100)             
  OUTFIL FNAMES=OUT2,INCLUDE=(117,8,ZD,EQ,2),BUILD=(1,100)             
//*
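
The job above needs one OUTFIL statement per chunk ID, and the thread does not say how many chunk IDs the real 4.5 million record file would produce at RECORDS=100000 (roughly 45, but an ID can be missing if a single key spans a whole chunk). One possible way to find out before coding the OUTFIL statements is a sizing step that only reports the record count for each chunk ID. This is an untested sketch: the data set name HLQ.STUDENT.INPUT is a hypothetical placeholder, and the report layout (an 80-byte line with the chunk ID and a count) is my own choice, not part of the original solution.

```jcl
//STEP0200 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* HLQ.STUDENT.INPUT IS A HYPOTHETICAL DATA SET NAME (FB, LRECL=100)
//SORTIN   DD DISP=SHR,DSN=HLQ.STUDENT.INPUT
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=COPY
* Same three IFTHEN clauses as the split job, with chunks of 100000
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(101:SEQNUM,8,ZD,RESTART=(1,9))),
  IFTHEN=(WHEN=GROUP,RECORDS=100000,PUSH=(109:ID=8)),
  IFTHEN=(WHEN=GROUP,BEGIN=(101,8,ZD,EQ,1),PUSH=(117:109,8))
* No detail records: one report line per chunk ID (117,8) with the
* number of records that would go to that chunk's output file
  OUTFIL REMOVECC,NODETAIL,BUILD=(80X),
    SECTIONS=(117,8,
      TRAILER3=(117,8,X,COUNT=(M10,LENGTH=8)))
/*
```

Run against the 18 sample records with RECORDS=10 instead of 100000, the report should show 13 records for chunk ID 1 and 5 for chunk ID 2, matching File 1 and File 2 earlier in the thread.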

All times are GMT - 5 Hours