Posted: Mon Nov 17, 2014 3:08 pm Post subject: DFSORT to count multiple column and build a summary report
Hi,
I trying to find the count of multiple columns, Is it possible.
The challenge here is we have 241 millions records input and length 198 VB. So if we use sort filelds, that will affect performance.
Below key shown is packed decimal
Code:
VAR1 VAR2 VAR3
7/AN 9/PS 2/SNUM
(1-7) (8-16) (17-18)
2---------- 3------------------ 4-----------
********************************* TOP OF DAT
AAAAA 12345671010100 0
I need built a summary report based on VAR2 PS hence converted to Character
Code:
INREC BUILD=(1,4,12,9,PD,TO=ZD,LENGTH=17,5) ==> this is not working I am trying bulid RDW + Converted string + actual record.
OUTFIL BUILD FNAMES=OUTPUT1,BUILD=(1,4,22) RDW + Actual record
OUTFIL BUILD FNMAES=REPORT ==> here is where I am struck
DATE 11/15/14
TIME 050942
Report Sample
Total number of records read 14
Group Header Trailer
21 1 2
29 1 1
31 2
43 1
19 2 4
Group is in position 8th character of var2
I need to count the header and trailer record for each group.
Header is position 12th character of the Var2 if 12,1,Ch,EQ,C'01'
Trailer is position 12th character of the Var2 if 12,1,Ch,NE,C'01'
Note : The advantage is the group are specifics, So it can be hard coded in condition.
Joined: 26 Nov 2002 Posts: 12378 Topics: 75 Location: San Jose
Posted: Mon Nov 17, 2014 4:06 pm Post subject:
Magesh_J,
You just need a summary report so all you need is to pull of the group number, header and trailer from the file using IFTHEN statement and then SORT that on a much shorter length.
So now the file is reduced to mere LRECL of 8 (RDW + group number + header/trailer)
I used CHANGE parm to set the initial count for header and trailer. So when the header is 01 , the counts for header are set to '1' and trailer to '0'. So '10' for the change and the reverse of it is header is set to '0' and trailer to '1' and hence in the NOMATCH it is '01'
Now we sort on the GROUP number and use the reporting features to generate the counts by TOTaling the header and trailer counters we set it up using the CHANGE command.
Thanks, Its worked like charm, "Change" keyword is a great alternate thinking.
The other challenge we have is, We need to build two files. Sorry I should have explained this properly in my 1st quote.
Followed your advise and Built in two steps to achieve it.
1. VSAM File ==> Since it is a vsam file (SORT FIELDS(5,2) may result in key out of sequence, hence in first step no sort field
2. Report file. ==> Sort fields + Building report
That says, "from the current record make a new current record (you can say that every time you see BUILD (or FIELDS on INREC/OUTREC or OUTREC on OUTFIL if you see old code)".
It continues by saying "copy the RDW from the current record to the new current record" and then "copy all the data from position five to the end of the current record to the new current record".
In short, you've just made an identical copy of your current record. You don't want to do that, and you certainly don't want to do that 241m times. Remove the INREC.
Code:
OUTFIL FNAMES=SORTOUT,BUILD=(1,4,5)
The default for OUTFIL is SORTOUT, so you don't need FNAMES. For the BUILD, see above. That's a further 241m of those you don't want.
In short, you can remove that OUTFIL completely and the input file will be copied, unchanged, to SORTOUT.
It is not at all a good idea to call DDname SORTOUT1, so change that. If you start any DDname with SORT you will at best confuse readers into thinking it is something that is part of SORT itself, and at worst you will actually screw something up.
If you are now writing out the file for analysis in a second step, you don't need the RDW. The RDW is doubling the space used by the file. Kill the RDW and use VTOF. Adjust your second step accordingly.
I'd also consider a comparison of Kolusu's original + REPRO against this. Say 1m record representative sample, and compare IO and CPU.
Joined: 26 Nov 2002 Posts: 12378 Topics: 75 Location: San Jose
Posted: Tue Nov 18, 2014 12:30 pm Post subject:
Magesh_J,
William pretty much discussed all the parameters. By default the output goes to SORTOUT, you really don't have to specify FNAMES for sortout. We can use REPRO to copy the input file to a VSAM cluster. However if you want to use DFSORT then here are optimized control cards. I converted the VB file to FB in first step and then build only the fields I need in step 2 there by reducing the sorting file length to just 4 bytes.
ICE000I 1 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R10 - 09:56 ON TUE NO
OPTION COPY
OUTFIL FNAMES=OUT1,VTOF,
IFTHEN=(WHEN=INIT,BUILD=(12,9,PD,ZD,LENGTH=17)),
IFTHEN=(WHEN=INIT,BUILD=(8,2,
12,2,CHANGE=(2,C'01',C'10'),
NOMATCH=(C'01')))
ICE201I H RECORD TYPE IS V - DATA STARTS IN POSITION 5
ICE126A 9 INCONSISTENT REFORMATTING FOR OUT1: REASON CODE 05, IFTHEN 1
ICE751I 0 C5-K90025 C6-K90025 C7-K54603 C8-K62201 E9-K60823 E7-K62201
ICE052I 3 END OF DFSORT
Also have doubt with IFTHEN=(WHEN
Suppose input is FB 10
OUTFIL FNAMES=OUT1,INCLUDE(condition),BUILD=(1,5) Build exactly 5 length record in output
OUTFIL FNAMES=OUT1, IFTHEN=(WHEN=(condition),BUILD=(1,5)
Thought i give build above, it is getting only input record length 10 and not 5 Length character. i.e output file is with length 10 having records from (1,5)
Well if you looked at ICE126A(Suffix A stands for Abend and Suffix I stands for Informational) message in DFSORT Messages, Codes and Diagnosis Guide manual, it is quite self-explanatory. Here is the link to the message. See the explanation for reason code 5 and the NOTE in bold underneath it.
Suppose input is FB 10
OUTFIL FNAMES=OUT1,INCLUDE(condition),BUILD=(1,5) Build exactly 5 length record in output
OUTFIL FNAMES=OUT1, IFTHEN=(WHEN=(condition),BUILD=(1,5)
Thought i give build above, it is getting only input record length 10 and not 5 Length character. i.e output file is with length 10 having records from (1,5)
Please advice how this works.
When you code IFTHEN=(WHEN=(condition) , DFSORT does not know if there are other conditions that would need the full record. So it retains the original LRECL as is. You can counter that with IFOUTLEN=5 or using WHEN=NONE
Remember IFTHEN in DFSORT works similar to EVALUATE statement in COBOL. So once the condition is true it skips the other IFTHEN conditions unless you have a HIT=NEXT.
IFTHEN=(WHEN=INIT and WHEN=GROUP will be executed ALWAYS before the WHEN=CONDITION. So you can have multiple WHEN=INIT and WHEN=GROUP and ALL of them will be executed. _________________ Kolusu
www.linkedin.com/in/kolusu
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum