MVSFORUMS.com

Santlou

Hi All,

I am looking for an efficient way of using DFSORT to Match records from 2 files (Variable LRECL) and generate a result of Matching Records.

FileA:

Nic Clouston · Posted: Thu May 03, 2012 9:49 am Post subject:

JOINKEYS? And you might get a better response if you had posted in the Utilities part of the forum. It has a description that starts 'DFsort' so it should have been hard to miss.
_________________
Utility and Program control cards are NOT, repeat NOT, JCL.

delta403 · Posted: Thu May 03, 2012 9:55 am Post subject:

Please use the ICETOOL utility for this purpose.

COPY FROM(IN1) TO(TEMP)
COPY FROM(IN2) TO(TEMP)
SPLICE FROM(TEMP) TO(REPORT) ON(1,6,CH) WITH(23,24)

Please Do Not advertise here

dbzTHEdinosauer · Posted: Thu May 03, 2012 10:04 am Post subject:

Santlou,

why do you have 445566B... twice in the output?
and why are the others there at all?

if the files are presorted (as you probably know)
the process will be quicker.

do you actually want a match? or a join of the records?

a match would mean a minimum of 2 records in output.

you could use JOINKEYS in one process to create 1 fixed length record for each 'match'.
since file 2 is a subset of file 1,
the number of file 1 duplicates matching each file 2 record
dictates how many output records there will be,
and each output record would be as long as the file 1 max record size PLUS the file 2 max record size.

on the output side of the JOINKEYS,
you could 'split' the records, though i do not see why you would want to.

Frank/Kolusu will be along shortly (or even now)
and provide you with a more relevant answer.
_________________
Dick Brenholtz
American living in Varel, Germany

dbzTHEdinosauer · Posted: Thu May 03, 2012 10:06 am Post subject:

delta403,

the splice would require a sort,
whereas JOINKEYS with OPTION COPY on presorted files would not.

delta403 · Posted: Thu May 03, 2012 10:18 am Post subject:

@dbz: SPLICE will sort the records automatically; we don't need to specify a separate sort step for this.

dbzTHEdinosauer · Posted: Thu May 03, 2012 10:23 am Post subject:

exactly.
but we can avoid a sort with JOINKEYS, ...,SORTED.

kolusu · Posted: Thu May 03, 2012 10:34 am Post subject:

kolusu · Posted: Thu May 03, 2012 10:41 am Post subject:

Santlou,

What are the LRECL and RECFM of both files? You said your files are variable blocked, so the key actually starts at position 5, because the first 4 bytes hold the RDW. Is the data already sorted on the key?
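
As a rough illustration of the RDW offset (not taken from any job in this thread): assuming a RECFM=VB input with a 6-byte character key in the first six data bytes, a selection on one key value would be coded against positions 5-10 rather than 1-6. The step name, data set names, space values, and the sample key value 334455 are all placeholders.

```jcl
//VBSEL    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Hypothetical data set names; input is assumed to be RECFM=VB.
//SORTIN   DD DISP=SHR,DSN=HYPOTH.FILEA
//SORTOUT  DD DSN=HYPOTH.SUBSET,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
* Bytes 1-4 of every VB record are the RDW, so a key in data
* bytes 1-6 is addressed as position 5, length 6.
  OPTION COPY
  INCLUDE COND=(5,6,CH,EQ,C'334455')
/*
```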
_________________
Kolusu
www.linkedin.com/in/kolusu

Santlou · Posted: Thu May 03, 2012 12:28 pm Post subject:

Thanks for all your responses. I appreciate all the help.

Delta, yes, I would have no problem using ICETOOL if it got me the results I am looking for. However, your solution combines both files first, which is inefficient for my files. If FileA has 40 million records and FileB has 1000 records, copying both files to TEMP means copying many millions of records that we don't want. I'm looking for a solution that extracts ONLY the records from FileA that match FileB, without building another file that includes both FileA and FileB. With 40 million records in FileA, I would not have enough DASD to build TEMP. Also, with the SPLICE that you suggested, wouldn't I still get the records that are duplicated on FileA (e.g. Key=334455) but are not on FileB?

DBZ... I have 445566B twice in the output because I originally wanted ALL records from FileB and ONLY those records from FileA that match FileB. But I can live with just a match, resulting in an output file that includes only the records from FileA that match records in FileB. Also, yes, the files can be pre-sorted, and creating a Result file with only the records from FileA that match records from FileB would suffice. However, the Result file MUST be in the SAME FORMAT as FileA. Both FileA and FileB are LRECL=10004, RECFM=VB, DSORG=PS. The Result has to be in the same format, with the same record lengths as the original records from FileA. How can using JOINKEYS to create 1 fixed length record help me obtain a Result file in VB format? I appreciate your input, and I admit that I have never used JOINKEYS, so pardon my ignorance here. Would your solution, given pre-sorted files, provide a result file that includes all the records from FileA that match positions 1-6 of FileB (yes, for variable records I would specify positions 5-10 in the sort cards)?

Thanks for your assistance and your expertise.

I know that I can achieve my results by using ICETOOL to combine both files, then use a SPLICE to identify only the keys from FileB, then do an ALLDUPS, then remove unwanted records by selecting only those records flagged by the SPLICE. But as I stated, this means combining both files into one big file and then eliminating unwanted records, which is simply too inefficient. This works fine for smaller files. However, with a FileA of 40 million+ records against a FileB of about 1000 records, the result file should be only about 800,000 records, far fewer than the 40 million+ records in FileA.

Also, the requirement that the Result file keep the same format as FileA, including the lengths of all records (LRECL=10004, RECFM=VB, DSORG=PS), is an issue for me. If the files were FB, I would normally add an extra byte to the end of each record to indicate a FileB record and use SPLICE to carry that byte into all the matching records. That lets me remove all duplicates on FileA that are not on FileB. But it also requires me to combine both files into one with an ALLDUPS, creating a file that contains about 39 million records that I do not need.

This is what I've done in the past, but this won't work because of the file size:

Step 1: Copy FileB to TMP. The TMP LRECL is 1 byte bigger than FileB's to accommodate a "B" flag that I insert for FileB records. Also copy FileA to TMP1 to add this extra byte.

Step 2: Concatenate TMP and TMP1 and SELECT ALLDUPS.
This results in one file that has all duplicates from the combined files, making sure that the records from FileB come out at the top of each group in sorted sequence.

Step 3: SPLICE the records from Step 2 so that the "B" flag is put on all records that match the keys from FileB.

Step 4: Remove all unwanted records by selecting only those records flagged with a "B" (i.e. INCLUDE COND=...).
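
For readers unfamiliar with this flag-and-SPLICE approach, the four steps above might look roughly like the ICETOOL job below for fixed-length data. This is a sketch only, under several assumptions: RECFM=FB, LRECL=80 inputs, the key in positions 1-6, and no duplicate keys within FileB. All DD names, data set names, and space values are hypothetical, and a single MOD data set (TMP) stands in for the separate TMP and TMP1 files. It still writes and sorts the combined file, which is exactly the cost being described.

```jcl
//FLAGJOIN EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* All names are hypothetical; inputs assumed RECFM=FB, LRECL=80.
//FILEB    DD DISP=SHR,DSN=HYPOTH.FILEB
//FILEA    DD DISP=SHR,DSN=HYPOTH.FILEA
//TMP      DD DSN=&&TMP,DISP=(MOD,PASS),UNIT=SYSDA,
//            SPACE=(CYL,(500,500))
//DUPS     DD DSN=&&DUPS,DISP=(,PASS),UNIT=SYSDA,
//            SPACE=(CYL,(500,500))
//OUT      DD DSN=HYPOTH.RESULT,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(100,50),RLSE)
//TOOLIN   DD *
* Step 1: widen records to 81 bytes; FileB goes first, flagged 'B'
COPY FROM(FILEB) TO(TMP) USING(FLGB)
COPY FROM(FILEA) TO(TMP) USING(FLGA)
* Step 2: keep every record whose key occurs more than once
SELECT FROM(TMP) TO(DUPS) ON(1,6,CH) ALLDUPS
* Steps 3 and 4: the first record of each key group supplies byte 81,
* each later record supplies bytes 7-80; keep only 'B'-flagged output
SPLICE FROM(DUPS) TO(OUT) ON(1,6,CH) WITHALL WITH(7,74) USING(KEEP)
/*
//FLGBCNTL DD *
  OUTREC OVERLAY=(81:C'B')
/*
//FLGACNTL DD *
  OUTREC OVERLAY=(81:C' ')
/*
//KEEPCNTL DD *
  OUTFIL FNAMES=OUT,INCLUDE=(81,1,CH,EQ,C'B'),BUILD=(1,80)
/*
```

Key groups made up only of FileA duplicates start with a blank-flagged record, so the OUTFIL INCLUDE drops them; BUILD=(1,80) strips the flag byte again.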

However, doing this for 40 million records is not very efficient. What I'm looking for is a way to achieve this without having to create a "Combined" file. The variable LRECL is also an issue, since I need to preserve the length of each original record from FileA: I have no place to put the "B" without changing the record length on the VB file.

I appreciate any assistance and I apologize if my description is not detailed enough.

Thanks,

kolusu · Posted: Thu May 03, 2012 12:41 pm Post subject:

santlou,

If your intention is just to get the matched records from FileA, then it is very easy with JOINKEYS. The following DFSORT JCL will give you all the records from FileA that have a matching record in FileB.
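
The JCL from this post is not reproduced here. As a stand-in, here is a minimal sketch assembled from the control statements echoed in the ICE messages further down the thread (OPTION COPY, two JOINKEYS statements with SORTED,NOSEQCK, and REFORMAT FIELDS=(F1:1,4,5)). The step name, data set names, and space values are assumptions, and the key is shown as 22 bytes starting at position 5 only because that is what the echoed statements use.

```jcl
//MATCHA   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Hypothetical data set names; both inputs assumed RECFM=VB and
//* already in ascending key order.
//INA      DD DISP=SHR,DSN=HYPOTH.FILEA
//INB      DD DISP=SHR,DSN=HYPOTH.FILEB
//SORTOUT  DD DSN=HYPOTH.MATCHED,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(100,50),RLSE)
//SYSIN    DD *
* Inner join on the key; output keeps only the FileA (F1) record:
* its RDW (1,4) plus everything from position 5 to the record end.
  OPTION COPY
  JOINKEYS F1=INA,FIELDS=(5,22,A),SORTED,NOSEQCK
  JOINKEYS F2=INB,FIELDS=(5,22,A),SORTED,NOSEQCK
  REFORMAT FIELDS=(F1:1,4,5)
/*
```

SORTED tells DFSORT the input is already in key order, so it copies rather than sorts, and NOSEQCK skips the sequence check, so both files really do have to be in order. Because REFORMAT ends with a position and no length, the result stays in VB format with the original FileA record lengths. If FileB ever held duplicate keys, each matching FileA record would be written once per FileB duplicate.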

Santlou · Posted: Mon May 07, 2012 12:33 am Post subject:

Thanks All...

However, my client only has DFSORT V1R5.

ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXAMPLES A
ICE000I 1 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 00:22 ON MON MA
OPTION COPY
JOINKEYS F1=INA,FIELDS=(5,22,A),SORTED,NOSEQCK
$
ICE005A 0 STATEMENT DEFINER ERROR
JOINKEYS F2=INB,FIELDS=(5,22,A),SORTED,NOSEQCK
$
ICE005A 0 STATEMENT DEFINER ERROR
REFORMAT FIELDS=(F1:1,4,5)
$
ICE005A 0 STATEMENT DEFINER ERROR
ICE056A 0 SORTIN NOT DEFINED
ICE751I 0 C5-K90013 C6-K90013 C7-K90000 C8-K90013 E7-K24705

Is there a solution compatible with this release of DFSORT?

Thanks

Sqlcode

Santlou,

Nic Clouston · Posted: Mon May 07, 2012 10:56 am Post subject:

I don't think JOINKEYS is available in 1.5 except, perhaps, with a PTF. I believe that release is outdated and unsupported now. I may be misremembering posts on another forum, but Kolusu will set us all straight.

Santlou · Posted: Mon May 07, 2012 10:59 am Post subject:

Sqlcode,

My guess is that this message is misleading.

Since the JOINKEYS statements are not recognized, DFSORT throws them out. The INA and INB DD names that I reference in the F1 and F2 parameters are ignored along with them, because DFSORT does not know what JOINKEYS is, so it falls back to expecting a SORTIN DD statement.

According to what I'm seeing in the DFSORT docs, I should not need a SORTIN statement.

Does that make sense?