vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Mon Sep 27, 2004 2:32 pm Post subject: syncsort - sorting huge file with multiple record types with
I have a huge file of 28 million records. My requirements are below.
I have the following files:
1. status file
a. A customer may have more than one row; the latest status is numbered 1, the prior one 2, and so on.
e.g.
In the following, customer A should be chosen:
CustomerA
_________________
Vivek,NJ
Db2,IDMS
kolusu Site Admin

Joined: 26 Nov 2002 Posts: 12388 Topics: 75 Location: San Jose
Posted: Mon Sep 27, 2004 2:53 pm
Vivek,
Hmm, this seems like too much work, but let me see if I understand your requirement clearly.
Files:
Code:
1. Status file
2. charges file
3. charge detail file
4. customer addr file
Requirement:
Read the STATUS file (file 1), select all the customers who have a status value of '05', and write them to a file T1.
Now, for all the records of T1, you need to get the matching records from files 2, 3, and 4. Based on the codes in files 2, 3, and 4, you need to write 2 output files: one is a special customer file and the other is an error customer file.
Is that right?
Since your files 2, 3, and 4 contain dupes, I would say this is an impossible task with your version of SyncSort. Also, the volume of records on the status file is not small, which rules out generating the INCLUDE/OMIT conditions dynamically.
If Easytrieve is an option, then it can be coded very easily. If you don't have it, then you are left with the option of writing a COBOL program.
Hope this helps...
Cheers
kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Mon Sep 27, 2004 3:30 pm
kolusu,
The charges file is straightforward. Just get all (unique) customers with charge id = 1, 2, 3, or 4.
My problem lies with the charge details.
Let's take one at a time and try to do as much as possible with SyncSort.
First, I need to find all (unique) customers with a 77 charge detail.
I would write a SyncSort step to output a file-77 that has the customers with at least one 77.
This is straightforward.
Next, I need to find customers with both c4 and c5 charge details,
so I create a file which contains c4,
another file with c5,
and do a filtering process like c4+c5 - 77,
which means that customers who are in both the c4 file and the c5 file and not in the 77 file are regarded as special customers.
If they exist in the 77 file, then they are error customers.
How do I do this operation?
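The c4+c5 - 77 filtering described in this post can be sketched as set operations. This is only an illustration in Python, not a SyncSort solution: the record shape (a customer id paired with a detail code), the function name `classify_customers`, and the sample ids are all hypothetical. It also assumes that "error customer" means a customer who has c4 and c5 and additionally a 77; the post could be read more broadly.

```python
def classify_customers(detail_records):
    """Split customers into special and error sets.

    detail_records: iterable of (customer_id, detail_code) pairs
    (a hypothetical layout; the real field positions are not given).
    """
    by_code = {"77": set(), "c4": set(), "c5": set()}
    for customer_id, code in detail_records:
        if code in by_code:
            by_code[code].add(customer_id)  # sets keep customers unique

    both = by_code["c4"] & by_code["c5"]    # customers having c4 AND c5
    special = both - by_code["77"]          # ...and no 77
    error = both & by_code["77"]            # ...but also a 77 (assumption)
    return special, error


# Made-up sample data for illustration only.
records = [
    ("CUST01", "c4"), ("CUST01", "c5"),
    ("CUST02", "c4"), ("CUST02", "c5"), ("CUST02", "77"),
    ("CUST03", "c4"),
    ("CUST04", "77"),
]
special, error = classify_customers(records)
print(sorted(special))
print(sorted(error))
```

In sort-utility terms, each set corresponds to one of the intermediate files (file-77, the c4 file, the c5 file) reduced to unique customer ids.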
kolusu Site Admin

Joined: 26 Nov 2002 Posts: 12388 Topics: 75 Location: San Jose
Posted: Mon Sep 27, 2004 4:30 pm
vivek,
Please post a sample input and the desired output for each of the files, along with the DCB parameters. Also give the positions of the fields to be validated.
Kolusu
vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Tue Sep 28, 2004 5:41 am
OK Kolusu, all the files are related by customer id.
Assume the records are fixed length, 80 bytes.
The status input looks like this:
CustomerA
vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Tue Sep 28, 2004 9:06 am
Let me put this more simply.
I have a file with three records:
abcdef12345
bcdefg99999
abcdef67890
I am trying to sort on the first 6 characters. I want the duplicate abcdef rows to go into one file and the non-duplicate bcdefg row into another file.
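To pin down the result being asked for, here is a minimal Python sketch of the split, using the three sample records from the post. It is not a sort-utility solution; the function name `split_by_duplicate_key` and the `key_len` parameter are hypothetical, and it assumes the key is simply the first 6 characters of each record.

```python
from collections import Counter


def split_by_duplicate_key(records, key_len=6):
    """Return (dups, uniques): every record whose key occurs more than
    once goes to dups, including the first occurrence; records whose
    key occurs exactly once go to uniques."""
    counts = Counter(rec[:key_len] for rec in records)  # pass 1: count keys
    dups = [r for r in records if counts[r[:key_len]] > 1]
    uniques = [r for r in records if counts[r[:key_len]] == 1]
    return dups, uniques


records = ["abcdef12345", "bcdefg99999", "abcdef67890"]
dups, uniques = split_by_duplicate_key(records)
print(dups)     # both abcdef rows
print(uniques)  # the single bcdefg row
```

The point to notice is that the first abcdef row belongs with the duplicates, which is exactly what a plain "keep the first, drop the rest" pass does not give.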
vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Tue Sep 28, 2004 10:23 am
Quote:
I want duplicates abcdef duplicate rows to go into one file and , bcdefg non duplicate into another file ?
Never mind, I used the XSUM option on SUM FIELDS to write the duplicates to a separate dataset, and the non-duplicates go to SORTOUT.
vivek Beginner
Joined: 15 Jul 2004 Posts: 95 Topics: 11 Location: Edison,NJ
Posted: Thu Sep 30, 2004 11:18 am
This is still tricky. The XSUM dataset has the dups from the second occurrence onward; however, SORTOUT has the FIRST entry of each key that has multiple rows. I did not want even the FIRST entry to be in there.
I solved it by using a file identifier to determine which file each record came from. After sorting, I removed the first duplicate entry from SORTOUT by using the file id.
It's kind of complicated to explain, but I can explain if anyone really wants.
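Since the post does not spell the fix out, the following Python sketch is only one possible reading of it. The first function mimics the behaviour described above (first record of each key goes to SORTOUT, later ones to the XSUM dataset); the second tags each record with a file id ('S' for SORTOUT, 'X' for XSUM, both hypothetical values) and drops any SORTOUT record whose key also appears under 'X'. All function names are made up for illustration.

```python
def sum_none_xsum(records, key_len=6):
    """Mimic the behaviour described in the post: after sorting on the
    key, the first record of each key goes to sortout and every later
    record with the same key goes to xsum."""
    sortout, xsum, seen = [], [], set()
    # sorted() is stable, so input order is kept within equal keys.
    for rec in sorted(records, key=lambda r: r[:key_len]):
        key = rec[:key_len]
        if key in seen:
            xsum.append(rec)
        else:
            seen.add(key)
            sortout.append(rec)
    return sortout, xsum


def drop_first_of_duplicates(sortout, xsum, key_len=6):
    """Tag records with a file id, then keep only the SORTOUT records
    whose key never shows up with the XSUM file id."""
    tagged = [("S", r) for r in sortout] + [("X", r) for r in xsum]
    dup_keys = {r[:key_len] for file_id, r in tagged if file_id == "X"}
    return [r for file_id, r in tagged
            if file_id == "S" and r[:key_len] not in dup_keys]


records = ["abcdef12345", "bcdefg99999", "abcdef67890"]
sortout, xsum = sum_none_xsum(records)
cleaned = drop_first_of_duplicates(sortout, xsum)
print(sortout)  # still contains the first abcdef row
print(xsum)     # the second abcdef row
print(cleaned)  # only the truly non-duplicate row
```

Whether the retained record is really the first one in input order depends on how the sort treats equal keys, so treat the ordering here as an assumption taken from the post.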