MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Extract a variable group of records from a VB

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
Fab
Beginner


Joined: 15 Nov 2016
Posts: 46
Topics: 5

PostPosted: Thu Apr 06, 2017 2:31 am    Post subject: Extract a variable group of records from a VB Reply with quote

Hello everybody,
I have a file with variable lenght which contain logical groups of records, each group pertain a specific client, groups are composed by different records type, records type follows a specific sequence which is always the same, number of records in a group can't be previously determined. File lenght is 504 (record is 500, plus 4 bytes for variable lenght I suppose), record format is VB.
Following an example of 2 groups:
Code:

BEGINNING RECORD OF THE FILE
STDANN xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ST100 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
INO xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
... rec1
... rec2
... recn
D11
STDANN xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ST100 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
INO xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
... rec1
... rec2
... rec3
... recn
D11
END00000700000003882001241167297

So these are two distinct groups, I used "..." to specify a variable number of records, I don't know previously how many records they will be (it depends on how many operations that client did during the past year, and this is a variable). BEGINNING and END records are NOT part of any groups, starting record is just a placeholder, while ending record is NOT, it MUST report 'END' followed by the number of groups I have extracted (in my previous example 70), by the number of records extracted (3882 in my example), and finally by the number of bytes extracted, so ending record is composed by 3 counters like this:
Code:

END
0000070 number of groups extracted (I will have to csalculate it)
0000003882 number of records extracted except ending one (needs to be calculated)
001241167297 number of bytes extracted (I guess I can just avoid it because probably it will be calculated again later)

STDANN, ST100, INO are records types, they are always present and always in that sequence, and each group begins with STDANN record type, so a group is the variable amount of records between a STDANN and the next one. Also I have noted that every group always ends with a D11 record type, this D11 is a 93 bytes long record and it always begins with 'D11' followed by all blanks. I dunno if this is important but obviously every record type might have a different lenght (we are facing a VB file).
I have to:
1) extract in output the first and the last records (which identify the beginning and the ending of the file). The first and last records in input must be the first and last records in output also. in the last record I will have to specify how many groups I have extracted, how many records, and probably the number of bytes (not sure about that, in first instance I can just put it to zeroes)
2) extract in output all the variable groups of records that have the record type INO which will be like this:
Code:

----+----10---+----2----+----3----+----4
INO                                00000

so probably begins with INO and have '00000' at position 36 (I guess it will be at position 40 because of the variables bytes, so 36+4=40). these group of records must be complete and the same as input, obviously.
3) input is already sorted, I don't need to sort that again.

so in input I will have
Code:

START RECORD
STDANN   2017040800001
ST100 000000
INO                                00000
...
D11
STDANN   2017040800002
ST100 000000
INO                                12345
...
D11
STDANN   2017040800003
ST100 000000
INO                                67891
...
D11
STDANN   2017040800004
ST100 000000
INO                                00000
...
D11
END00000040000003882001241167297


and in output I would expect
Code:

START RECORD
STDANN   2017040800001
ST100 000000
INO                                00000
...
D11
STDANN   2017040800004
ST100 000000
INO                                00000
...
D11
END00000020000000500001241167297


Is this possible to do with a sort?
Thanks in advance for any kind of help you can provide.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 11745
Topics: 74
Location: San Jose

PostPosted: Thu Apr 06, 2017 7:32 am    Post subject: Reply with quote

fab wrote:
STDANN, ST100, INO are records types, they are always present and always in that sequence


Fab,

I thought by now you should be well versed to use the trick of Joinkeys to extract the groups of records. Smile

If the INO record is always present in the group, then it is quite easy to extract the group of records that has INO 0000

Here is an untested job which I think will give you the desired results. Please excuse me if I missed something as it is too early and just had my first coffee.

Brief explanation of the job.

Since your input is VB we need to preserve the length as is, so any temporary fields that we add, we have to put them right after RDW. For fixed length records, we usually add temp fields at the end of the record and then chop it off later. But for variable length files you cannot add temporary fields at the end as it will make all variable length records as full length. So to preserve the variable lengths we add the temp fields right after RDW. Later once we remove the temp fields we are still left with the original variable length records.

JNF1CNTL - will use WHEN=INIT to pad 9 spaces(8 byte group key + 1 byte pick byte for ino rec)

Using WHEN=GROUP, we will tag all the groups that begin with STDANN with a seqnum

Note that I checked for position 14 instead of 5 as the INIT statement padded 9 spaces before the actual data. So now your actual data starts at 14.

JNF2CNTL - Will ONLY pick the INO records with INCLUDE COND (this acts before INREC statement so we use the original record position)

Since we just need the INO sequence record number for matching , I chopped of the length to just 13 bytes (4 byte rdw+ 8 byte seqnum+ 1 pick byte)

If the INO record has 0000 at position 49, then we update the pick byte to 'p' so that it will match with file 1

Note that I checked for position 49 instead of 40 as the INIT statement padded 9 spaces before the actual data. So now your actual data starts at 14.

This will match the files and we extract the matched records as well unmatched records from file1

SYSIN - Using an INCLUDE cond we filter out the desired records i.e matched records, BEGIN and END records.

Since you wanted the BYTE count, we once again rebuild the record using INREC. The byte count is in the first 2 bytes of RDW. But a minor caveat here is that it has an additional 1 byte because of the match indicator(?) on the REFORMAT statement. So we need to subtract that 1 from RDW to get the correct length as we don't need that.

Using WHEN=GROUP, we will once again tag the matched group with a seqnum so that we can use that to generate the stats you need at the end.

Using OUTFIL BUILD we remove the temp fields that we added and get the original file as is

Using Reporting features TRAILER1 we get the counts and byte total that you want.


Code:

//STEP0100 EXEC PGM=SORT                                         
//SYSOUT   DD SYSOUT=*                                           
//INA      DD DISP=SHR,DSN=Your input VB file
//INB      DD DISP=SHR,DSN=Same Input VB file     
//SORTOUT  DD SYSOUT=*                                           
//SYSIN    DD *                                                   
  OPTION COPY,VLSHRT,VLSCMP                                       
  JOINKEYS F1=INA,FIELDS=(5,9,A),SORTED,NOSEQCK                   
  JOINKEYS F2=INB,FIELDS=(5,9,A),SORTED,NOSEQCK                   
  JOIN UNPAIRED,F1                                               
  REFORMAT FIELDS=(F1:1,4,                $ RDW                   
                      ?,                  $ MATCH INDICATOR       
                   F1:14)                 $ ACTUAL DATA           
                                                                 
  INCLUDE COND=(5,1,CH,EQ,C'B',OR,                               
                6,5,CH,EQ,C'BEGIN',OR,                           
                6,3,CH,EQ,C'END')                                 
                                                                 
  INREC IFTHEN=(WHEN=INIT,                                       
        BUILD=(1,4,                       $ RDW                   
               8X,                        $ SPACES FOR GROUP COUNT
               1,2,BI,SUB,+1,BI,LENGTH=2, $ BYTE COUNT           
               6)),                       $ ACTUAL DATA 
         
  IFTHEN=(WHEN=GROUP,                                             
         BEGIN=(15,6,CH,EQ,C'STDANN'),                           
          PUSH=(05:ID=8))                                         
                                                                 
  OUTFIL REMOVECC,                                               
        BUILD=(1,4,                       $ RDW                   
               15),                       $ ACTUAL DATA           
                                                                 
  TRAILER1=(5,8,                                                 
            ' NUMBER OF GROUPS EXTRACTED',/,                     
            COUNT-1=(M11,LENGTH=8),                               
            ' NUMBER OF RECORDS EXTRACTED',/,                     
            TOT=(13,2,BI,M11,LENGTH=8),                           
            ' NUMBER OF BYTES EXTRACTED')                         
//*                                                               
//JNF1CNTL DD *                                                   
  OPTION VLSHRT                                                   
  INREC IFTHEN=(WHEN=INIT,                                         
        BUILD=(1,4,                       $ RDW                   
               8X,                        $ SPACES FOR GROUP KEY   
               C'P',                      $ PICK FOR INO REC       
               5)),                       $ ACTUAL DATA           
                                                                   
  IFTHEN=(WHEN=GROUP,                                             
         BEGIN=(14,6,CH,EQ,C'STDANN'),                             
          PUSH=(05:ID=8))                                         
//*                                                               
//JNF2CNTL DD *                                                   
  OPTION VLSCMP                                                   
  INCLUDE COND=(5,3,CH,EQ,C'INO')                                 
  INREC IFOUTLEN=13,                                               
  IFTHEN=(WHEN=INIT,                                               
        BUILD=(1,4,                       $ RDW                   
               SEQNUM,8,ZD,               $ SEQNUM FOR GROUP KEY   
               X,                         $ SPACES FOR INO REC     
               5)),                       $ ACTUAL DATA           
  IFTHEN=(WHEN=(14,3,CH,EQ,C'INO',AND,                             
                49,4,CH,EQ,C'0000'),                               
      OVERLAY=(13:C'P'))                  $ PICK FOR INO REC       
//*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Fab
Beginner


Joined: 15 Nov 2016
Posts: 46
Topics: 5

PostPosted: Thu Apr 06, 2017 7:47 am    Post subject: Reply with quote

Kolusu lol Smile
I am lucky because you always answer me, and this is amazing...the bad part is that we have approx 9 hours gap (you are behind). I spent whole morning editing that file extracting by hand hundreds of groups...this because they did not give me the time to think...the problem here is too urgent for give me time to think...so no I am not confident with joinkeys (not yet)... But I have not finished yet so, if your solution works I would be definitely lucky Smile I am now taking my time to study your code...hope they will leave me quiet Smile


Edited
this morning I was very perplexed because INO is not at the beginning of the group, so I thought it not could be done...
Back to top
View user's profile Send private message
Fab
Beginner


Joined: 15 Nov 2016
Posts: 46
Topics: 5

PostPosted: Thu Apr 06, 2017 9:37 am    Post subject: Reply with quote

It worked, just ending record is not formatted as I would need, it is not a problem at all. I get this
Code:

END03300610017430781001241167297   
00000236 NUMBER OF GROUPS EXTRACTED
00012556 NUMBER OF RECORDS EXTRACTED
00949999 NUMBER OF BYTES EXTRACTED 


I would have needed this
Code:

END00002360000012556000000949999

I will format it by myself, no prob as I said

Great thanks Kolusu
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 11745
Topics: 74
Location: San Jose

PostPosted: Thu Apr 06, 2017 10:36 am    Post subject: Reply with quote

Fab wrote:
It worked, just ending record is not formatted as I would need


Fab,

Glad that worked. I guess I understood it wrong when you showed this

Code:

and finally by the number of bytes extracted, so ending record is composed by 3 counters like this:

END
0000070 number of groups extracted (I will have to csalculate it)
0000003882 number of records extracted except ending one (needs to be calculated)
001241167297 number of bytes extracted (I guess I can just avoid it because probably it will be calculated again later)


I thought you wanted the counters on separate lines.

Do you really need the BEGIN and END from your original input file or can they be generated? If they can be generated it makes it quite easy. You wouldn't need the INCLUDE and JOIN UNPAIRED,F1 statements.


Also what are the lengths of the group count, record count and byte count? I had them at 8 bytes each but you seem to have more
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Fab
Beginner


Joined: 15 Nov 2016
Posts: 46
Topics: 5

PostPosted: Fri Apr 07, 2017 2:49 am    Post subject: Reply with quote

Hi Kolusu,
yes they can be generated, would be perfect to me.
starting record is always the same:
Code:

----+----1----+----2----+----3----+----4---
TSTDEMA                            SXNXSXXX


while the END record is variable because of the counters of course. This is what I have now as output
Code:

----+----1----+----2----+----3--
END03300610017430781001241167297


as you can see it is composed by:
- END (a constant)
- 0330061 a 7 digits counter for the number of groups extracted/elaborated
- 0017430781 a 10 digits counter for the number of records
- 001241167297 a 12 digits counter for to the number of bytes

Thanks again for you amazing support Kolusu.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 11745
Topics: 74
Location: San Jose

PostPosted: Fri Apr 07, 2017 7:17 am    Post subject: Reply with quote

Fab,

Here are the modified untested control cards. However I would like to point out that the header byte count for the header record(TSTDEMA record) and the trailer record (END) is not added to the total of byte count. So your byte count may be off around 80-90 bytes. The header record if I counted correctly is about 47 bytes and the END record is about 43 bytes inclusive of the RDW. If you do need to add these bytes, we can. Let me know if you want them. we would add them to one of the record.

I added +1 to the count to account for the header record. If you don't want to count the header record, then simply change it to COUNT instead of COUNT+1 on the trailer1.

Ideally I would have preferred the counters to be separated by delimiter(|),so that I clearly know where each count ends. something like this
Code:

END0330061|0017430781|001241167297


Either way it is quite easy to add the delimiters on the END record. I will leave it for you to handle it if you need them.

Code:

//SYSIN    DD *                                                     
  OPTION COPY                                                       
  JOINKEYS F1=INA,FIELDS=(5,9,A),SORTED,NOSEQCK                     
  JOINKEYS F2=INB,FIELDS=(5,9,A),SORTED,NOSEQCK                     
                                                                     
  REFORMAT FIELDS=(F1:1,4,                $ RDW                     
                      14)                 $ ACTUAL DATA             
                                                                     
  INREC IFTHEN=(WHEN=INIT,                                           
        BUILD=(1,4,                       $ RDW                     
               7X,                        $ SPACES FOR GROUP COUNT   
               1,2,BI,PD,LENGTH=2,        $ BYTE COUNT               
               5)),                       $ ACTUAL DATA             
                                                                     
  IFTHEN=(WHEN=GROUP,                                               
         BEGIN=(14,6,CH,EQ,C'STDANN'),                               
          PUSH=(05:ID=7))                                           
                                                                     
  OUTFIL REMOVECC,                                                   
        BUILD=(1,4,                       $ RDW                     
               14),                       $ ACTUAL DATA             
                                                                     
  HEADER1=('TSTDEMA',36:'SXNXSXXX'),

  TRAILER1=('END',5,7,                                               
            COUNT+1=(M11,LENGTH=10),                                 
            TOT=(12,2,PD,M11,LENGTH=12))                             
//*
//JNF1CNTL DD *                                                     
  OPTION VLSCMP                                                     
  OMIT COND=(5,7,CH,EQ,C'TSTDEMA',OR,                               
             5,3,CH,EQ,C'END')                                     
                                                                   
  INREC IFTHEN=(WHEN=INIT,                                         
        BUILD=(1,4,                       $ RDW                     
               8X,                        $ SPACES FOR GROUP KEY   
               C'P',                      $ PICK FOR INO REC       
               5)),                       $ ACTUAL DATA             
                                                                   
  IFTHEN=(WHEN=GROUP,                                               
         BEGIN=(14,6,CH,EQ,C'STDANN'),                             
          PUSH=(05:ID=8))                                           
//*                                                                 
//JNF2CNTL DD *                                                     
  OPTION VLSCMP                                                     
  INCLUDE COND=(5,3,CH,EQ,C'INO')                                   
  INREC IFOUTLEN=13,                                               
  IFTHEN=(WHEN=INIT,                                               
        BUILD=(1,4,                       $ RDW                     
               SEQNUM,8,ZD,               $ SEQNUM FOR GROUP KEY   
               X,                         $ SPACES FOR INO REC     
               5)),                       $ ACTUAL DATA             
  IFTHEN=(WHEN=(14,3,CH,EQ,C'INO',AND,                             
                49,4,CH,EQ,C'0000'),                               
      OVERLAY=(13:C'P'))                  $ PICK FOR INO REC       
//*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Fab
Beginner


Joined: 15 Nov 2016
Posts: 46
Topics: 5

PostPosted: Fri Apr 07, 2017 7:55 am    Post subject: Reply with quote

Thanks you very much Kolusu, I will try this very soon
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group