MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Find replace duplicate apostrophe in a varying length string

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Tue Sep 08, 2020 12:09 pm    Post subject: Find replace duplicate apostrophe in a varying length string Reply with quote

Need help. There is a need to replace duplicate double quotes (apostrophe) in a varying length string in a FB file having LRECL as 133. Here's the example
Input:
Code:
05,09082020,1234536666,"'ABC,DEF'",123456-,123456,123456789,0,"JOHN CARPENTER : "FIRST TEXT" END STRING",12,""GHIJKL""


Desired Output:
Code:
05,09082020,1234536666,"'ABC,DEF'",123456-,123456,123456789,0,"JOHN CARPENTER : 'FIRST TEXT' END STRING",12,"'GHIJKL'"


This is a comma separated file and the length of each field in the record varies. If you look closer, I need to replace the duplicate Apostrophe with single quote in the fields.
1. First duplicate
From
Code:

"JOHN CARPENTER : "FIRST TEXT" END STRING"

To
Code:

"JOHN CARPENTER : 'FIRST TEXT' END STRING"

2. Second duplicate
From
Code:

""GHIJKL""

To
Code:

"'GHIJKL'"


I tried thinking to use PARSE, IFTHEN , FINDREP but couldn't get closer. I did search in the forum to the best possible extent and couldn't find a closing match. Can you please help?
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Tue Sep 08, 2020 12:45 pm    Post subject: Reply with quote

nachiyappan,


Code:

Comma        = X'6B'
Single quote = X'7D'
Double quote = X'7F'


so the instances you have are
Code:

comma + double quote + double quote
double quote + double quote + Space (the last field)

so use a FINDREP to replace them using hex characters.

Now we need to handle the inner double quotes. (ex: First string)

You can chek for
Code:

Space + Double quote
Double quote + Space


You can use anoter FINDREP to replace these.

so something like this
Code:

//STEP0100 EXEC PGM=SORT                         
//SYSOUT   DD SYSOUT=*                           
//SORTIN   DD DISP=SHR,DSN=Your Input file 
//SORTOUT  DD SYSOUT=*                           
//SYSIN    DD *                                   
  OPTION COPY                                     
  INREC FINDREP=(INOUT=(X'6B7F7F',X'6B7F7D',     
                        X'7F7F40',X'7D7F40'))     
                                                 
  OUTREC FINDREP=(INOUT=(X'407F',X'407D',         
                         X'7F40',X'7D40'))       
/*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Tue Sep 08, 2020 1:11 pm    Post subject: Reply with quote

Thanks Srihari Kolusu very much. The inner double quotes are not always followed by a space, or in other words there is no known pattern. If I had to write a COBOL code, the my logic would be to take the length of each string and then omit the first and last apostrophe and replace any apostrophe in between to a single quote. Can this be done in DFSORT?
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Tue Sep 08, 2020 1:34 pm    Post subject: Reply with quote

nachiyappan wrote:
Thanks Srihari Kolusu very much. The inner double quotes are not always followed by a space, or in other words there is no known pattern. If I had to write a COBOL code, the my logic would be to take the length of each string and then omit the first and last apostrophe and replace any apostrophe in between to a single quote. Can this be done in DFSORT?


Well we can PARSE and remove the Double quotes with the string. From your sample you have 12 parsed fields. So if you can let us know the fields that needs the double quotes to removed, then we can handle that.

btw this field does NOT make any sense at all

Code:

"'ABC,DEF'",


If the above field is a comma separated field, why does it have quotes as if it is a single field. Are they 2 different fields with mismatched quotes?

Should it be
Code:

"'ABC","DEF'",

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Tue Sep 08, 2020 3:19 pm    Post subject: Reply with quote

Hi Kolusu, There are only 11 fields.
Code:

"'ABC,DEF'"

is one field and is a name field with first and last name separated by a comma. The comma in there is not part of the CSV formatting and hence included in Apostrophe.

The fields that need the duplicate double quotes to be removed are 9 and 11. Thank you!
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Tue Sep 08, 2020 3:51 pm    Post subject: Reply with quote

nachiyappan wrote:
Hi Kolusu, There are only 11 fields.

The fields that need the duplicate double quotes to be removed are 9 and 11. Thank you!


Use the following control cards. I assumed that the largest field you have is 50 bytes.

Code:

//SYSIN    DD *                                                   
  OPTION COPY                                                     
  INREC IFTHEN=(WHEN=INIT,                                       
         PARSE=(%01=(ENDBEFR=C',',FIXLEN=50),                     
                %02=(ENDBEFR=C',',FIXLEN=50),                     
                %03=(ENDBEFR=C',',FIXLEN=50),                     
                %04=(ENDAT=X'7D7F',FIXLEN=50),                   
                %05=(ADDPOS=1,ENDBEFR=C',',FIXLEN=50),           
                %06=(ENDBEFR=C',',FIXLEN=50),                     
                %07=(ENDBEFR=C',',FIXLEN=50),                     
                %08=(ENDBEFR=C',',FIXLEN=50),                     
                %09=(ENDBEFR=C',',FIXLEN=52),                     
                %10=(ENDBEFR=C',',FIXLEN=50),                     
                %11=(ENDBEFR=C',',FIXLEN=52)),                   
                                                                 
         BUILD=(001:%01,                                         
                051:%02,                                         
                101:%03,                                         
                151:%04,                                         
                201:%05,                                         
                251:%06,                                         
                301:%07,                                         
                351:%08,                                         
                401:52X,                                         
                453:%10,                                         
                503:52X,                                         
                555:%09,JFY=(SHIFT=LEFT,LEAD=C'@',TRAIL=C'@'),   
                607:%11,JFY=(SHIFT=LEFT,LEAD=C'@',TRAIL=C'@'))), 
                                                                 
** SAVE THE BEGIN AND END DOUBLE QUOTE FOR FIELDS 9 & 11  **     
                                                                 
        IFTHEN=(WHEN=INIT,                                       
       FINDREP=(STARTPOS=555,                                     
                INOUT=(C'@"',C'@@',                               
                       C'"@',C'@@'))),                           
                                                                 
** REPLACE THE DOUBLE QUOTE IN BETWEEN FOR FIELDS 9 & 11   **   
                                                               
        IFTHEN=(WHEN=INIT,                                     
       FINDREP=(STARTPOS=555,                                 
                INOUT=(C'"',C''''))),                         
                                                               
** PUT BACK THE 9TH AND 11 FIELD IN THEIR ORIGINAL POSITION **

        IFTHEN=(WHEN=INIT,                                     
       OVERLAY=(401:555,52,                                   
                503:607,52,                                   
                555:104X)),                                   
                                                               
** PUT BACK THE DOUBLE QUOTE FOR 9TH AND 11TH FIELDS        **
                                                               
        IFTHEN=(WHEN=INIT,                                     
       FINDREP=(STARTPOS=401,                                 
                  ENDPOS=554,                                 
                INOUT=(C'@@',C'"'))),                         
                                                               
** RECONSTRUCT THE RECORD AS IS                             **
                                                               
        IFTHEN=(WHEN=INIT,                                     
        BUILD=(1,554,SQZ=(SHIFT=LEFT,                         
                            MID=C',',                         
                         LENGTH=133,                           
                           PAIR=QUOTE)))                       
/*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Tue Sep 08, 2020 4:05 pm    Post subject: Reply with quote

Hi Kolusu, Awesome, I can understand the logic , will try this and let you know the outcome. Thanks a lot !!
Back to top
View user's profile Send private message
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Thu Sep 10, 2020 9:11 am    Post subject: Reply with quote

Hi Kolusu, for the example I provided , this approach worked really well. Thank a lot for providing this valuable solution. I have another challenge. If any fields are blank in the subsequent records, then they are getting omitted, but I want to keep them intact. Here's the example for this situation

Input:
Code:

05,09082020,1234536666,"'ABC,DEF'",123456-,123456,123456789,0,"JOHN CARPENTER : "FIRST TEXT" END STRING",12,""GHIJKL"" 
05,09082020,1234536666,"'ABC,DEF'",,,123456789,0,"JOHN CARPENTER : "FIRST TEXT" END STRING",12,""GHIJKL"" 

Output:
Code:

05,09082020,1234536666,"'ABC,DEF'",123456-,123456,123456789,0,"JOHN CARPENTER : 'FIRST TEXT' END STRING",12,"'GHIJKL'"
05,09082020,1234536666,"'ABC,DEF'",123456789,0,"JOHN CARPENTER : 'FIRST TEXT' END STRING",12,"'GHIJKL'"   


the desired output is
Code:

05,09082020,1234536666,"'ABC,DEF'",123456-,123456,123456789,0,"JOHN CARPENTER : 'FIRST TEXT' END STRING",12,"'GHIJKL'"
05,09082020,1234536666,"'ABC,DEF'",,,123456789,0,"JOHN CARPENTER : 'FIRST TEXT' END STRING",12,"'GHIJKL'"   


The 5th and 6th columns had blank in the input which got omitted in the output but I wanted to keep that intact. Can this be done? Thanks much in advance.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12367
Topics: 75
Location: San Jose

PostPosted: Thu Sep 10, 2020 10:27 am    Post subject: Reply with quote

nachiyappan,

You need to use square brackets for code tags. I have been editing them for you. Here is an example of how to use code tags

https://www.mvsforums.com/helpboards/viewtopic.php?p=19031#19031

Now coming to your new requirement, it is easy to handle. In the parse itself we can add a lead character to identify the empty strings. I used '#' to tag the empty fields.

Code:

//SYSIN    DD *                                                   
  OPTION COPY                                                     
  INREC IFOUTLEN=133,                                             
        IFTHEN=(WHEN=INIT,                                         
         PARSE=(%01=(ENDBEFR=C',',FIXLEN=50),                     
                %02=(ENDBEFR=C',',FIXLEN=50),                     
                %03=(ENDBEFR=C',',FIXLEN=50),                     
                %04=(ENDAT=X'7D7F',FIXLEN=50),                     
                %05=(ADDPOS=1,ENDBEFR=C',',FIXLEN=50),             
                %06=(ENDBEFR=C',',FIXLEN=50),                     
                %07=(ENDBEFR=C',',FIXLEN=50),                     
                %08=(ENDBEFR=C',',FIXLEN=50),                     
                %09=(ENDBEFR=C',',FIXLEN=52),                     
                %10=(ENDBEFR=C',',FIXLEN=50),                     
                %11=(ENDBEFR=C',',FIXLEN=52)),                     
                                                                   
         BUILD=(001:%01,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                051:%02,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                101:%03,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                151:%04,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                201:%05,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                251:%06,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                301:%07,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                351:%08,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                401:52X,                                           
                453:%10,JFY=(SHIFT=LEFT,LEAD=C'#'),               
                503:52X,                                           
                555:%09,JFY=(SHIFT=LEFT,LEAD=C'@',TRAIL=C'@'),     
                607:%11,JFY=(SHIFT=LEFT,LEAD=C'@',TRAIL=C'@'))),   
                                                                   
** SAVE THE BEGIN AND END DOUBLE QUOTE FOR FIELDS 9 & 11  **       
                                                                   
        IFTHEN=(WHEN=INIT,                                         
       FINDREP=(STARTPOS=555,                                     
                INOUT=(C'@"',C'@@',                               
                       C'"@',C'@@'))),                             
                                                                   
** REPLACE THE DOUBLE QUOTE IN BETWEEN FOR FIELDS 9 & 11   **     
                                                                   
        IFTHEN=(WHEN=INIT,                                         
       FINDREP=(STARTPOS=555,                                     
                INOUT=(C'"',C''''))),                             
                                                                   
** PUT BACK THE 9TH AND 11 FIELD IN THEIR ORIGINAL POSITION **     
        IFTHEN=(WHEN=INIT,                                         
       OVERLAY=(401:555,52,                                       
                503:607,52,                                       
                555:104X)),                                       
                                                                   
** PUT BACK THE DOUBLE QUOTE FOR 9TH AND 11TH FIELDS        **     
                                                                   
        IFTHEN=(WHEN=INIT,                                         
       FINDREP=(STARTPOS=401,                                     
                  ENDPOS=554,                                     
                INOUT=(C'@@',C'"'))),                             
                                                                   
** RECONSTRUCT THE RECORD AS IS                             **     
                                                                   
        IFTHEN=(WHEN=INIT,                                         
        BUILD=(1,554,SQZ=(SHIFT=LEFT,                             
                            MID=C',',                             
                           PAIR=QUOTE))),                         
                                                                   
** REMOVE THE DELIMITER # WHICH IS USED TO SAVE EMPTY FIELDS**     
                                                                   
        IFTHEN=(WHEN=INIT,                                         
       FINDREP=(INOUT=(C'#',C'')))                                 
/*                                                                 

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
nachiyappan
Beginner


Joined: 08 Sep 2020
Posts: 8
Topics: 2

PostPosted: Thu Sep 10, 2020 1:41 pm    Post subject: Reply with quote

Hi Kolusu,
In fact, I tried this already and really glad that this worked out. The work which may span into multiple days of development definitely shortened because of your help. Thank you very much.

I will keep note to use square brackets for code tags going forward.

Thanks,
Nachi
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group