MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 

Huge Files Mapping

 
rsivananda
Beginner


Joined: 11 Aug 2004
Posts: 30
Topics: 10

Posted: Mon Jun 11, 2007 4:40 am    Post subject: Huge Files Mapping

Hi
I have an input file with SWIFT messages.
The input file has around 1.5 million input messages, and there is the same number of output messages.

The job is to match each input message with its corresponding output message across such huge files.

Could anyone help me out with the best possible way of doing this?

A sample message is given below. There are 1.5 million msgs like this in each file, to be mapped to each other based on certain criteria.

HEADERS F01XXXXXXXXXXXXXXXXXXXXXX
O103XXXXXXXXXXXXXXXXXXXXXXXXXX
119:STP
:20: ABC111111111
:23B:CRED
:32A:07186546463626
:33B:USD100
:50K:ABCVELOPMENT
INTERNATIONAL CORP
:53A:BS12345
:54A:IR12345
:57A:B12345
:59: /200064 333.00
XXXXXXXXXXXX
:71A:ABC
:71F:USD100


Say the filter criteria are the amount and the text in field 32A.

The programming language used is PL/1.

Any help to reduce the effort is greatly appreciated.


Thanks
Siva
CICS Guy
Intermediate


Joined: 30 Apr 2007
Posts: 292
Topics: 3

Posted: Mon Jun 11, 2007 4:58 am    Post subject: Re: Huge Files Mapping

rsivananda wrote:
The job is to match the input message with the corresponding out message with such huge files
Code:
HEADERS  F01XXXXXXXXXXXXXXXXXXXXXX                         
         O103XXXXXXXXXXXXXXXXXXXXXXXXXX
         119:STP                                           
:20: ABC111111111                                 
:23B:CRED                                                   
:32A:07186546463626               
:33B:USD100                                           
:50K:ABCVELOPMENT                                   
     INTERNATIONAL CORP                                     
:53A:BS12345                                         
:54A:IR12345                                         
:57A:B12345                                         
:59: /200064 333.00                                         
     XXXXXXXXXXXX                           
:71A:ABC                                                 
:71F:USD100
Which ones are the 'input message' and which ones are the 'corresponding out message'?
prino
Banned


Joined: 01 Feb 2007
Posts: 45
Topics: 5
Location: Oostende

Posted: Mon Jun 11, 2007 6:04 am

Convert both to one SWIFT message per record with fixed positions for the tags (so insert plenty of blanks or x'00' or x'ff', whatever), making sure you add dummy tags for those that are missing. Then sort both files on the required tag and start reading both, matching data.
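
A minimal sketch of that normalization step, in Python rather than PL/1. The tag list and column widths in LAYOUT are assumptions made up for illustration; a real layout would depend on the message types in the files.

```python
# Hypothetical layout: (tag, column width). Tags and widths are assumptions
# for illustration only, not taken from any SWIFT specification.
LAYOUT = [("20", 20), ("21", 16), ("32A", 24), ("52A", 16), ("57A", 16)]


def to_fixed_record(message, filler=" "):
    """Flatten one multi-line message into a single fixed-width record."""
    fields = {}
    tag = None
    for line in message.splitlines():
        if line.startswith(":") and ":" in line[1:]:
            # A tag line such as ":32A:050103USD88644,47"
            tag, _, value = line[1:].partition(":")
            fields[tag] = value.strip()
        elif (tag is not None and line.strip()
              and line != "-}" and not line.startswith("{")):
            fields[tag] += " " + line.strip()  # continuation of the last tag
    # A missing tag becomes a filler-only column, so offsets never move.
    return "".join(fields.get(t, "").ljust(w, filler)[:w] for t, w in LAYOUT)


sample = "\n".join([
    "{1:F01XXXX}{2:O202XXXX}",
    "{4:",
    ":20: REF0001",
    ":21: ABCABC",
    ":32A:050103USD88644,47",
    ":57A:BANKXXXX",
    "-}",
])
record = to_fixed_record(sample)

# With fixed offsets, sorting on a tag is a plain column sort.
records = sorted([record], key=lambda r: r[36:60])  # 32A column
```

Values longer than their column are truncated here, so the widths would have to be chosen to cover the longest value actually present.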

Robert
Phantom
Data Mgmt Moderator
Data Mgmt Moderator


Joined: 07 Jan 2003
Posts: 1056
Topics: 91
Location: The Blue Planet

Posted: Mon Jun 11, 2007 9:08 am

rsivananda,

You have been on this board for nearly 3 years, yet you did not follow any of the rules.

1. Please provide complete information. Pls don't make us guess - what are the DCB parameters of your files?

2. Make use of BB tags (CODE - /CODE). That formats your data and makes it easy to read.

3. Give us proper samples of input and output files. You are talking about 3 files, but you have given an example of only one. We have no clue whether it is an input or an output file.

4. If you are ok with solutions involving Sort, then please check this
http://www.mvsforums.com/helpboards/viewtopic.php?t=5399

Most of the time, matching data using utilities will be much faster and more efficient than using programming languages.


Thanks,
Phantom
rsivananda
Beginner


Joined: 11 Aug 2004
Posts: 30
Topics: 10

Posted: Mon Jun 11, 2007 10:56 am

Sorry about not posting the DCBs. It's a miss.

Yes, the files are VB files with an LRECL of 32000.

The reason i gave only one file format is that the infile and outfile look the same except for some tags which might change..

I am trying to put these msgs into one-liners and then see if i can sort them on the tags i need and do a compare ....


Thanks for the hints..

Siva
ChrisR
Beginner


Joined: 10 Jun 2007
Posts: 5
Topics: 1

Posted: Mon Jun 11, 2007 12:07 pm

This is also my problem (see posting on Hash and Data Compression above yours http://www.mvsforums.com/helpboards/viewtopic.php?t=8559 ).

We had figured the solution was to pass the data to be matched to a hash routine and generate an index of the much shorter hash keys. A matching hash key does not guarantee a matching record, but would reduce the compares to be made by many millions. Just need to figure out how to invoke one of the many hash routines embedded in IBM Software.
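
To illustrate the idea only: a rough Python sketch of a hash index with a confirming compare. It uses CRC-32 from Python's zlib module as a stand-in hash, which is an assumption for the example and not one of the IBM routines mentioned above. The function names and the key format are hypothetical.

```python
import zlib
from collections import defaultdict


def build_hash_index(keys):
    """Map the 32-bit CRC of each key to the positions of records having it."""
    index = defaultdict(list)
    for pos, key in enumerate(keys):
        index[zlib.crc32(key.encode("utf-8"))].append(pos)
    return index


def find_matches(index, keys, wanted):
    """Look up by hash, then do a full compare to reject hash collisions."""
    candidates = index.get(zlib.crc32(wanted.encode("utf-8")), [])
    return [pos for pos in candidates if keys[pos] == wanted]


# Hypothetical match keys built from tags 21 and 32A.
in_keys = [
    "ABCABC|050103USD88644,47",
    "DEFDEF|050103USD100,00",
    "ABCABC|050103USD88644,47",
]
index = build_hash_index(in_keys)
hits = find_matches(index, in_keys, "ABCABC|050103USD88644,47")
```

The full compare in find_matches is the step that covers the "does not guarantee a matching record" case: only records that share a hash are ever compared in full.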
Chris
CICS Guy
Intermediate


Joined: 30 Apr 2007
Posts: 292
Topics: 3

Posted: Mon Jun 11, 2007 4:04 pm    Post subject: Re: Huge Files Mapping

I ask again:
CICS Guy wrote:
Which ones are the 'input message' and which ones are the 'corresponding out message'?
rsivananda
Beginner


Joined: 11 Aug 2004
Posts: 30
Topics: 10

Posted: Wed Jun 13, 2007 2:42 am

Hi Everyone

I am back again with my problem

Here are the IN and OUT msgs, respectively, as they look in the files.

The files are VB files with an LRECL of 32000.

{1:F01XXXXXXXXXXXXXXXXX}{2:O202XXXXXXXXXXXXXXXXXXXXXX}
{4:
:20: XXXXXXXXXXXXXXXXXXX
:21: ABCABC
:32A:050103USD88644,47
:57A:XXXXXXXXXXX
:58A:XXXXXXXXXXXXXXXX
XXXXXXXXXX
:72: XXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXX
/CHGS/USD210,86/
-}


{1:F01XXXXXXXXXXXXXXXXX}{2:O202XXXXXXXXXXXXXXXXXXXXXX}{3:{XXXXXXXXXXXXXX}}
{4:
:20: XXXXXXXXXXXXX
:21: ABCABC
:32A:050103USD88644,47
:52A:XXXXXXXXXXXXX
:57A:XXXXXXXXXXXX
:58A:XXXXXXXXXX
:72: XXXXXXXXXXXXXXXXXXXX
//VALUE XXXXXXXXXXXXX
//REF XXXXXXXXXXXX
-}


So the daunting task is that there are about 1.5 M msgs in the IN file and about 1.3 M in the OUT file.

Now my task is to map each IN msg to the corresponding OUT msg based on certain references, like below

1. Field :32A: -- Date in the first 6 bytes, followed by the amount
2. :21: Which has the Ref No
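
A small sketch of pulling those two references out of one message, in Python for illustration. It assumes :32A: is laid out as in the samples above (6-digit date, 3-letter currency, amount with a decimal comma); the function name and the shape of the returned key are hypothetical.

```python
import re
from decimal import Decimal

# Assumed layout, taken from the sample ":32A:050103USD88644,47".
FIELD_32A = re.compile(r"^:32A:(\d{6})([A-Z]{3})([\d,]+)[ \t]*$", re.MULTILINE)
FIELD_21 = re.compile(r"^:21:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def match_key(message):
    """Return (reference, date, currency, amount), or None if a tag is missing."""
    m32 = FIELD_32A.search(message)
    m21 = FIELD_21.search(message)
    if not (m32 and m21):
        return None
    date, currency, amount = m32.groups()
    # Convert the decimal comma so the amount compares numerically.
    return (m21.group(1), date, currency, Decimal(amount.replace(",", ".")))


in_msg = "{1:F01X}{2:O202X}\n{4:\n:20: AAA\n:21: ABCABC\n:32A:050103USD88644,47\n-}"
out_msg = ("{1:F01X}{2:O202X}{3:{X}}\n{4:\n:20: BBB\n:21: ABCABC\n"
           ":32A:050103USD88644,47\n:52A:XXXX\n-}")

in_key = match_key(in_msg)
out_key = match_key(out_msg)
```

Both sample messages yield the same key even though the tags sit at different positions and :20: differs.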

I tried putting them in one line to sort them and then compare. However, since the one-liners are fixed length, i couldn't do it with a simple sort.

Can anyone suggest a better way of doing this while i try the programming with PL/1 to achieve it?


Thanks
Siva
bauer
Intermediate


Joined: 10 Oct 2003
Posts: 317
Topics: 50
Location: Germany

Posted: Wed Jun 13, 2007 3:57 am

This is input ????

Code:

{1:F01XXXXXXXXXXXXXXXXX}{2:O202XXXXXXXXXXXXXXXXXXXXXX}
{4:
:20: XXXXXXXXXXXXXXXXXXX
:21: ABCABC
:32A:050103USD88644,47
:57A:XXXXXXXXXXX
:58A:XXXXXXXXXXXXXXXX
XXXXXXXXXX
:72: XXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXX
/CHGS/USD210,86/
-}




and this output ????



Code:

{1:F01XXXXXXXXXXXXXXXXX}{2:O202XXXXXXXXXXXXXXXXXXXXXX}{3:{XXXXXXXXXXXXXX}}
{4:
:20: XXXXXXXXXXXXX
:21: ABCABC
:32A:050103USD88644,47
:52A:XXXXXXXXXXXXX
:57A:XXXXXXXXXXXX
:58A:XXXXXXXXXX
:72: XXXXXXXXXXXXXXXXXXXX
//VALUE XXXXXXXXXXXXX
//REF XXXXXXXXXXXX
-}
rsivananda
Beginner


Joined: 11 Aug 2004
Posts: 30
Topics: 10

Posted: Wed Jun 13, 2007 4:36 am

yes bauer

the input shown above is just one msg and we have 1.5M such msgs

each input begins with {1: and ends with -}
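
A sketch of grouping physical lines into logical messages on those two delimiters, in Python for illustration. It assumes a message starts on a line beginning with {1: and ends on a line that is exactly -} once trailing blanks are removed; the function name is hypothetical.

```python
def split_messages(lines):
    """Yield one logical message (as a single string) per {1: ... -} group."""
    current = []
    for raw in lines:
        line = raw.rstrip()
        if line.startswith("{1:"):
            current = [line]            # start of a new message
        elif current:
            current.append(line)        # body line of the open message
        if current and line == "-}":
            yield "\n".join(current)    # message complete
            current = []


sample_lines = [
    "{1:F01XXXX}{2:O202XXXX}",
    "{4:",
    ":21: ABCABC",
    ":32A:050103USD88644,47",
    "-}",
    "",
    "{1:F01YYYY}{2:O202YYYY}{3:{ZZZZ}}",
    "{4:",
    ":21: DEFDEF",
    ":32A:050103USD100,00",
    "-}",
]
messages = list(split_messages(sample_lines))
```

Lines outside any {1: ... -} group, such as the blank separators between messages, are dropped.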

i need to find the corresponding msg in the outfile for each msg in the infile..

Please let me know for more details...

Siva
CICS Guy
Intermediate


Joined: 30 Apr 2007
Posts: 292
Topics: 3

Posted: Wed Jun 13, 2007 4:52 am

Finally....
Two files containing logical records that can span physical records - or can multiple logical records also share a physical record?
Find matching logical records based upon a key(s) that float somewhere in the logical record.
Is that anywhere near what you need?
dbzTHEdinosauer
Supermod


Joined: 20 Oct 2006
Posts: 1411
Topics: 26
Location: germany

Posted: Wed Jun 13, 2007 5:06 am

everyone,
please refer to this link which will give you a general understanding (confusion??) of the SWIFT msg architecture.

It is a PDF.... it defines the Swift Monetary Core Formats; not sure which MTtype we are playing with here- actually does not matter.

Swift is undergoing changes (and has been continuously for the last 10 years): formats are changing, and Data Centers that went the cheap route to implement Swift originally are caught in a lack-of-forethought trap of their own making, similar to the challenges of EDI. I can only guess, but i imagine that the results of his match merge will/should provide the messages to which the OP's system provided no response, as well as those that did invoke a response.

OP insists that Tags 21 and 32A will always be present.
These are variable length files; meaning the location of the tags can not be expected to be in the same place for any two records (of either file).

I think what the OP wants is to find the 21 & 32A tags of each record, and sort the files with these identified keys. Keep in mind that the length of the data associated with the 21 and 32A Tags is variable.

OP has two files
  1. input file - swift msgs to OP's data center
  2. output file - OP data center responses


I would imagine that the OP needs to
  1. sort/reformat (put a copy of the 21 & 32A Tags in front of each record) each file
  2. match the two sorted files and generate some kind of report
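
For illustration, a Python sketch of step 2: a two-pointer match over two key lists that are already sorted. The keys stand for the 21 & 32A values placed in front of each record; the key format and the function name are assumptions.

```python
def merge_match(in_keys, out_keys):
    """Match two sorted key lists in one pass.

    Returns (matched, in_only, out_only). Duplicate keys pair off one-to-one.
    """
    matched, in_only, out_only = [], [], []
    i = j = 0
    while i < len(in_keys) and j < len(out_keys):
        if in_keys[i] == out_keys[j]:
            matched.append(in_keys[i])
            i += 1
            j += 1
        elif in_keys[i] < out_keys[j]:
            in_only.append(in_keys[i])    # input message with no response
            i += 1
        else:
            out_only.append(out_keys[j])  # response with no input message
            j += 1
    in_only.extend(in_keys[i:])           # leftovers after one list runs out
    out_only.extend(out_keys[j:])
    return matched, in_only, out_only


in_sorted = sorted(["B|2", "A|1", "C|3"])
out_sorted = sorted(["C|3", "A|1", "D|4"])
matched, in_only, out_only = merge_match(in_sorted, out_sorted)
```

Each file is read once, so the cost after sorting is linear in the total number of messages. The in_only list is the "no response" report mentioned above.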


Prino has obviously encountered this situation before and has suggested a method whereby the two files are normalized (given a fixed structure) to simplify the SORTs and then the matching logic in PL/1.

I am not familiar with PL/1 and do not know the limitations when it comes to dealing with undefined structures - parsing.
_________________
Dick Brenholtz
American living in Varel, Germany


Last edited by dbzTHEdinosauer on Wed Jun 13, 2007 5:32 am; edited 1 time in total
dbzTHEdinosauer
Supermod


Joined: 20 Oct 2006
Posts: 1411
Topics: 26
Location: germany

Posted: Wed Jun 13, 2007 5:30 am

possibly the parse function of sort can be used to generate sorted files if the OP does not want to 'restructure' his files and is willing to deal with the necessary parsing logic in a PL/1 report pgm.
_________________
Dick Brenholtz
American living in Varel, Germany
semigeezer
Supermod


Joined: 03 Jan 2003
Posts: 1014
Topics: 13
Location: Atlantis

Posted: Wed Jun 13, 2007 4:12 pm

first impression:
Read and reformat each record (or set, as it were) into a single data structure of variable length. Link these data structures into a balanced binary tree and then read the output file, searching the tree for each record. Should be very fast IF the whole thing can fit in storage. I'd think that it can (say each record is 200 bytes: 1.5 million records = 300 Meg, plus a few more for overhead). If not, just store the keys and record offsets in the tree. A balanced tree search is O(log2 n) search time, so you have at most log2(1500000), or about 21, comparisons per output record.
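
A sketch of the keys-plus-offsets variant, in Python for illustration. Python's standard library has no balanced tree, so this swaps in a sorted list searched with the bisect module; a binary search over a sorted list has the same log2 n bound per lookup. The keys, offsets and function names are hypothetical.

```python
import bisect
import math

# Worst-case comparisons per lookup for 1.5 million keys.
worst_case = math.ceil(math.log2(1_500_000))


def build_index(keyed_offsets):
    """Sort (key, offset) pairs once; return parallel key and offset lists."""
    pairs = sorted(keyed_offsets)
    return [k for k, _ in pairs], [o for _, o in pairs]


def lookup(keys, offsets, wanted):
    """Binary-search the sorted keys; return the record offset, or None."""
    i = bisect.bisect_left(keys, wanted)
    if i < len(keys) and keys[i] == wanted:
        return offsets[i]
    return None


# Hypothetical (match key, byte offset in the input file) pairs.
keys, offsets = build_index([("C|3", 300), ("A|1", 0), ("B|2", 120)])
found = lookup(keys, offsets, "B|2")
missing = lookup(keys, offsets, "Z|9")
```

Unlike a tree, the sorted list is built once and not updated afterwards, which fits here because the input file is read completely before the output file is searched.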
All times are GMT - 5 Hours