MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Same query different CPU time with different host value

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Database
View previous topic :: View next topic  
Author Message
rsantosh
Beginner


Joined: 20 Dec 2014
Posts: 38
Topics: 9

PostPosted: Sat Jun 20, 2020 12:35 pm    Post subject: Same query different CPU time with different host value Reply with quote

Hello,

I have a SP which performance differently depending on host variable value.

e.g
Code:

SELECT BRANCH,ACCOUNT,SECURITY
FROM TABLE1
WHERE SECURITY LIKE :HV1
     AND BRANCH BETWEEN '300' AND '600'


Index IX1 has two columns SECURITY & BRANCH(duplicate Index). SECURITY is CHAR(4)

When HV1 is say "ABCD" query is very fast but when HV1 is say "ADEF" query is taking considerably high time.

Why is there such a big difference? What can be done to overcome this issue?

Actual query is little complex but EXPLAIN does show it is using the Index scan.

Regards,
Santosh
Back to top
View user's profile Send private message
haatvedt
Beginner


Joined: 14 Nov 2003
Posts: 66
Topics: 0
Location: St Cloud, Minnesota USA

PostPosted: Sat Jun 20, 2020 1:08 pm    Post subject: Reply with quote

Santosh,

I suspect that the data is skewed and does not have a uniform distribution. I have a few questions.

How many rows exist on the table with SECURITY = "ABCD" ?

How many rows exist on the table with SECURITY = "ADEF" ?

Does the program ever use a "%" in the host variable HV1 ? if not then change the predicate to use an equal condition as follows

WHERE SECURITY = :HV1
AND BRANCH BETWEEN '300' AND '600'
_________________
Chuck Haatvedt

email --> clastnameatcharterdotnet

(replace lastname, at, dot with appropriate
characters)
Back to top
View user's profile Send private message
rsantosh
Beginner


Joined: 20 Dec 2014
Posts: 38
Topics: 9

PostPosted: Sun Jun 21, 2020 6:55 am    Post subject: Reply with quote

Hi Chuck,

Program doesn't use %. Table has two index,Unique IX1 on SECURITY,BRANCH,ACCOUNT+some more columns and Unique IX2 on BRANCH,ACCOUNT,SECURITY+some more columns.

When we used SECURITY = :HV1, explain showed very high cost(500+) and use of Index IX1 so we replaced "="(equal) with "LIKE" to force db2 to go for INDEX IX2(explain showed very low cost 0.8).

But with SPUFI(same query) what we have observed is sometimes "LIKE" works better and sometimes "=".

Table has 80M rows.

Regards,
Santosh
Back to top
View user's profile Send private message
haatvedt
Beginner


Joined: 14 Nov 2003
Posts: 66
Topics: 0
Location: St Cloud, Minnesota USA

PostPosted: Sun Jun 21, 2020 5:54 pm    Post subject: Reply with quote

Santosh, optimizing SQL queries is not trivial, if you want to get more help on this perhaps you could send me an email (see my signature).

remember that the explain cost is an ESTIMATE based on the statistics that DB2 has available to it. it may be worthwhile to check and see how current the stats are on the table and indexes. the DBA should be able to help with this if you don't have access to that information.

it's difficult to determine the optimal access path without seeing the DB2 stats and also doing some queries to determine the distribution of the key values. also it would be helpful to know the cluster ratio of the 2 indexes you referred to.

it would appear that using SECURITY = :HV1 should be most efficient as it is matching on 2 columns on the X1 index (note it has an equal predicate on the first column and would be scanning on the second column). for X2 index it would only have a 1 column match and that would be scanning on BRANCH.

using SPUFI to do any type of performance testing is not reliable.

the best way to test the performance is to generate a file containing the values to be used in your SQL query. the file should contain at least a few thousand records with randomly generated values.

what i usually do is to create a small program which contain both version of the SQL statement that I am testing with a runtime switch which will determine which version of the SQL statement to execute.

then create a batch job with 2 steps both executing the test program with the same run time switch and input file. the create a second job with the same conditions other than the run time switch which should execute the other version of the SQL statement.

then run both jobs on the same LPAR and compare the cpu time of the second steps of the jobs.

the reason to use two steps is that the cpu time is highly dependent on whether the DB2 data is in memory or not. By running the steps twice in the same job, the second step of both jobs should have nearly the same benefit of having the data in memory as the first step would access the same data.

did you run a SELECT COUNT(*) on the two values of the SECURITY to see if the data is skewed ?
_________________
Chuck Haatvedt

email --> clastnameatcharterdotnet

(replace lastname, at, dot with appropriate
characters)
Back to top
View user's profile Send private message
rsantosh
Beginner


Joined: 20 Dec 2014
Posts: 38
Topics: 9

PostPosted: Tue Jun 23, 2020 11:34 am    Post subject: Reply with quote

Hi Chuck,

Thank you for the detailed info.

Today when I ran EXPLAIN on the query I saw it using same INDEX with both "LIKE" and "=(equal)" unlike the other day.

LIKE :HV1 vs = :HV1

I am puzzled now.

I am trying to see if I can get the EXPLAIN result after masking the data. I understand without the full result it would be difficult for anyone to provide solution.

Regards,
Santosh
Back to top
View user's profile Send private message
rsantosh
Beginner


Joined: 20 Dec 2014
Posts: 38
Topics: 9

PostPosted: Tue Jun 23, 2020 11:50 am    Post subject: Reply with quote

Just to add, the table has 80M rows as of today and everyday 20K+ rows get added to it. Would it be better to use REOPT option to get the access determined during run time?

Regards,Santosh
Back to top
View user's profile Send private message
rsantosh
Beginner


Joined: 20 Dec 2014
Posts: 38
Topics: 9

PostPosted: Wed Jun 24, 2020 12:02 pm    Post subject: Reply with quote

What are the cons of using bind option REOPT(ALWAYS)? Is it advisable to use this option?

Regards,
Santosh
Back to top
View user's profile Send private message
haatvedt
Beginner


Joined: 14 Nov 2003
Posts: 66
Topics: 0
Location: St Cloud, Minnesota USA

PostPosted: Wed Jun 24, 2020 3:58 pm    Post subject: Reply with quote

Santosh,

REOPT(ALWAYS) is something that I've seen used when the data is skewed. So far I don't know if you have shown that the data distribution is skewed. If the data is not skewed or you are access the more frequently occurring SECURITY & BRANCH ranges, then the extra overhead used to reoptimize the SQL statement would not result in any saving and would actually increase the cost. Again the way to determine this is to execute the SQL statement many times....

when looking at the explain data, did you make note of the number of matching columns ?

it would be helpful to know the cardinality of the SECURITY and BRANCH columns.

have you written a small batch program to test the performance of different versions of the query ?

remember that EXPLAIN only provides an ESTIMATE of the cost, the best way to test the performance of an SQL statement is to actually EXECUTE it many times to get an average cost. The query below will give you a list of the worst performing combinations of SECURITY / BRANCH combinations. This will highlight issues with skewed data which I suspect is contributing to the performance issues.

execute the following query to generate the test data

SELECT SECURITY, BRANCH, COUNT(*)
FROM TABLE1
GROUP BY SECURITY, BRANCH
ORDER BY 3 DESC
FETCH FIRST 500 ROWS ONLY
WITH UR;

then execute your test program in batch with the output of the query above concatenating the file from above 4 times so it will execute your SQL statement 2,000 times.
_________________
Chuck Haatvedt

email --> clastnameatcharterdotnet

(replace lastname, at, dot with appropriate
characters)
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Database All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group