MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Median in DB2

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Mainframe Challenge
View previous topic :: View next topic  
Author Message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 11794
Topics: 74
Location: San Jose

PostPosted: Mon May 02, 2005 10:09 am    Post subject: Median in DB2 Reply with quote

1. Write an Sql to find the median from a table.

The median is the point at which exactly half of the data(sorted in ascending sequence) are above and half below. These halves meet at the median position. If the number of observations is odd, the median fits perfectly and the depth of the median position will be a whole number. If the number of observations is even, the depth of the median position will include a decimal. You need to find the midpoint between the numbers on either side of the median position.

ex: consider the following numbers

Code:

col1
----
10
20
30
40
50
60
70
80
90


Here the total no: of records is 9 which is an odd number, so the median will be 50(exactly the middle value)

Now insert another row with a value of 100. Now the total no: of records is 10 which is an even number. So the median will the avg of 5th and 6th value. ie. (50 + 60) / 2 = 110/2 = 55

Kolusu
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 11794
Topics: 74
Location: San Jose

PostPosted: Thu May 05, 2005 8:02 am    Post subject: Reply with quote

Here is a solution for the challenge.


Let us create a table and insert the values.

Code:

CREATE TABLE T1 (COL1           INTEGER);     

INSERT INTO T1 VALUES(10);
INSERT INTO T1 VALUES(20);
INSERT INTO T1 VALUES(30);
INSERT INTO T1 VALUES(40);
INSERT INTO T1 VALUES(50);
INSERT INTO T1 VALUES(60);
INSERT INTO T1 VALUES(70);
INSERT INTO T1 VALUES(80);
INSERT INTO T1 VALUES(90);


Basically we need to assign the seqnum to the column and pick up the middle value. That can be achieved with the following logic.

Code:

SELECT A.COL1 AS VAL1                                     
      ,COUNT(*)    AS CNT                                 
  FROM T1 A                                               
      ,T1 B                                               
 WHERE A.COL1 >= B.COL1                                   
 GROUP BY A.COL1                                           
 HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =   
                                INTEGER ( COUNT(*) / 1  );


This will produce
Code:

---+---------+---------+--
 VAL1          CNT       
---+---------+---------+--
   10            1       
   20            2       
   30            3       
   40            4       
   50            5       
   60            6       
   70            7       
   80            8       
   90            9       


Once we sequence the column it is easy to pick the middle one , as it is total count divided by 2. However you need to use the round function to get to the integer.

Here the no: of rows is odd , so the median in this case would be 50 which is the exact middle row.

Here is the sql to get it
Code:

SELECT AVG(B.VAL1)                                             
  FROM (SELECT DEC(A.COL1) AS VAL1                             
              ,COUNT(*)    AS CNT                             
          FROM T1 A                                           
              ,T1 B                                           
         WHERE A.COL1 >= B.COL1                               
         GROUP BY A.COL1                                       
         HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =
                                INTEGER ( COUNT(*) / 1  )) B   
 WHERE B.CNT = (SELECT ROUND(DEC(COUNT(*)+0)/2,0) FROM T1)     
    OR B.CNT = (SELECT ROUND(DEC(COUNT(*)+1)/2,0) FROM T1)     
   ;                                                           


The output is
Code:

  50.0000 


Now let us insert another row so that total no: of rows is even.

Code:

INSERT INTO T1 VALUES(100);


Now the median will be an avg of 50 and 60 which are the 5th and 6th rows. And we use the same sql shown above for odd numbers.

Code:

SELECT AVG(B.VAL1)                                             
  FROM (SELECT DEC(A.COL1) AS VAL1                             
              ,COUNT(*)    AS CNT                             
          FROM T1 A                                           
              ,T1 B                                           
         WHERE A.COL1 >= B.COL1                               
         GROUP BY A.COL1                                       
         HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =
                                INTEGER ( COUNT(*) / 1  )) B   
 WHERE B.CNT = (SELECT ROUND(DEC(COUNT(*)+0)/2,0) FROM T1)     
    OR B.CNT = (SELECT ROUND(DEC(COUNT(*)+1)/2,0) FROM T1)     
   ;                                                           


The output is

Code:

 55.0000


Kolusu
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Mainframe Challenge All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group