Median in DB2
Select messages from
# through # FAQ
[/[Print]\]

MVSFORUMS.com -> Mainframe Challenge

#1: Median in DB2 Author: kolusuLocation: San Jose PostPosted: Mon May 02, 2005 10:09 am
    —
1. Write an Sql to find the median from a table.

The median is the point at which exactly half of the data(sorted in ascending sequence) are above and half below. These halves meet at the median position. If the number of observations is odd, the median fits perfectly and the depth of the median position will be a whole number. If the number of observations is even, the depth of the median position will include a decimal. You need to find the midpoint between the numbers on either side of the median position.

ex: consider the following numbers

Code:

col1
----
10
20
30
40
50
60
70
80
90


Here the total no: of records is 9 which is an odd number, so the median will be 50(exactly the middle value)

Now insert another row with a value of 100. Now the total no: of records is 10 which is an even number. So the median will the avg of 5th and 6th value. ie. (50 + 60) / 2 = 110/2 = 55

Kolusu

#2:  Author: kolusuLocation: San Jose PostPosted: Thu May 05, 2005 8:02 am
    —
Here is a solution for the challenge.


Let us create a table and insert the values.

Code:

CREATE TABLE T1 (COL1           INTEGER);     

INSERT INTO T1 VALUES(10);
INSERT INTO T1 VALUES(20);
INSERT INTO T1 VALUES(30);
INSERT INTO T1 VALUES(40);
INSERT INTO T1 VALUES(50);
INSERT INTO T1 VALUES(60);
INSERT INTO T1 VALUES(70);
INSERT INTO T1 VALUES(80);
INSERT INTO T1 VALUES(90);


Basically we need to assign the seqnum to the column and pick up the middle value. That can be achieved with the following logic.

Code:

SELECT A.COL1 AS VAL1                                     
      ,COUNT(*)    AS CNT                                 
  FROM T1 A                                               
      ,T1 B                                               
 WHERE A.COL1 >= B.COL1                                   
 GROUP BY A.COL1                                           
 HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =   
                                INTEGER ( COUNT(*) / 1  );


This will produce
Code:

---+---------+---------+--
 VAL1          CNT       
---+---------+---------+--
   10            1       
   20            2       
   30            3       
   40            4       
   50            5       
   60            6       
   70            7       
   80            8       
   90            9       


Once we sequence the column it is easy to pick the middle one , as it is total count divided by 2. However you need to use the round function to get to the integer.

Here the no: of rows is odd , so the median in this case would be 50 which is the exact middle row.

Here is the sql to get it
Code:

SELECT AVG(B.VAL1)                                             
  FROM (SELECT DEC(A.COL1) AS VAL1                             
              ,COUNT(*)    AS CNT                             
          FROM T1 A                                           
              ,T1 B                                           
         WHERE A.COL1 >= B.COL1                               
         GROUP BY A.COL1                                       
         HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =
                                INTEGER ( COUNT(*) / 1  )) B   
 WHERE B.CNT = (SELECT ROUND(DEC(COUNT(*)+0)/2,0) FROM T1)     
    OR B.CNT = (SELECT ROUND(DEC(COUNT(*)+1)/2,0) FROM T1)     
   ;                                                           


The output is
Code:

  50.0000 


Now let us insert another row so that total no: of rows is even.

Code:

INSERT INTO T1 VALUES(100);


Now the median will be an avg of 50 and 60 which are the 5th and 6th rows. And we use the same sql shown above for odd numbers.

Code:

SELECT AVG(B.VAL1)                                             
  FROM (SELECT DEC(A.COL1) AS VAL1                             
              ,COUNT(*)    AS CNT                             
          FROM T1 A                                           
              ,T1 B                                           
         WHERE A.COL1 >= B.COL1                               
         GROUP BY A.COL1                                       
         HAVING VALUE ( DECIMAL (COUNT(*)  ,7, 2  ) / 1 , 0 ) =
                                INTEGER ( COUNT(*) / 1  )) B   
 WHERE B.CNT = (SELECT ROUND(DEC(COUNT(*)+0)/2,0) FROM T1)     
    OR B.CNT = (SELECT ROUND(DEC(COUNT(*)+1)/2,0) FROM T1)     
   ;                                                           


The output is

Code:

 55.0000


Kolusu



MVSFORUMS.com -> Mainframe Challenge


output generated using printer-friendly topic mod. All times are GMT - 5 Hours

Page 1 of 1

Powered by phpBB © 2001, 2005 phpBB Group