Tuesday, 22 February 2011

Usefulness of WITH Clause - SQL Simplicity & Performance Improvement


The WITH clause, or sub-query factoring clause, is part of the SQL-99 standard and was added into the Oracle SQL syntax in Oracle 9.2. Through this article we will try to find out how we can use WITH clause.


[Note: I'm not suggesting that the queries that I have stated below as examples are the best way to retrieve the required information. They merely demonstrate the use of the WITH clause.]

There are many times when we use same sub-query multiple times in a single SQL. Or we write multiple SQLs that run consecutively and that use same sub-query.

Using WITH clause we can run the sub-query once and use that as reference in the SQL multiple times. This reduces the complexity of the SQLs and improves the performance as well. Lets see one example here…
Example:
We have 2 tables as follows:
SQL> DESC TEST_EMP
Name     Type         Nullable Default Comments
-------- ------------ -------- ------- --------
EMP_NO   NUMBER(5)    Y                        
EMP_NAME VARCHAR2(50) Y                        

SQL>
SQL>
SQL> DESC TEST_EMP_SAL
Name      Type         Nullable Default Comments
--------- ------------ -------- ------- --------
FIN_YEAR  NUMBER(4)    Y                         
EMP_NO    NUMBER(5)    Y                        
SAL_MONTH VARCHAR2(3)  Y                        
MTH_SAL   NUMBER(10,2) Y                        
YTD_SAL   NUMBER(10,2) Y    
                    

The first table (TEST_EMP) stores employee information. The second table (TEST_EMP_SAL) stores the salary given to each employee for each month (MTH_SAL) and also the year to date salary for the employee for each month (YTD_SAL).

So, the data in TEST_EMP is as follows:
SQL> select * from TEST_EMP;
EMP_NO EMP_NAME
------ --------------------------------------------------
  1001 John
  1002 Peter
  1003 Allan
  1004 Steve
  1005 Mary

The data in TEST_EMP_SAL for an employee (e.g. EMP_NO = 1001) is as follows:
SQL> select * from TEST_EMP_SAL where emp_no = 1001;
FIN_YEAR EMP_NO SAL_MONTH      MTH_SAL      YTD_SAL
-------- ------ --------- ------------ ------------
    2009   1001 JAN            4000.00      4000.00
    2009   1001 FEB            4200.00      8200.00
    2009   1001 MAR            4300.00     12500.00
    2009   1001 APR            4100.00     16600.00
    2009   1001 MAY            4000.00     20600.00
    2009   1001 JUN            4000.00     24600.00
    2009   1001 JUL            4200.00     28800.00
    2009   1001 AUG            4300.00     33100.00
    2009   1001 SEP            4100.00     37200.00
    2009   1001 OCT            4000.00     41200.00
    2009   1001 NOV            4100.00     45300.00
    2009   1001 DEC            4000.00     49300.00


Now as per some requirement we need to select the MTH_SAL and YTD_SAL for each employee for all the months of the year in a tabular format like…
EMP_NO
EMP_NAME
MTH JAN SAL
MTH FEB SAL
...
…and
EMP_NO
EMP_NAME
YTD JAN SAL
YTD FEB SAL
...

So, we used the following SQL to generate the report:
-- Getting MTH Salary
SELECT t2.fin_year, t1.emp_no, t1.emp_name, 'MTH SALARY',
       sum(decode(t2.sal_month, 'JAN', mth_sal)) MTH_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', mth_sal)) MTH_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', mth_sal)) MTH_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', mth_sal)) MTH_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', mth_sal)) MTH_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', mth_sal)) MTH_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', mth_sal)) MTH_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', mth_sal)) MTH_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', mth_sal)) MTH_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', mth_sal)) MTH_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', mth_sal)) MTH_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', mth_sal)) MTH_SAL_DEC      
FROM  test_emp t1, test_emp_sal t2
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name, 'MTH SALARY'
UNION ALL

-- Getting YTD Salary
SELECT t2.fin_year, t1.emp_no, t1.emp_name, 'YTD SALARY',
       sum(decode(t2.sal_month, 'JAN', ytd_sal)) YTD_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', ytd_sal)) YTD_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', ytd_sal)) YTD_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', ytd_sal)) YTD_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', ytd_sal)) YTD_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', ytd_sal)) YTD_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', ytd_sal)) YTD_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', ytd_sal)) YTD_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', ytd_sal)) YTD_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', ytd_sal)) YTD_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', ytd_sal)) YTD_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', ytd_sal)) YTD_SAL_DEC      
FROM  test_emp t1, test_emp_sal t2                  
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name, 'YTD SALARY';

The EXPLAIN PLAN of the above SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                    44        26       533
 SORT                                                               44        26       533
  UNION-ALL                                                                               
   SORT                                                             22        13       260
    HASH JOIN                                                        5        60      1200
     TABLE ACCESS   CDW_ODS             TEST_EMP                     2         5        40
     TABLE ACCESS   CDW_ODS             TEST_EMP_SAL                 2        60       720
   SORT                                                             22        13       273
    HASH JOIN                                                        5        60      1260
     TABLE ACCESS   CDW_ODS             TEST_EMP                     2         5        40
     TABLE ACCESS   CDW_ODS             TEST_EMP_SAL                 2        60       780
------------------------------------------------------------------------------------------


So, we are using the same kind of SQLs to fetch MTH and YTD data. Now using WITH clause we can re-write the SQL as follows:
WITH emp_sal_det AS
       -- Getting MTH and YTD Salary in one go
      (SELECT t2.fin_year, t1.emp_no, t1.emp_name,
       sum(decode(t2.sal_month, 'JAN', mth_sal)) MTH_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', mth_sal)) MTH_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', mth_sal)) MTH_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', mth_sal)) MTH_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', mth_sal)) MTH_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', mth_sal)) MTH_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', mth_sal)) MTH_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', mth_sal)) MTH_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', mth_sal)) MTH_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', mth_sal)) MTH_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', mth_sal)) MTH_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', mth_sal)) MTH_SAL_DEC,
       sum(decode(t2.sal_month, 'JAN', ytd_sal)) YTD_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', ytd_sal)) YTD_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', ytd_sal)) YTD_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', ytd_sal)) YTD_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', ytd_sal)) YTD_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', ytd_sal)) YTD_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', ytd_sal)) YTD_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', ytd_sal)) YTD_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', ytd_sal)) YTD_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', ytd_sal)) YTD_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', ytd_sal)) YTD_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', ytd_sal)) YTD_SAL_DEC             
FROM  test_emp t1, test_emp_sal t2
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name)
SELECT fin_year, emp_no, emp_name, 'MTH SALARY',
       MTH_SAL_JAN, MTH_SAL_FEB, MTH_SAL_MAR, MTH_SAL_APR,
       MTH_SAL_MAY, MTH_SAL_JUN, MTH_SAL_JUL, MTH_SAL_AUG,
       MTH_SAL_SEP, MTH_SAL_OCT, MTH_SAL_NOV, MTH_SAL_DEC
FROM  emp_sal_det                 
UNION ALL
SELECT fin_year, emp_no, emp_name, 'YTD SALARY',
       YTD_SAL_JAN, YTD_SAL_FEB, YTD_SAL_MAR, YTD_SAL_APR,
       YTD_SAL_MAY, YTD_SAL_JUN, YTD_SAL_JUL, YTD_SAL_AUG,
       YTD_SAL_SEP, YTD_SAL_OCT, YTD_SAL_NOV, YTD_SAL_DEC
FROM  emp_sal_det;


The EXPLAIN PLAN of the above SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                     4        26      5434
  RECURSIVE EXECUTIO                    SYS_LE_2_0                                       
 TEMP TABLE TRANSFOR                                                                     
  UNION-ALL                                                                               
   VIEW             CDW_ODS                                          2        13      2717
    TABLE ACCESS    SYS                 SYS_TEMP_0FD9D6602_7         2        13       312
   VIEW             CDW_ODS                                          2        13      2717
    TABLE ACCESS    SYS                 SYS_TEMP_0FD9D6602_7         2        13       312
------------------------------------------------------------------------------------------

Comparing the “Cost” in the EXPLAIN PLAN for both SQLs we may see that the SQL that has been written using “WITH” clause is having 10 times less cost andincreased performance by 3 times at least.

So, using WITH clause does not only reduces the complexity of the SQL, but also improves the performance by manifolds. 

I would highly recommend changing the codes that were written in before Oracle 9.2 and has been migrated to 9.2+ as part of migration or after migration to improve the performance quite drastically.

No comments:

Post a Comment