Tuesday, 22 February 2011

Usefulness of WITH Clause - SQL Simplicity & Performance Improvement


The WITH clause, or sub-query factoring clause, is part of the SQL-99 standard and was added into the Oracle SQL syntax in Oracle 9.2. Through this article we will try to find out how we can use WITH clause.


[Note: I'm not suggesting that the queries that I have stated below as examples are the best way to retrieve the required information. They merely demonstrate the use of the WITH clause.]

There are many times when we use same sub-query multiple times in a single SQL. Or we write multiple SQLs that run consecutively and that use same sub-query.

Using WITH clause we can run the sub-query once and use that as reference in the SQL multiple times. This reduces the complexity of the SQLs and improves the performance as well. Lets see one example here…
Example:
We have 2 tables as follows:
SQL> DESC TEST_EMP
Name     Type         Nullable Default Comments
-------- ------------ -------- ------- --------
EMP_NO   NUMBER(5)    Y                        
EMP_NAME VARCHAR2(50) Y                        

SQL>
SQL>
SQL> DESC TEST_EMP_SAL
Name      Type         Nullable Default Comments
--------- ------------ -------- ------- --------
FIN_YEAR  NUMBER(4)    Y                         
EMP_NO    NUMBER(5)    Y                        
SAL_MONTH VARCHAR2(3)  Y                        
MTH_SAL   NUMBER(10,2) Y                        
YTD_SAL   NUMBER(10,2) Y    
                    

The first table (TEST_EMP) stores employee information. The second table (TEST_EMP_SAL) stores the salary given to each employee for each month (MTH_SAL) and also the year to date salary for the employee for each month (YTD_SAL).

So, the data in TEST_EMP is as follows:
SQL> select * from TEST_EMP;
EMP_NO EMP_NAME
------ --------------------------------------------------
  1001 John
  1002 Peter
  1003 Allan
  1004 Steve
  1005 Mary

The data in TEST_EMP_SAL for an employee (e.g. EMP_NO = 1001) is as follows:
SQL> select * from TEST_EMP_SAL where emp_no = 1001;
FIN_YEAR EMP_NO SAL_MONTH      MTH_SAL      YTD_SAL
-------- ------ --------- ------------ ------------
    2009   1001 JAN            4000.00      4000.00
    2009   1001 FEB            4200.00      8200.00
    2009   1001 MAR            4300.00     12500.00
    2009   1001 APR            4100.00     16600.00
    2009   1001 MAY            4000.00     20600.00
    2009   1001 JUN            4000.00     24600.00
    2009   1001 JUL            4200.00     28800.00
    2009   1001 AUG            4300.00     33100.00
    2009   1001 SEP            4100.00     37200.00
    2009   1001 OCT            4000.00     41200.00
    2009   1001 NOV            4100.00     45300.00
    2009   1001 DEC            4000.00     49300.00


Now as per some requirement we need to select the MTH_SAL and YTD_SAL for each employee for all the months of the year in a tabular format like…
EMP_NO
EMP_NAME
MTH JAN SAL
MTH FEB SAL
...
…and
EMP_NO
EMP_NAME
YTD JAN SAL
YTD FEB SAL
...

So, we used the following SQL to generate the report:
-- Getting MTH Salary
SELECT t2.fin_year, t1.emp_no, t1.emp_name, 'MTH SALARY',
       sum(decode(t2.sal_month, 'JAN', mth_sal)) MTH_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', mth_sal)) MTH_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', mth_sal)) MTH_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', mth_sal)) MTH_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', mth_sal)) MTH_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', mth_sal)) MTH_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', mth_sal)) MTH_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', mth_sal)) MTH_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', mth_sal)) MTH_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', mth_sal)) MTH_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', mth_sal)) MTH_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', mth_sal)) MTH_SAL_DEC      
FROM  test_emp t1, test_emp_sal t2
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name, 'MTH SALARY'
UNION ALL

-- Getting YTD Salary
SELECT t2.fin_year, t1.emp_no, t1.emp_name, 'YTD SALARY',
       sum(decode(t2.sal_month, 'JAN', ytd_sal)) YTD_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', ytd_sal)) YTD_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', ytd_sal)) YTD_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', ytd_sal)) YTD_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', ytd_sal)) YTD_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', ytd_sal)) YTD_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', ytd_sal)) YTD_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', ytd_sal)) YTD_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', ytd_sal)) YTD_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', ytd_sal)) YTD_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', ytd_sal)) YTD_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', ytd_sal)) YTD_SAL_DEC      
FROM  test_emp t1, test_emp_sal t2                  
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name, 'YTD SALARY';

The EXPLAIN PLAN of the above SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                    44        26       533
 SORT                                                               44        26       533
  UNION-ALL                                                                               
   SORT                                                             22        13       260
    HASH JOIN                                                        5        60      1200
     TABLE ACCESS   CDW_ODS             TEST_EMP                     2         5        40
     TABLE ACCESS   CDW_ODS             TEST_EMP_SAL                 2        60       720
   SORT                                                             22        13       273
    HASH JOIN                                                        5        60      1260
     TABLE ACCESS   CDW_ODS             TEST_EMP                     2         5        40
     TABLE ACCESS   CDW_ODS             TEST_EMP_SAL                 2        60       780
------------------------------------------------------------------------------------------


So, we are using the same kind of SQLs to fetch MTH and YTD data. Now using WITH clause we can re-write the SQL as follows:
WITH emp_sal_det AS
       -- Getting MTH and YTD Salary in one go
      (SELECT t2.fin_year, t1.emp_no, t1.emp_name,
       sum(decode(t2.sal_month, 'JAN', mth_sal)) MTH_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', mth_sal)) MTH_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', mth_sal)) MTH_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', mth_sal)) MTH_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', mth_sal)) MTH_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', mth_sal)) MTH_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', mth_sal)) MTH_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', mth_sal)) MTH_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', mth_sal)) MTH_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', mth_sal)) MTH_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', mth_sal)) MTH_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', mth_sal)) MTH_SAL_DEC,
       sum(decode(t2.sal_month, 'JAN', ytd_sal)) YTD_SAL_JAN,
       sum(decode(t2.sal_month, 'FEB', ytd_sal)) YTD_SAL_FEB,
       sum(decode(t2.sal_month, 'MAR', ytd_sal)) YTD_SAL_MAR,
       sum(decode(t2.sal_month, 'APR', ytd_sal)) YTD_SAL_APR,
       sum(decode(t2.sal_month, 'MAY', ytd_sal)) YTD_SAL_MAY,
       sum(decode(t2.sal_month, 'JUN', ytd_sal)) YTD_SAL_JUN,
       sum(decode(t2.sal_month, 'JUL', ytd_sal)) YTD_SAL_JUL,
       sum(decode(t2.sal_month, 'AUG', ytd_sal)) YTD_SAL_AUG,
       sum(decode(t2.sal_month, 'SEP', ytd_sal)) YTD_SAL_SEP,
       sum(decode(t2.sal_month, 'OCT', ytd_sal)) YTD_SAL_OCT,
       sum(decode(t2.sal_month, 'NOV', ytd_sal)) YTD_SAL_NOV,
       sum(decode(t2.sal_month, 'DEC', ytd_sal)) YTD_SAL_DEC             
FROM  test_emp t1, test_emp_sal t2
WHERE t1.emp_no = t2.emp_no
AND t2.fin_year = 2009
GROUP BY t2.fin_year, t1.emp_no, t1.emp_name)
SELECT fin_year, emp_no, emp_name, 'MTH SALARY',
       MTH_SAL_JAN, MTH_SAL_FEB, MTH_SAL_MAR, MTH_SAL_APR,
       MTH_SAL_MAY, MTH_SAL_JUN, MTH_SAL_JUL, MTH_SAL_AUG,
       MTH_SAL_SEP, MTH_SAL_OCT, MTH_SAL_NOV, MTH_SAL_DEC
FROM  emp_sal_det                 
UNION ALL
SELECT fin_year, emp_no, emp_name, 'YTD SALARY',
       YTD_SAL_JAN, YTD_SAL_FEB, YTD_SAL_MAR, YTD_SAL_APR,
       YTD_SAL_MAY, YTD_SAL_JUN, YTD_SAL_JUL, YTD_SAL_AUG,
       YTD_SAL_SEP, YTD_SAL_OCT, YTD_SAL_NOV, YTD_SAL_DEC
FROM  emp_sal_det;


The EXPLAIN PLAN of the above SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                     4        26      5434
  RECURSIVE EXECUTIO                    SYS_LE_2_0                                       
 TEMP TABLE TRANSFOR                                                                     
  UNION-ALL                                                                               
   VIEW             CDW_ODS                                          2        13      2717
    TABLE ACCESS    SYS                 SYS_TEMP_0FD9D6602_7         2        13       312
   VIEW             CDW_ODS                                          2        13      2717
    TABLE ACCESS    SYS                 SYS_TEMP_0FD9D6602_7         2        13       312
------------------------------------------------------------------------------------------

Comparing the “Cost” in the EXPLAIN PLAN for both SQLs we may see that the SQL that has been written using “WITH” clause is having 10 times less cost andincreased performance by 3 times at least.

So, using WITH clause does not only reduces the complexity of the SQL, but also improves the performance by manifolds. 

I would highly recommend changing the codes that were written in before Oracle 9.2 and has been migrated to 9.2+ as part of migration or after migration to improve the performance quite drastically.

Magic of ROWNUM in improving performance of inline views



The inline view is a construct in Oracle SQL where you can place a query in the SQL FROM, clause, just as if the query was a table name.


A common use for in-line views in Oracle SQL is to simplify complex queries by removing join operations and condensing several separate queries into a single query.

Example of simple inline views:
Select t1.emp_no, t1.emp_name, t2.dept_name
from
      (Select emp_no, dept_no, emp_name from emp) t1, -- Inline view 1
      (Select dept_no, dept_name from dept) t2 -- Inline view 2
where t1.dept_no = t2.dept_no;

If the SQL is complex and uses inline views, then it takes longer time to execute. For example the following SQL – which is a bit complex one, took around 530 seconds to execute.

SELECT fbal.financial_year, fbal.org_unit_code, acc.account_code_n106, 
acc.PRODUCT_CODE, SUM(fbal.balance_jul), SUM(fbal.balance_aug),
SUM(fbal.balance_sep), SUM(fbal.balance_oct), SUM(fbal.balance_nov),
SUM(fbal.balance_dec), SUM(fbal.balance_jan), SUM(fbal.balance_feb),
SUM(fbal.balance_mar), SUM(fbal.balance_apr), SUM(fbal.balance_may),
SUM(fbal.balance_jun)
FROM FIN_MONTH_BALANCE  fbal,
      /* Inline view START */
(SELECT oa.financial_year, account_code_n106, 
            nvl(ods.product_code, oa.product_code) PRODUCT_CODE
      FROM ACCOUNTS_DET OA, PRODUCT_TYPES_SYNONYM ODS
      WHERE oa.product_code = ods.parent_product_code(+)
      AND oa. process_key = (SELECT MAX(PROCESS_KEY)
            FROM ACCOUNTS_DET WHERE financial_year =  2010)) acc
      /* Inline view END */
,ORG_UNIT_CODE_SYNONYM orgsyn
WHERE fbal.account_code_n106 = acc.account_code_n106
 and    fbal.financial_year = 2010
 AND    fbal.process_key = 2338051
 AND acc.account_code_n106 in (
      SELECT drlm.report_line FROM REPORT_LN_DIMENSION drlm,
      CUBE_REPLN_REFERENCE ocrr WHERE
      drlm.report_line_type ='A'
      AND CUBE_NAME='Mse'
      AND drlm.master_structure_code = ocrr.MASTER_STRUCTURE_CODE)
AND    orgsyn.from_year_org (+) = fbal.org_unit_code
AND    orgsyn.from_year (+) = 2010
AND    orgsyn.to_year (+) = 2010
GROUP BY fbal.financial_year, fbal.org_unit_code, acc.account_code_n106, 
acc.PRODUCT_CODE;

The Explain Plan for the SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                 10514    198772  35580188
 SORT                                                             7637    105829  16720982
  HASH JOIN                                                       6386    105829  16720982
   HASH JOIN                                                      6183     93524  13654504
    HASH JOIN                                                     1471     14218    696682
     VIEW           CDW_ODS                                       1121     69806   2233792
      COUNT                                                                              
       HASH JOIN                                                  1121     69806   3699718
        TABLE ACCESSCDW_ODS             ACCOUNTS_DET              1085     69806   1884762
         SORT                                                                  1        10
          TABLE ACCECDW_ODS             ACCOUNTS_DET              1085     89750    897500
        TABLE ACCESSCDW_ODS             PRODUCT_TYPES_SYNONY         2       164      4264
     VIEW           SYS                 VW_NSO_1                   306     14218    241706
      HASH JOIN                                                    306     14218    625592
       TABLE ACCESS CDW_ODS             CUBE_REPLN_REFERENCE         2         2        38
       TABLE ACCESS CDW_ODS             REPORT_LN_DIMENSION        303    104034   2600850
    PARTITION RANGE                                                                      
     TABLE ACCESS   CDW_ODS             FIN_MONTH_BALANCE         4452    459175  44539975
   INDEX            CDW_ODS             INDEX1                      20      5920     71040
------------------------------------------------------------------------------------------
 
Without trying to re-write the SQL logic or doing anything else to improve the performance, I just added “ROWNUM” as one of the columns in the inline view. So, the modified SQL is as follows:
SELECT fbal.financial_year, fbal.org_unit_code, acc.account_code_n106, 
acc.PRODUCT_CODE, SUM(fbal.balance_jul), SUM(fbal.balance_aug),
SUM(fbal.balance_sep), SUM(fbal.balance_oct), SUM(fbal.balance_nov),
SUM(fbal.balance_dec), SUM(fbal.balance_jan), SUM(fbal.balance_feb),
SUM(fbal.balance_mar), SUM(fbal.balance_apr), SUM(fbal.balance_may),
SUM(fbal.balance_jun)
FROM FIN_MONTH_BALANCE  fbal,
      /* Inline view START */
(SELECT ROWNUM, oa.financial_year, account_code_n106, 
            nvl(ods.product_code, oa.product_code) PRODUCT_CODE
      FROM ACCOUNTS_DET OA, PRODUCT_TYPES_SYNONYM ODS
      WHERE oa.product_code = ods.parent_product_code(+)
      AND oa. process_key = (SELECT MAX(PROCESS_KEY)
            FROM ACCOUNTS_DET WHERE financial_year =  2010)) acc
      /* Inline view END */
,ORG_UNIT_CODE_SYNONYM orgsyn
WHERE fbal.account_code_n106 = acc.account_code_n106
 and    fbal.financial_year = 2010
 AND    fbal.process_key = 2338051
 AND acc.account_code_n106 in (
      SELECT drlm.report_line FROM REPORT_LN_DIMENSION drlm,
      CUBE_REPLN_REFERENCE ocrr WHERE
      drlm.report_line_type ='A'
      AND CUBE_NAME='Mse'
      AND drlm.master_structure_code = ocrr.MASTER_STRUCTURE_CODE)
AND    orgsyn.from_year_org (+) = fbal.org_unit_code
AND    orgsyn.from_year (+) = 2010
AND    orgsyn.to_year (+) = 2010
GROUP BY fbal.financial_year, fbal.org_unit_code, acc.account_code_n106, 
acc.PRODUCT_CODE;
 
Now the SQL got executed in 105 seconds – so it ran 5 times faster!!

The Explain Plan for this SQL is as follows:
------------------------------------------------------------------------------------------
OPERATION           OBJECT_OWNER        OBJECT_NAME               COST   CARDIN.     BYTES
------------------------------------------------------------------------------------------
SELECT STATEMENT                                                  7637    105829  16720982
 SORT                                                             7637    105829  16720982
  HASH JOIN                                                       6386    105829  16720982
   HASH JOIN                                                      6183     93524  13654504
    HASH JOIN                                                     1471     14218    696682
     VIEW           CDW_ODS                                       1121     69806   2233792
      COUNT                                                                               
       HASH JOIN                                                  1121     69806   3699718
        TABLE ACCESSCDW_ODS             ACCOUNTS_DET              1085     69806   1884762
         SORT                                                                  1        10
          TABLE ACCECDW_ODS             ACCOUNTS_DET              1085     89750    897500
        TABLE ACCESSCDW_ODS             PRODUCT_TYPES_SYNONY         2       164      4264
     VIEW           SYS                 VW_NSO_1                   306     14218    241706
      HASH JOIN                                                    306     14218    625592
       TABLE ACCESS CDW_ODS             CUBE_REPLN_REFERENCE         2         2        38
       TABLE ACCESS CDW_ODS             REPORT_LN_DIMENSION        303    104034   2600850
    PARTITION RANGE                                                                      
     TABLE ACCESS   CDW_ODS             FIN_MONTH_BALANCE         4452    459175  44539975
   INDEX            CDW_ODS             INDEX1                      20      5920     71040
------------------------------------------------------------------------------------------

Adding “ROWNUM” in inline views materializes the same improving the performance and IO many folds.