Friday 25 January 2013

Sql tuning

Hi Guys,

In almost all the projects that i have worked for over the last few years one of the most common requirement in sql tuning .The database might be different but sql tuning is one of the most important things to know in Business Intellegence .It gives you the edge in any project that you are working .

Now this is a big topic and I am not DBA to know all the details but i am going to mention few points that everyone can look into and try to get their sql to perform better .I will also help you to understand the explain plan ( its quite diff to make use of it in few minutes).



Below TIP for finding query time is for Oracle 11g

One useful thing I found today .Usually we are using set timing on Or we are looking at elapsed time from sql developer.Which is not correct way of testing how much time a query takes to execute because oracle will store the result set in cache


USUAL way of looking at run time 




Consider we execute the same query 5 time then if at first run it takes 10 s then at 5th run it might take 0.5 s without us making any change to query .This is because of oracle cache.Use below statement to find the run time .Consider we run a statement 5 times then executions column will show 5.Its Best to look for time_taken when execution is 1

To identify your query just put a comment in the query UPPER(SQL_TEXT) like '%KAP%'


select LAST_LOAD_TIME, to_char(ELAPSED_TIME/1000, '999,999,999.000') || ' ms' , MODULE, SQL_TEXT elasped ,executions from v$sql
where UPPER(SQL_TEXT) like '%KAP%'
AND UPPER(MODULE) LIKE '%SQL%DEVE%'
  order by LAST_LOAD_TIME desc


Point to be noted in this time

1)      This time does not include time oracle takes to print result set to our screen
2)      Consider we run the same query 10 times then we need to divide the time taken by number of executions to get average time.

-------------------------------------------------------------------------------------------------------------------
Basic idea (irrespective of database)

1) Find the query that is causing the most trouble (delays)
a) Simple idea is to use select count(*) from ( select * from most trouble) .This will give you the time required for the query to run

2)Check the filters used . By reducing the amount of data we can speed up the query .Like if your query fetches last 4 years data check with business if some one really is using this 4 years data or can you make it to fetch only last 24 months .

3)Make sure table has indexes (Its very difficult to find project in which tables are not indexed properly but give it a shot ... )

4)Try to simplify query .Queries with nested subqueries,inline views perform poorly .

(most commonly found) ---here k is your inline view.
select k.dept,a.emp from emp a ,(select dept from xyz) k

5) Make use of WITH clause (Subquery factoring) this performs faster then your inline views.

WITH dept_cost as ( select dept from xyz)
select dept,a.emp from emp a,dept_cost

6)Most professional way of doing it is by Using explain plan.But since most developer do not have that level of db knowledge and would usaully will not like to go for it .I have discussed it at the last

7) Below are some points that i liked taken from the link below

Beginners Sql site


 Sometimes you may have more than one subqueries in your main query. Try to minimize the number of subquery block in your query.
For Example: Write the query as
SELECT name 
FROM employee 
WHERE (salary, age ) = (SELECT MAX (salary), MAX (age) 
FROM employee_details) 
AND dept = 'Electronics'; 
Instead of:
SELECT name 
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details) 
AND age = (SELECT MAX(age) FROM employee_details) 
AND emp_dept = 'Electronics';


Make use of IN and EXISTS efficiently .
a) IN has slower performance

This is not Always true it depends on your situation.  Sometimes in may be faster sometime Exists may be faster. Understand the situation first. Below is taken from ASKtom. To read full article

ASKtom website link for article

Well, the two are processed very very differently.

Select * from T1 where x in ( select y from T2 )

is typically processed as:

select * 
  from t1, ( select distinct y from t2 ) t2
 where t1.x = t2.y;

The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then joined to 
the original table -- typically.


As opposed to 

select * from t1 where exists ( select null from t2 where y = x )

That is processed more like:


   for x in ( select * from t1 )
   loop
      if ( exists ( select null from t2 where y = x.x )
      then 
         OUTPUT THE RECORD
      end if
   end loop

It always results in a full scan of T1 whereas the first query can make use of an index 
on T1(x).


So, when is where exists appropriate and in appropriate?

Lets say the result of the subquery
    ( select y from T2 )

is "huge" and takes a long time.  But the table T1 is relatively small and executing ( 
select null from t2 where y = x.x ) is very very fast (nice index on t2(y)).  Then the 
exists will be faster as the time to full scan T1 and do the index probe into T2 could be 
less then the time to simply full scan T2 to build the subquery we need to distinct on.


Lets say the result of the subquery is small -- then IN is typicaly more appropriate.


If both the subquery and the outer table are huge -- either might work as well as the 
other -- depends on the indexes and other factors. 


IN Example

select ename from emp e
    where mgr in (select empno from emp where ename = 'KING');

Same as 

select e1.ename from emp e1,(select empno from emp where ename = 'KING') e2
    where e1.mgr = e2.empno;

The idea to use In is the result from inner query is less compared to outer query . In such cases In works well 

EXISTS

select ename from emp e
    where exists (select 0 from emp where e.mgr = empno and ename = 'KING');


My favorite is the below .As you will notice that we end up having distinct in our subqueries .Which slows the performance the most as first entire data is picked up and then grouped to find distinct .So best is to try to avoid having distinct especially in your inline views


 Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.
For Example: Write the query as
SELECT d.dept_id, d.dept 
FROM dept d 
WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);
Instead of:
SELECT DISTINCT d.dept_id, d.dept 
FROM dept d,employee e 
WHERE e.dept = e.dept;

For those who want to impr.ove their sql tuning skill further my idea is to start from Oracle basics how it works .I have noted few points in by article below



Adding Two more things to check 

I was facing issue that the seq.next_val was taking long time as the caching value was low. This was causing the inserts to take time. You can know this by lookin at v$sesion table and knowing on what is the session waiting for . It will be waiting long time on things like db_sequention_read then you can find out what is the object on which it is waiting . you can chek out below link 

http://cognossimplified.blogspot.com/2013/10/etl-tools-and-sessions-in-db.html

Using Wrong data type for index column like product_key = '40' . This causes oracle to skip using the index on product key make sure you are passing correct data type for indexed columns

With Clause for SQL tuning  

With clause is best to be used when you code is getting too long and you can easily divide it into parts. With clause speeds up the query by allowing oracle to store intermediate result in intermediate tables. The concept is similar to global temporary concept which was there before. But only difference is global temporary table would force oracle to create temporary tables. By using with clause oracle has the choice of creating temporary tables if it think its a good idea.

But we cannot always trust oracle. It will estimate the cost of not creating temp tables and it might think its good idea not to go for temp tables. In such cases you need to force oracle to go for temp tables to store the with clause result by using the hint /*+materialize*/ . I have obtained a query improvement of 75 seconds for huge volume by going for that hint.






Regarding Using Case statement

I had a query which was taking 130 seconds to execute. When I checked by building query block by block I observed case statement takes maximum time. Without case it hardly took 10 seconds. So here is what was happening

Select case( 1=1 and 2=2 and 3=3) a1,
          case( 1=3 and 5=4 and 6=6) b1
From
( select 1 , 2 , 3 from abc inner join with hhh on 1=8 )

So what I did was push the condition. Its easy for oracle to do row level calculation like suppose we have two columns 1 and 3 then its easier for oracle to compare ,  add, subtract and all that

Select a1,b1          
From
( select 1 , 2 , 3, ( 1=1 and 2=2 and 3=3) a1, case( 1=3 and 5=4 and 6=6) b1 from abc inner join with hhh on 1=8 )

How to tune by reading Explain pland and Statistics IO

This is the actual way of doing it.But it will take atleast a weeks reading to understand the basics and use it to help you solve your professional problems.Below are some of my notes for starting.

How to check the explain plan .

run the below statement 

explain plan for (select * from abc)

this will store the explain plan data in a plan table which can be accessed by using below statement

select * from table (dbms_xplan.display)

Go through below article it will give you some basic idea of explain plan.Understanding explain plan and using it to tune is a big task will take lot of patient ( days worth of work ).Not for the faint hearted .


Below are link for my articles on Oracle db basic and Reading Explain plans(For the experienced folks )




Below is link for Article on Temp space issue in DB which was causing performance issues for us. 


http://cognossimplified.blogspot.com/2014/01/lack-of-temp-space-causing-performance.html


Bitmap Indexes on Fact Table can improve performance a lot. Hints like star transformation in oracle has huge impact 

http://cognossimplified.blogspot.com/2013/07/sql-tuning-with-bitmap-indexes-and-star.html 

Sometime Cast conversion from varchar to integer has impact on performance ,Consider you fact stores result as varchar and you are summing on it. Oracle will convert it to int. Which will affect performance. So make sure you know the datatype of columns you are summing




No comments:

Post a Comment