Friday, 13 June 2014

Understanding Explain plan for Parallel Queries in Oracle



Hi Guys

Basic

In the previous article we discussed the general rules for applying the parallel hint. Parallel explain plans are more complex to read than serial plans. When we use the hint PARALLEL(2), four processes are created: two producers and two consumers. The general assumption is that if some work can be done by one process in one minute, then it can be done by two processes in half a minute, or by four processes in a quarter of a minute. But this is not true, for various reasons. Recently I faced an issue with parallel queries and found articles by Randolf Geist and Jonathan Lewis; the links are below. This article is my takeaway from their articles and from what I found in my own queries.
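As a quick refresher, here is a minimal sketch of applying the hint and displaying the resulting plan (the table name EMP_BIG is made up for illustration):

```sql
-- Hypothetical example: request a degree of parallelism of 2 for this scan.
SELECT /*+ PARALLEL(e 2) */ COUNT(*)
FROM   emp_big e;

-- Show the plan of the last statement run in this session, including the
-- TQ (table queue) column discussed later in this article.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```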

When you create many parallel processes, each process is assigned its own PGA memory. If you have 128 processes, a lot of PGA memory will be consumed, and when the work areas fill up the processes start spilling to temp space, slowing the SQL down.
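A rough way to catch this happening (a sketch; exact columns can vary by version) is to look for active SQL work areas that have spilled to temporary space:

```sql
-- Sessions whose sort/hash work areas have spilled to temp segments.
SELECT sid,
       operation_type,
       TRUNC(work_area_size / 1024 / 1024) AS work_area_mb,
       TRUNC(tempseg_size   / 1024 / 1024) AS temp_mb
FROM   v$sql_workarea_active
WHERE  tempseg_size IS NOT NULL;
```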


Article by Randolf Geist (very, very good)


Degree of Parallelism

http://www.oracle.com/technetwork/issue-archive/2010/o40parallel-092275.html

The change involved moving the table join into a subquery. The basic idea is to make sure the processes are working as expected. Read the article below for the details of how parallel queries work internally.

If the degree of parallelism is set to 4, then 8 parallel processes will be used: 4 as producers and 4 as consumers. When we see PX BLOCK ITERATOR, data is accessed not at the partition level but at the data block level. Since data is accessed at the block level, it must then be redistributed between the slave sets.
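A hand-written sketch of what such a plan fragment can look like (not real output; the table name is made up) — note the :TQ entries in the Name column and the PX BLOCK ITERATOR row above the full scan:

```
--------------------------------------------------------------
| Id | Operation              | Name     |    TQ  |IN-OUT|
--------------------------------------------------------------
|  0 | SELECT STATEMENT       |          |        |      |
|  1 |  PX COORDINATOR        |          |        |      |
|  2 |   PX SEND QC (RANDOM)  | :TQ10000 |  Q1,00 | P->S |
|  3 |    PX BLOCK ITERATOR   |          |  Q1,00 | PCWC |
|  4 |     TABLE ACCESS FULL  | EMP_BIG  |  Q1,00 | PCWP |
--------------------------------------------------------------
```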

In order for parallel queries to work efficiently, you need an efficient parallel execution plan.

Good Link on Bloom filters




Some Notes after reading

Due to this Consumer / Producer model Oracle has to deal with the situation that both Parallel Slave Sets are busy (one producing, the other consuming data) but the data according to the execution plan has to be consumed by the next related parent operation. If this next operation is supposed to be executed by a separate Parallel Slave Set (you can tell this from the TQ column of the DBMS_XPLAN output) then there is effectively no slave set / process left that could consume the data, hence Oracle sometimes needs to revert to sync points (or blocking operations that otherwise wouldn't be blocking) where the data produced needs to be "parked" until one of the slave sets is available for picking up the data.

In recent releases of Oracle you can spot these blocking operations quite easily in the execution plan. Either these are separate operations (BUFFER SORT – not to be confused with regular BUFFER SORT operations that are also there in the serial version of the execution plan) or one of the existing operations is turned into a BUFFERED operation, like a HASH JOIN BUFFERED.

Whenever we say something is buffered, it is written to temp and read back. If the amount of data to buffer is large, it cannot be held in memory and therefore has to be written to temporary disk space, only to be re-read by / sent to the parallel slave set that is supposed to consume / pick up the data. And even if it can be held in memory, the additional PGA memory required to hold the data can be significant.
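Here is a hand-written sketch (for illustration only, not real output) of how such a blocking operation shows up in a plan — both slave sets Q1,00 and Q1,01 feed the join running in Q1,02, so the join has to buffer:

```
---------------------------------------------------------------
| Id | Operation               | Name     |    TQ  |IN-OUT|
---------------------------------------------------------------
| .. |  ...                    |          |        |      |
|  4 |    HASH JOIN BUFFERED   |          |  Q1,02 | PCWP |
|  5 |     PX RECEIVE          |          |  Q1,02 | PCWP |
|  6 |      PX SEND HASH       | :TQ10000 |  Q1,00 | P->P |
| .. |       ...               |          |        |      |
|  9 |     PX RECEIVE          |          |  Q1,02 | PCWP |
| 10 |      PX SEND HASH       | :TQ10001 |  Q1,01 | P->P |
---------------------------------------------------------------
```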


What does this mean? It happens simply because Oracle cannot have more than two parallel slave sets active per data flow operation.

SQL work areas -- how are the hash join tables built?

What is a Data flow operation


DFO means "Data Flow Operation". Actually, queries don't run in parallel; it is "data flow operations" (DFOs) that run in parallel, and a single query can be made up of several data flow operations. A DFO tree is composed of DFOs, and usually one query has one DFO tree.

The result from v$pq_tqstat:
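The screenshot of the output is not reproduced here, but the view can be queried like this right after running a parallel query in the same session (a sketch; v$pq_tqstat is populated only for the session's last parallel statement):

```sql
-- Per-slave row counts for each table queue of the last parallel query.
SELECT dfo_number,
       tq_id,
       server_type,   -- 'Producer' or 'Consumer'
       process,       -- slave name, e.g. P000
       num_rows,
       bytes
FROM   v$pq_tqstat
ORDER  BY dfo_number, tq_id, server_type DESC, process;
```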
 
You can just go through the screenshots below. Follow the comments on them; that is all that is needed to understand those plans.
There are three ways in which parallel join operations can be performed.

First

One slave set reads the table and the other slave set joins the data, as shown in the figure: Q1,00 reads the data and Q1,04 joins it. This requires buffering.

Second
Look at the broadcast being done: it is received by Q1,06, which also does the full scan of the sec acss grp person table. This method does not require buffering for the hash join, because the entire data set has been broadcast.

Note -- If at any time you see multiple DFOs being used, you will see Q1, Q2 in the TQ column.

For the other two ways, read the original article by Oracle ACE Randolf Geist, which I found helpful.

Note

Window sorts come from analytic functions.
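For example, the plan for a statement like this (a made-up query; the table name is hypothetical) will contain a WINDOW SORT step:

```sql
-- ROW_NUMBER and similar analytic functions produce WINDOW SORT operations.
SELECT empno,
       deptno,
       ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY sal DESC) AS rn
FROM   emp_big;
```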
