Friday 20 June 2014

PX Deq Credit: send blkd causing Parallel queries to hang in Oracle


The main blocker in parallel queries in the Query coordinator, Suppose you want to sum on 500 million rows, Then even if you run in parallel , Each query coordinator will do the sum 
and then give result to query coordinator to further sum it. 

Now consider you have grouping on 2 columns, Then it has to sort the data (group) so the 
query coordinator has lot of work to do in case of 500 million. You will see in parallel plans
p->s which means parallel to sequential 

What this Send blocked error means is Query coordinator already has lot of work so it cant accept any more data from Slave processes and Slaves are waiting for Query coordinator to become free 

Solution - Try to write PL/SQL code using Table functions and DBMS_PARALLEL approach it allows you to linearly scale up

If you really want to make use of parallel , then best to go for procedure and parallelise
procedure using dbms_parallel . This is truly parallel in database

( Assumption -In most of cases you will never be slowed by read speed, 1 billion rows is not a
issue, if you are doing in TB then have to think)

Table Functions  (Very important for datawarehouse implementation and Next is dbms_parallel) 

http://docs.oracle.com/cd/B28359_01/appdev.111/b28425/pipe_paral_tbl.htm

How to make pl/sql go in parallel 

http://www.oracle.com/au/products/database/o30plsql-086044.html



We were facing issue with one of our Fact table population queries getting hanged in Datawarehouse. After long day of analysis below is the summary

1) Checked temp space , it was not increasing. Usually almost all parallel queries bypass buffer cache and they use temp space for most of the sorting , grouping , hash join,. PGA size is small though your PGA can be in GBs but a single process can use max 5% of it. So content gets spilled to temp space. So when temp space is not changing means nothing much is happening in query. It was stucked

2) The sorting , Hashing operation were not taking much part of temp space whatever their size was it was constant for more than 30 minutes

3) The v$session and V$session_wait both tables shows waits on below particular parallel events



Query coordinator ( QC)

Session are waiting on 3 parallel events  ( V$session, V$session_wait)

PX Deq: Execution Msg – Means slave process has finished execution waiting for message from QC to die
PX Deq: Table Q Normal – Means Consumers slaves waiting for producer slaves for data
PX Deq Credit: send blkd – Means slaves are not able to send data to QC they are blocked because QC has not finished processing earlier data. Reason being its parallel buffer pool is full

Root cause of issue – Too many parallel processing sending message to QC and QC not having enough buffer pool to accept those messages

Solution – Increasing buffer pool size for parallel (parallel_automatic_tuning) or reducing degree of parallelism. Can be done by removing default parallelism on tables and setting it manually





Select * From V$tempseg_Usage--------- Temp usage for query is low

select ----------------------------------------  none of the sorts are using high temp space
   a.username,
   sum(srt.blocks * 8 / 1024) "Used Space in MB",
   sum(srt.blocks * 8 / 1024)/1024 "Used Space in GB"
from
   v$session    a,
   v$sort_usage srt
where
   a.saddr = srt.session_addr
   and sid in ( select  sid FROM v$SESSION WHERE OSUSER= )
   Group By A.Username

SELECT p1 file#, p2 block#, p3 class#,EVENT-------------------- Session is waiting for 3 parallel events only
 FROM v$session_wait where sid in (
 Select  Sid From V$session
Where Osuser=''
AND USERNAME =
and blocking_session is null)


The parallel_automatic_tuning parameter sets the pool of buffer messaging. If the parameter is enabled, the buffer will be in the large pool or will be on shared pool whether the parameter is disabled. So if it is expected that your database run many parallel queries, consider fit correctly the pool size.