Hi Guys,
We are trying to bulk load 200 million rows from a source into a staging table using SQL*Loader direct path load (bulk load). When we ran EXPLAIN PLAN on the SQL fired by SQL*Loader, it showed a hint called SYS_DL_CURSOR and appeared to be doing a conventional load.
Reason -- We were using an option in Pentaho called "load on the fly". It reads the data from the SQL and starts loading into Oracle as the rows come in. The problem is that you never have enough rows available for SQL*Loader to go for a bulk load, so it goes for a conventional load.
Solution -- Use the "Automatic load at the end" option, or any option that first creates a data file on the ETL server; SQL*Loader then loads from that data file, which allows it to go for a direct load.
Note -- The hint SYS_DL_CURSOR does not matter, even if the explain plan suggests a conventional load. Look in the SQL*Loader log file; that is where you find out whether it actually did a direct load or a conventional load. I had a lot of confusion because the explain plan was showing a conventional load.
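For example, the SQL*Loader log file header reports the path it actually used; on the versions I have worked with, it contains a line like:

Path used:      Direct

If this says Conventional instead, the direct load did not happen, no matter what the explain plan looked like.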
Speeds worth mentioning -- SQL*Loader direct path loads about 180k rows per second, which is roughly 11 million rows per minute. A db link can load directly at about 3.5 million rows per minute; we loaded 180 million rows in 60 minutes, with some of that time spent on the SQL read.
Flawed analysis (you might try this too, so just noting it down) -- even this did not work, because it still never had enough data to do a direct load.
Note -- I have tried manually creating the SQL*Loader control file and triggering it from the Linux server. It still adds the SYS_DL_CURSOR hint.
SQL*Loader control file (.ctl):
-- loads /archive/Load_trial.txt into table ABC; the direct path is
-- requested on the sqlldr command line (DIRECT=TRUE), not in this file
load data
infile '/archive/Load_trial.txt'
into table ABC
fields terminated by "," optionally enclosed by '"'
(
SRC,
CD,
SRC_VAL_ID,
GG_CD,
NMT_IND,
DENOM_IND,
NMT_XCLS_IND,
DENOM_XCLS_IND,
DENOM_XCPT_IND,
RSLT_ANS,
SR_EPSD_ID,
XCLS_RSN_CD,
VAL_TYP_CD,
CREAT_USER_ID,
CREAT_DTM DATE 'mm/dd/yy'
)
Command to load using SQL*Loader:
sqlldr userid=username/password@servicename control=archive/Trial_load1.ctl log=archive/Trial_logg.log DIRECT=TRUE
Alternate solution
Create a procedure using db links that loads data from the source schema to the target schema. The procedure will perform well because you eliminate the ETL server in the middle, but you will be limited by the speed of the db link connection. The DBA may not agree, because db links are a security issue: they expose the data over the network. But keep the procedure ready as a contingency measure in case the load fails and you need an urgent reload. A minimal sketch follows.
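Here the db link name SRC_LINK, the procedure name, and the table names are placeholders; adjust them to your environment:

create or replace procedure load_abc_via_dblink as
begin
  -- direct-path insert: /*+ append */ writes above the high-water mark,
  -- similar in spirit to a SQL*Loader direct path load
  insert /*+ append */ into ABC
  select * from ABC@SRC_LINK;
  -- a commit is required after a direct-path insert before the
  -- table can be queried again in the same session
  commit;
end;
/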
Alternate solution 2
Data Pump. Pull the required tables from the source schema to the target schema using Data Pump, then write a procedure in the target schema to load the data from these pulled source tables. Incredible speed can be achieved: 200 million records can be inserted in Oracle in less than 1 hour.
Data Pump is fast and can easily give you a speed of 20 million rows per minute when creating a dump file for export/import.
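A sketch of the Data Pump pull, assuming a directory object DATA_PUMP_DIR exists on both databases; the schema, table, connection, and file names are all placeholders:

# export the table on the source database
expdp system/password@source_db tables=SRC_SCHEMA.ABC directory=DATA_PUMP_DIR dumpfile=abc.dmp logfile=abc_exp.log

# copy abc.dmp to the target server's DATA_PUMP_DIR, then import it
impdp system/password@target_db tables=SRC_SCHEMA.ABC remap_schema=SRC_SCHEMA:TGT_SCHEMA directory=DATA_PUMP_DIR dumpfile=abc.dmp logfile=abc_imp.log

The loading procedure in the target schema can then be the same insert /*+ append */ ... select shown above, just without the @SRC_LINK.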