Hi Guys,
One of the important things to learn about any ETL tool is how it interacts with the database. Any ETL job has three parts: source, transformation (lookup, join, rank etc.) and target. So when someone says a job is running slow, the problem can be in the source, the target or the transformation.
The transformation side is more tool specific: how the tool maintains its cache, the best approaches for each transformation and so on. The source and target, however, are DB dependent. There are a number of ways to improve reading data from the source and inserting data into the target, and that is what we will discuss in this article.
Below is a beautiful article on how sessions work between Oracle and Informatica
http://seethehippo.com/the-missing-piece-of-the-jigsaw-understanding-how-informatica-and-oracle-interact/
Other points of interest include
1) What is bulk loading and why is it faster
2) Why storing index and table data on different disks speeds up performance
3) How having a large checkpoint interval speeds up performance
4) How much throughput can you expect from your Oracle system, and what benchmark should you expect for your ETL job? Consider a simple source to target mapping where the source and target are different databases: the source is Oracle and the target is Teradata. We have to load 25 million records (original time in the production system is 40 minutes).
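To put a number on point 4, the figures above can be turned into a rows-per-second baseline. This is only a quick arithmetic sketch run against dual; it assumes the 40 minutes covers the whole load and says nothing about where the time goes.

```sql
-- 25 million records in 40 minutes, expressed as rows per second.
-- 40 * 60 converts the elapsed minutes to seconds.
select round(25000000 / (40 * 60)) as rows_per_second
from dual;
-- 25000000 / 2400 = 10416.67, so this returns 10417
```

Any change to the source read or the target insert can then be judged against roughly 10,400 rows per second.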
How to find the session id of a SQL that is taking a long time to run: http://oracle-online-help.blogspot.in/2006/12/top-sqls.html
Some generally helpful commands
select event, wait_time, seconds_in_wait, state from v$session_wait;
select event, total_waits, time_waited from v$session_event;
select owner, object_name, object_type from dba_objects;
select event, sum(time_waited) from v$active_session_history group by event;
select * from v$sql;
select * from v$process; -- view only available to dba
select ses.sid SID,sqa.SQL_TEXT SQL from v$session ses, v$sqlarea sqa, v$process proc
where ses.paddr=proc.addr and ses.sql_hash_value=sqa.hash_value
and proc.spid=17480; -- this finds the SID from the OS process id, which you first identify using top in linux
select * from v$session; -- from the name of the machine from which the query was fired you can get the session id
select * from v$session_longops where
sid = 17 and time_remaining > 0;
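The v$session_longops query can also show how far along a long operation is. A minimal sketch, assuming sid 17 as in the query above; pct_done is just an illustrative alias, and only operations that Oracle records as long operations will appear here.

```sql
-- Progress of long-running operations for one session.
-- totalwork > 0 keeps rows where a percentage can be computed.
select sid, opname, target, sofar, totalwork,
       round(sofar / totalwork * 100, 1) as pct_done,
       time_remaining, elapsed_seconds
from v$session_longops
where sid = 17
and totalwork > 0
and time_remaining > 0;
```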
If you are using SQL Developer, go to Tools > Monitor Sessions and you will get all active sessions and the SQL being fired.
select event, total_waits, time_waited from v$session_event where sid = 135
select event, sum(time_waited) from v$active_session_history where session_id =135 group by event
If you run the SQL multiple times from the same session, the session id will remain the same, so you will need to use the sample time to distinguish the old run from the new run
select event, sum(time_waited),sample_time from v$active_session_history where session_id =135 group by event,sample_time
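Another way to separate runs is to restrict the query to a time window instead of grouping by every sample. A minimal sketch, assuming the new run started within the last 10 minutes (adjust the interval to your run). It also counts samples, which roughly approximates seconds of activity because v$active_session_history samples active sessions about once per second.

```sql
-- Waits for session 135, limited to samples taken in the last 10 minutes.
-- The 10 minute window is an assumption; set it to cover only the new run.
select event, count(*) as samples, sum(time_waited) as time_waited_micro
from v$active_session_history
where session_id = 135
and sample_time > systimestamp - interval '10' minute
group by event
order by samples desc;
```

Rows with a null event are samples where the session was on CPU rather than waiting.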
select owner, object_name, object_type
from dba_objects
where object_id in ( select ROW_WAIT_OBJ#
from v$session
where sid = 134 )
My super query to see where the bottleneck is
select object_name, object_type,CURRENT_OBJ#, event, sum(time_waited) from v$active_session_history, dba_objects
where session_id =134
and object_id = CURRENT_OBJ#
group by object_name, object_type,CURRENT_OBJ#, event
order by 1
time_waited in v$active_session_history is in microseconds, so you can divide it by 1000 twice to get seconds
select object_name, object_type,CURRENT_OBJ#, event, (sum(time_waited)/1000)/1000 from v$active_session_history, dba_objects
where session_id =134
and object_id = CURRENT_OBJ#
group by object_name, object_type,CURRENT_OBJ#, event
order by 1
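Units differ between the wait views, so it is worth checking them before comparing numbers. In v$session_event, time_waited is in hundredths of a second, while time_waited_micro is in microseconds. A small sketch, assuming sid 135 as in the earlier queries; seconds_waited is an illustrative alias.

```sql
-- Total wait time per event for one session, converted to seconds.
select event, total_waits,
       round(time_waited_micro / 1000000, 2) as seconds_waited
from v$session_event
where sid = 135
order by time_waited_micro desc;
```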
CREATE SEQUENCE slow_perf
MINVALUE 1
START WITH 1
INCREMENT BY 1
NOCACHE;
insert into order_details_stage (ORDER_SEQ_KEY,ORDER_DETAIL_CODE ,ORDER_NUMBER,SHIP_DATE,PRODUCT_NUMBER,
PROMOTION_CODE,QUANTITY,UNIT_COST,UNIT_PRICE,UNIT_SALE_PRICE) select slow_perf.nextval
,od.* from gosales.order_details od
------------------------------------------------------------
Query to monitor sessions, taken from SQL Developer (only needed if the tool you are using has no option to monitor sessions)
with vs as (select rownum rnum,
sid,
serial#,
status,
username,
last_call_et,
command,
machine,
osuser,
module,
action,
resource_consumer_group,
client_info,
client_identifier,
type,
terminal
from v$session)
select vs.sid ,serial# serial,
vs.username "Username",
case when vs.status = 'ACTIVE'
then last_call_et
else null end "Seconds in Wait",
decode(vs.command,
0,null,
1,'CRE TAB',
2,'INSERT',
3,'SELECT',
4,'CRE CLUSTER',
5,'ALT CLUSTER',
6,'UPDATE',
7,'DELETE',
8,'DRP CLUSTER',
9,'CRE INDEX',
10,'DROP INDEX',
11,'ALT INDEX',
12,'DROP TABLE',
13,'CRE SEQ',
14,'ALT SEQ',
15,'ALT TABLE',
16,'DROP SEQ',
17,'GRANT',
18,'REVOKE',
19,'CRE SYN',
20,'DROP SYN',
21,'CRE VIEW',
22,'DROP VIEW',
23,'VAL INDEX',
24,'CRE PROC',
25,'ALT PROC',
26,'LOCK TABLE',
28,'RENAME',
29,'COMMENT',
30,'AUDIT',
31,'NOAUDIT',
32,'CRE DBLINK',
33,'DROP DBLINK',
34,'CRE DB',
35,'ALTER DB',
36,'CRE RBS',
37,'ALT RBS',
38,'DROP RBS',
39,'CRE TBLSPC',
40,'ALT TBLSPC',
41,'DROP TBLSPC',
42,'ALT SESSION',
43,'ALT USER',
44,'COMMIT',
45,'ROLLBACK',
46,'SAVEPOINT',
47,'PL/SQL EXEC',
48,'SET XACTN',
49,'SWITCH LOG',
50,'EXPLAIN',
51,'CRE USER',
52,'CRE ROLE',
53,'DROP USER',
54,'DROP ROLE',
55,'SET ROLE',
56,'CRE SCHEMA',
57,'CRE CTLFILE',
58,'ALTER TRACING',
59,'CRE TRIGGER',
60,'ALT TRIGGER',
61,'DRP TRIGGER',
62,'ANALYZE TAB',
63,'ANALYZE IX',
64,'ANALYZE CLUS',
65,'CRE PROFILE',
66,'DRP PROFILE',
67,'ALT PROFILE',
68,'DRP PROC',
69,'DRP PROC',
70,'ALT RESOURCE',
71,'CRE SNPLOG',
72,'ALT SNPLOG',
73,'DROP SNPLOG',
74,'CREATE SNAP',
75,'ALT SNAP',
76,'DROP SNAP',
79,'ALTER ROLE',
85,'TRUNC TAB',
86,'TRUNC CLUST',
88,'ALT VIEW',
91,'CRE FUNC',
92,'ALT FUNC',
93,'DROP FUNC',
94,'CRE PKG',
95,'ALT PKG',
96,'DROP PKG',
97,'CRE PKG BODY',
98,'ALT PKG BODY',
99,'DRP PKG BODY',
to_char(vs.command)) "Command",
vs.machine "Machine",
vs.osuser "OS User",
lower(vs.status) "Status",
vs.module "Module",
vs.action "Action",
vs.resource_consumer_group,
vs.client_info,
vs.client_identifier
from vs
where vs.USERNAME is not null
and nvl(vs.osuser,'x') <> 'SYSTEM'
and vs.type <> 'BACKGROUND'
order by 1