Hi Guys,
This is Part 3 of Oracle Database Concepts. For Part 1 and Part 2, use the links below:
http://cognossimplified.blogspot.in/2013/07/oracle-database-concepts.html
http://cognossimplified.blogspot.in/2013/07/oracle-database-concepts-ii.html
Partitions in Oracle
Partitions are very important in a data warehouse, especially for facts: most of the time we are playing with date ranges, and partitioning can speed up the query.
Read this detailed article from oracle
http://docs.oracle.com/cd/A87860_01/doc/server.817/a76956/partiti.htm
There are two types of partitioning:
- Single level
- Composite
There are 3 types for each
- Range
- Hash
- List
Hash partitioning is used when there is no obvious partitioning key available; it makes sure each partition holds roughly the same amount of data.
Range partitioning is to be used when you have a fact table that keeps monthly data, so Jan 2013, Feb 2013 and so on each get a partition of their own.
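As a small sketch (the table and column names here are assumptions for illustration), a range-partitioned fact for monthly data, a hash-partitioned variant, and a composite range-hash table might look like this:

create table sales_fact_r (sale_date date, amount number)
partition by range (sale_date)
( partition p_jan2013 values less than (to_date('01-02-2013','dd-mm-yyyy')),
  partition p_feb2013 values less than (to_date('01-03-2013','dd-mm-yyyy')) );

create table sales_fact_h (cust_id number, amount number)
partition by hash (cust_id) partitions 4;

-- Composite: range on the date, hash subpartitions on the customer
create table sales_fact_c (sale_date date, cust_id number, amount number)
partition by range (sale_date)
subpartition by hash (cust_id) subpartitions 4
( partition p_2013 values less than (to_date('01-01-2014','dd-mm-yyyy')) );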
What is partition pruning?
Partition pruning dramatically reduces the amount of data retrieved from disk and shortens processing time, thus improving query performance and optimizing resource utilization. If you partition the index and table on different columns (with a global partitioned index), then partition pruning also eliminates index partitions even when the partitions of the underlying table cannot be eliminated.
Depending upon the actual SQL statement, Oracle Database may use static or dynamic pruning. Static pruning occurs at compile-time, with the information about the partitions accessed beforehand. Dynamic pruning occurs at run-time, meaning that the exact partitions to be accessed by a statement are not known beforehand. A sample scenario for static pruning is a SQL statement containing a WHERE condition with a constant literal on the partition key column. An example of dynamic pruning is the use of operators or functions in the WHERE condition. Partition pruning affects the statistics of the objects where pruning occurs and also affects the execution plan of a statement.
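As a sketch against the sales_fact_r table above: a constant literal on the partition key lets Oracle prune at compile time, while a subquery on the key pushes the decision to run time.

-- Static pruning: the literal on the partition key identifies the
-- partition to scan at compile time
select sum(amount)
from   sales_fact_r
where  sale_date = to_date('15-01-2013','dd-mm-yyyy');

-- Dynamic pruning: the partitions to touch are only known at run time
select sum(amount)
from   sales_fact_r
where  sale_date = (select max(sale_date) from sales_fact_r);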
The SQL below lists which tables in your schema are partitioned:
select * from user_part_tables;
Very good article on partition pruning
http://www.orafaq.com/tuningguide/partition%20prune.html
create table part_trial (col1 int, col2 int)
partition by range (col1)
( partition p1 values less than (1) );

alter table part_trial add partition p2 values less than (3);

While creating the table you must define at least one partition; only then can you add more partitions to it. Also note that in the alter statement we do not specify the range keyword.
insert into part_trial values (1,1);
insert into part_trial values (1,2);
insert into part_trial values (1,3);
insert into part_trial values (1,4);
insert into part_trial values (1,5);
insert into part_trial values (1,6);

explain plan for
select * from part_trial where col1 = 1;
Notice in the plan below that it accesses only the single partition that is required (Pstart = Pstop = 2):
---------------------------------------------------------------------------------------------------
| Id | Operation               | Name       | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT        |            |    6 |   156 |    14   (0)| 00:00:01 |       |       |
|  1 |  PARTITION RANGE SINGLE |            |    6 |   156 |    14   (0)| 00:00:01 |     2 |     2 |
|* 2 |   TABLE ACCESS FULL     | PART_TRIAL |    6 |   156 |    14   (0)| 00:00:01 |     2 |     2 |
---------------------------------------------------------------------------------------------------
execute dbms_stats.gather_table_stats(user, 'PART_TRIAL')

select partition_position, partition_name, num_rows
from   user_tab_partitions
where  table_name = 'PART_TRIAL';

alter table part_trial add partition p3 values less than (5);
Below 7 points are taken from an Ask Tom question on hash partitioning:
1) The number of hash partitions should always be a power of two: 2, 4, 8, 16, 32, 64, and so on, with no other numbers in between.
2) Otherwise the data will be skewed across partitions, with most of the data in the "middle" partitions and little data in the end partitions.
3) Also, a hash key should be fairly unique, with lots and lots of distinct values; otherwise you won't achieve an even distribution of data.
4) It is not record driven: 1,000,000 records might be 10 MB, 100 MB, or 1 TB. It is about volume.
5) What is the biggest segment you want to have? What are you trying to accomplish with partitioning, and what is your reason for using it? That will drive your partition sizes and your partitioning scheme.
6) Understand WHY you are partitioning; then and only then APPLY it.
7) When you APPLY it, make sure you understand the ramifications of doing so, and consider each and every index independently of every other index. For example, say this is an employee table and you hash partition it into 16 partitions by empno. Say you have an index on last_name, and you just locally partition that index on last_name. When you query select * from t where last_name = :x, you will do 16 index range scans, which is 16 times the work! You would have wanted to either:
a) not partition that index,
b) partition that index by range (A-K go into this partition, L-Z go into that one), or
c) hash partition it (if you always use equals).
A sketch of the employee example follows below.
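Here is a minimal sketch of the employee example from point 7 (table, column, and index names are assumptions for illustration): hash partition the table by empno, then decide deliberately what to do with the last_name index.

create table emp_h (empno number, last_name varchar2(30))
partition by hash (empno) partitions 16;

-- Option (a): a plain, non-partitioned index on last_name, so a query like
-- "select * from emp_h where last_name = :x" does one index range scan, not 16
create index emp_h_ln_ix on emp_h (last_name);

-- A LOCAL index instead would create one index segment per hash partition,
-- and an equality probe on last_name would have to scan all 16 of them:
-- create index emp_h_ln_loc on emp_h (last_name) local;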
Types of dimensions in datawarehouse
Before analysing explain plans and indexes, please have a look at the type of dimension implemented in your data warehouse.
The Wikipedia article on slowly changing dimensions explains this better than I can.
Type 1 -- keeps no history information.
Type 2 -- has a start date and an end date column; a new row is added for each change.
Type 3 -- adds additional columns (last value, last-to-last value); the number of columns grows with each additional piece of history kept.
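As a sketch of what a Type 2 dimension might look like (table and column names are assumptions):

create table cust_dim_t2 (
  cust_sk    number,         -- surrogate key; a new row is added per change
  cust_id    number,         -- natural/business key
  cust_name  varchar2(50),
  start_date date,           -- when this version became current
  end_date   date,           -- null (or a high date) while still current
  is_current char(1)
);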
Difference between an index and a ROWID
- A ROWID is part of the table; it gives the direct address of a row.
- Indexes, on the other hand, are stored separately from the table; they contain the block address of a row.
- With only a block address, Oracle needs to read the full block and discard the unnecessary rows.
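A quick illustration against the part_trial table created earlier: ROWID is a pseudocolumn you can select directly, and fetching by ROWID is the most direct single-row access path there is.

-- Show the physical address of each row
select rowid, col1, col2 from part_trial;

-- Fetch a row by its address
select col1, col2
from   part_trial
where  rowid = (select min(rowid) from part_trial);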
What are hash joins
In Oracle's HASH JOIN method, a hash table is built on one of the tables (typically the smaller one), and the other table is probed against the values in that hash table.
So, what is a hash table? It's just a bunch of key-value pairs. The data is stored in linked lists under head nodes, grouped by something called a hash code so that lookups are fast. Something like this:
a -> Any -> Array
b -> Bitter -> Bold
c -> Class -> Count
A hash table is a table where you can store stuff by the use of a key. It is like an array, but it stores things differently.
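To see a hash join in action, you can nudge the optimizer with the USE_HASH hint. A minimal sketch, assuming the classic emp/dept demo tables:

select /*+ use_hash(d) */ e.ename, d.dname
from   emp  e
join   dept d on d.deptno = e.deptno;

-- The plan should show a HASH JOIN step:
-- explain plan for select /*+ use_hash(d) */ ... ;
-- select * from table(dbms_xplan.display);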
What are bind variables? (They are generally for OLTP, not data warehouses.)
To understand bind variables, consider an application that generates thousands of SELECT statements against a table; for example:
SELECT fname, lname, pcode FROM cust WHERE id = 674;
SELECT fname, lname, pcode FROM cust WHERE id = 234;
SELECT fname, lname, pcode FROM cust WHERE id = 332;
Each time the query is submitted, Oracle first checks in the shared pool to see whether this statement has been submitted before. If it has, the execution plan that this statement previously used is retrieved, and the SQL is executed. If the statement cannot be found in the shared pool, Oracle has to go through the process of parsing the statement, working out the various execution paths, and coming up with an optimal access plan before it can be executed. This process is known as a "hard parse", and for OLTP applications it can actually take longer to carry out than the DML instruction itself.
When looking for a matching statement in the shared pool, only statements that exactly match the text of the statements are considered; so, if every SQL statement you submit is unique (in that the predicate changes each time, from id = 674 to id = 234 and so on) then you'll never get a match, and every statement you submit will need to be hard parsed. Hard parsing is very CPU intensive, and involves obtaining latches on key shared memory areas. Whilst it might not affect a single program running against a small set of data, it can bring a multi-user system to its knees if hundreds of copies of the program are trying to hard parse statements at the same time. The extra bonus with this problem is that contention caused by hard parsing is pretty much immune to measures such as increasing available memory, numbers of processors and so on, as hard parsing statements is one thing Oracle can't do concurrently with many other operations, and it's a problem that often only comes to light when trying to scale up a development system from a single user working on a subset of records to many hundreds of users working on a full data set.
The way to get Oracle to reuse the execution plans for these statements is to use bind variables. Bind variables are "substitution" variables that are used in place of literals (such as 674, 234, 332) and that have the effect of sending exactly the same SQL to Oracle every time the query is executed. For example, in our application, we would just submit
SELECT fname, lname, pcode FROM cust WHERE id = :cust_no;
and this time we would be able to reuse the execution plan every time, reducing the latch activity in the SGA, and therefore the total CPU activity, which has the effect of allowing our application to scale up to many users on a large dataset.
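As a minimal sketch of how bind variables look in practice (using the cust table from the example above), first in SQL*Plus and then in PL/SQL dynamic SQL, where the USING clause does the binding so the SQL text stays identical across executions:

-- SQL*Plus: declare a bind variable and reference it with a colon
variable cust_no number
exec :cust_no := 674
select fname, lname, pcode from cust where id = :cust_no;

-- PL/SQL dynamic SQL: the value is bound via USING, so the statement
-- text never changes and the plan is reused
declare
  l_id    cust.id%type := 674;
  l_fname cust.fname%type;
begin
  execute immediate 'select fname from cust where id = :1'
    into l_fname using l_id;
end;
/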
Conclusion on Bind Variables
Lastly, it's worth bearing in mind that there are some instances where bind variables are probably not appropriate, usually where instead of your query being executed many times a second (as with OLTP systems) your query in fact takes several seconds, or minutes, or hours to execute, a situation you get in decision support and data warehousing. In this instance, the time taken to hard parse your query is only a small proportion of the total query execution time, and the benefit of avoiding a hard parse is probably outweighed by the reduction in important information you're making available to the query optimizer: by substituting the actual predicate with a bind variable, you're removing the optimizer's ability to compare your value with the data distribution in the column, which might make it opt for a full table scan or an index when this isn't appropriate. Oracle 9i helps deal with this using a feature known as bind variable peeking, which allows Oracle to look at the value behind a bind variable to help choose the best execution plan.
Another potential drawback with bind variables and data warehousing queries is that the use of bind variables disallows the potential for star transformations, taking away this powerful option for efficiently joining fact and dimension tables in a star schema.
Tablespaces
Oracle has tablespaces, inside which tables, indexes, and materialised views are created.
I am not yet clear on how data can be kept in different tablespaces for partitions.
Below are the types of tablespaces:
Permanent tablespaces
Permanent tablespaces are used to store user data and user created objects like tables, indexes and materialized views. Sample create statements:
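A minimal sample (the datafile path and sizes here are assumptions you would adapt to your environment):

create tablespace users_data
  datafile '/u01/app/oracle/oradata/orcl/users_data01.dbf'
  size 100m autoextend on next 10m maxsize 1g;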
Temp tablespaces
Temp or temporary tablespaces are used to store data with a short lifespan (transient data), for example global temporary tables or sort results. A sample create statement follows below.
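Again a sketch; the tempfile path and sizes are assumptions:

create temporary tablespace temp_data
  tempfile '/u01/app/oracle/oradata/orcl/temp_data01.dbf'
  size 100m autoextend on next 10m maxsize 1g;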
Undo tablespaces
Undo tablespaces are used to store "before image" data that can be used to undo transactions. See ROLLBACK. This needs to be understood in more detail, as the commit intervals we set in data warehousing tools are related to it: you should set the commit interval such that you make maximum use of your undo tablespace. A sample create statement follows below.
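A sketch with an assumed path and sizes:

create undo tablespace undo_data
  datafile '/u01/app/oracle/oradata/orcl/undo_data01.dbf'
  size 200m autoextend on;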
Locking on Tables -- Have you ever noticed ORA-00054 (resource busy and acquire with NOWAIT specified)?
Consider that you are loading a table through an ETL tool and you try to drop it; you will get this error. You can also get it while loading data if you try to drop an index. I will try to write more on locking later; a quick diagnostic query follows below.
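When you hit ORA-00054, a quick way to see which session holds the lock is to query the v$ views (a sketch; it assumes you have select privileges on these views, and 'DUMMY' stands in for whatever table you are trying to drop):

select s.sid, s.serial#, s.username, o.object_name
from   v$locked_object l
join   dba_objects     o on o.object_id = l.object_id
join   v$session       s on s.sid       = l.session_id
where  o.object_name = 'DUMMY';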
How to create a dummy table
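Note that the first INSERT below targets a table T1 that is not created in this snippet. A plausible definition matching the nine selected columns might look like this (the table and column names are assumptions):

create table t1 (
  c1 number,         -- rownum
  c2 number,         -- null for every 5th row
  c3 number,
  d1 date,
  d2 date,           -- null for every 5th row
  d3 date,
  v1 varchar2(20),
  v2 varchar2(20),   -- null for every 5th row
  v3 varchar2(20)
);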
INSERT INTO
T1
SELECT
ROWNUM,
DECODE(MOD(ROWNUM,5),0,NULL,ROWNUM),
DECODE(MOD(ROWNUM,5),0,NULL,ROWNUM),
TRUNC(SYSDATE)+TRUNC(ROWNUM/100),
DECODE(MOD(ROWNUM,5),0,NULL,TRUNC(SYSDATE)+TRUNC(ROWNUM/100)),
DECODE(MOD(ROWNUM,5),0,NULL,TRUNC(SYSDATE)+TRUNC(ROWNUM/100)),
RPAD(CHR(MOD(ROWNUM-1,26)+65)||
CHR(MOD(ROWNUM,26)+65)||
CHR(MOD(ROWNUM+1,26)+65)||
CHR(MOD(ROWNUM+2,26)+65),20,'A'),
DECODE(MOD(ROWNUM,5),0,NULL,
RPAD(CHR(MOD(ROWNUM-1,26)+65)||
CHR(MOD(ROWNUM,26)+65)||
CHR(MOD(ROWNUM+1,26)+65)||
CHR(MOD(ROWNUM+2,26)+65),20,'A')),
DECODE(MOD(ROWNUM,5),0,NULL,
RPAD(CHR(MOD(ROWNUM-1,26)+65)||
CHR(MOD(ROWNUM,26)+65)||
CHR(MOD(ROWNUM+1,26)+65)||
CHR(MOD(ROWNUM+2,26)+65),20,'A'))
FROM
DUAL
CONNECT BY
LEVEL<=200000;
CREATE TABLE dummy (
  USERID    NUMBER,
  USERNAME  VARCHAR2(20),
  USERPWD   VARCHAR2(10),
  EMAILID   VARCHAR2(20),
  FIRSTNAME VARCHAR2(10),
  LASTNAME  VARCHAR2(10),
  ISACTIVE  NUMBER);

select count(*) from dummy;
commit;
INSERT INTO dummy
SELECT srl,
name,
pwd,
LOWER(SUBSTR(name, 1, 10)) || '@abc.com',
SUBSTR(name, 1, 10),
SUBSTR(name, 11, 20),
1
FROM (
SELECT level srl,
dbms_random.string('U', 20) name,
dbms_random.string('A', 10) pwd
FROM DUAL
CONNECT BY LEVEL <= 10000000);
Disclaimer and Citations
The content here is taken from various sources found by googling, and I have given links wherever possible. I don't need in-depth information for my own use, so I have copy-pasted the basic information, along with comments from blogs and forums. I have added a lot of information according to my own understanding of the subjects. If anyone finds anything objectionable, please leave a comment.