What is Dynamic Sampling, what is its impact on the Optimizer, and how does it help generate better execution plans?
The Oracle optimizer makes
decisions based on the statistics stored in the data dictionary for the objects
referenced by a given query. Without valid statistics, the optimizer often
makes poor choices when picking an execution plan.
The main objective of dynamic sampling is to create more accurate selectivity and cardinality estimates, which, in turn, helps the CBO choose better execution plans.
When a table has no statistics and dynamic sampling is disabled, the optimizer falls back on hard-coded default values. Optimizer dynamic sampling has 11 levels, from 0 to 10; the default setting is 2. This setting ensures that the optimizer has a reasonable estimate whenever it optimizes a query that accesses a table with no statistics. The database performs dynamic sampling when the existing statistics are missing, unusable, or insufficient (unless dynamic sampling has been turned off entirely).
How Dynamic Sampling Works
There are two ways to use dynamic sampling:
- The OPTIMIZER_DYNAMIC_SAMPLING parameter can be set at the database instance level and can also be overridden at the session level with the ALTER SESSION command.
- The DYNAMIC_SAMPLING query hint can be added to invoke dynamic sampling for a single statement.
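Both approaches can be sketched as follows (the level value 4 and the employees table are only illustrative choices, not from the examples below):

```sql
-- Set the sampling level for the whole session
-- (instance-wide: ALTER SYSTEM or the spfile).
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 4;

-- Or request it for a single statement with a hint.
-- Naming a table alias in the hint scopes it to that table only.
SELECT /*+ DYNAMIC_SAMPLING(e 4) */ *
FROM   employees e
WHERE  department_id = 50;
```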
| Level | When dynamic sampling will be used | Sample size (blocks) |
|-------|------------------------------------|----------------------|
| 0 | Do not use dynamic sampling. | N/A |
| 1 | All tables that have not been analysed, but only if the unanalysed table has more than 32 blocks, no indexes, and is not partitioned. | 32 |
| 2 (default) | At least one table in the statement has no statistics. | 64 |
| 3 | The statement meets level 2 criteria, or has one or more expressions used in the WHERE clause predicates, e.g. WHERE SUBSTR(CUSTLASTNAME,1,3) or WHERE a + b = 5. | 64 |
| 4 | The statement meets level 3 criteria, or uses complex predicates (an OR or AND operator between multiple predicates on the same table). | 64 |
| 5 | Any statement that meets level 4 criteria. | 128 |
| 6 | Any statement that meets level 4 criteria. | 256 |
| 7 | Any statement that meets level 4 criteria. | 512 |
| 8 | Any statement that meets level 4 criteria. | 1024 |
| 9 | Any statement that meets level 4 criteria. | 4096 |
| 10 | All blocks, for all statements. | All blocks |
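To check which level is currently in effect, the standard dictionary view V$PARAMETER can be queried; a quick sketch:

```sql
-- Current setting of the dynamic sampling level
SELECT value
FROM   v$parameter
WHERE  name = 'optimizer_dynamic_sampling';

-- Or, from SQL*Plus:
SHOW PARAMETER optimizer_dynamic_sampling
```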
Consider the table below:
SQL> CREATE Table MyTemp AS
SELECT OWNER, OBJECT_TYPE
FROM ALL_OBJECTS;
Table created.
SQL> SELECT Count(*) From MyTemp;
COUNT(*)
----------
71879
Let’s disable dynamic sampling with the DYNAMIC_SAMPLING hint at level zero and look at the execution plan.
Disabling dynamic sampling to see the default cardinalities
SQL> SET AUTOTRACE TRACEONLY EXP
SQL> SELECT /*+ Dynamic_Sampling(MyTemp,0) */ *
2 From MyTemp;
Execution Plan
----------------------------------------------------------
Plan hash value: 2234051319
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16908 | 462K| 58 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MYTEMP | 16908 | 462K| 58 (0)| 00:00:01 |
----------------------------------------------------------------------------
As you can see, the estimated cardinality is 16,908, which is very far off from the real cardinality of about 71,879. If I permit dynamic sampling, I get a much more realistic cardinality estimate.
More-realistic cardinalities with dynamic sampling enabled
SQL> SELECT * From MyTemp;
Execution Plan
----------------------------------------------------------
Plan hash value: 2234051319
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 82504 | 2255K| 59 (2)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MYTEMP | 82504 | 2255K| 59 (2)| 00:00:01 |
----------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Now, 82,504 is much closer to 71,879. An inaccurate cardinality estimate can swing either way: here the optimizer radically underestimated the cardinality, but it can overestimate as well.
Overestimating cardinalities
SQL> DELETE From MyTemp;
71879 rows deleted.
SQL> COMMIT;
Commit complete.
SQL> SET AUTOTRACE TRACEONLY EXP
SQL> SELECT /*+ Dynamic_Sampling(MyTemp,0) */ *
2 From MyTemp;
Execution Plan
----------------------------------------------------------
Plan hash value: 2234051319
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16908 | 462K| 58 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MYTEMP | 16908 | 462K| 58 (0)| 00:00:01 |
----------------------------------------------------------------------------
SQL> SELECT * From MyTemp;
Execution Plan
----------------------------------------------------------
Plan hash value: 2234051319
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 28 | 58 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| MYTEMP | 1 | 28 | 58 (0)| 00:00:01 |
----------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Think about what might happen if the optimizer guessed 16,908 rows instead of 1 row in this case. For queries accessing table MyTemp, the optimizer would grossly overestimate the number of rows returned from MyTemp and generate inappropriate plans as a result.
Poor Cardinality Estimates
Creating the “FLAG” table and gathering statistics
SQL> CREATE Table MyTable
2 AS SELECT
3 DECODE(MOD(ROWNUM,2),0,'N','Y') Flag1,
4 DECODE(MOD(ROWNUM,2),0,'Y','N') Flag2, AO.*
5 FROM ALL_OBJECTS AO;
Table created.
SQL> SELECT Count(*) From MyTable;
COUNT(*)
----------
71880
SQL> CREATE INDEX T_IND ON MyTable(Flag1,Flag2);
Index created.
SQL> BEGIN
2 DBMS_STATS.GATHER_TABLE_STATS(USER,'MYTABLE',
3 METHOD_OPT=>'For All Indexed Columns Size 254');
4 END;
5 /
PL/SQL procedure successfully completed.
SQL> SET AUTOTRACE TRACEONLY EXP
SQL> SELECT * From MyTable Where Flag1='N' And Flag2='N';
Execution Plan
----------------------------------------------------------
Plan hash value: 1015944200
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17969 | 1772K| 299 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| MYTABLE | 17969 | 1772K| 299 (1)| 00:00:04 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAG1"='N' AND "FLAG2"='N')
Let’s use dynamic sampling
SQL> SELECT /*+ Dynamic_Sampling(MyTable 3) */ * From MyTable Where Flag1='N' And Flag2='N';
Execution Plan
----------------------------------------------------------
Plan hash value: 270124127
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 707 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| MYTABLE | 7 | 707 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_IND | 7 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FLAG1"='N' AND "FLAG2"='N')
Note
-----
- dynamic sampling used for this statement (level=2)
Note the much better row estimate (7 instead of more than 17,969), and also the very different (and now optimal) plan. Instead of a full table scan, the optimizer has decided to use the index, and this execution plan will definitely be better than the original full table scan, because in reality no data will be returned. The index range scan discovers that instantly.
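Dynamic sampling fixes this estimate per query. As an aside, since Oracle 11g the same correlated-column problem can be addressed permanently with extended statistics (a column group) via the DBMS_STATS package; a sketch, with the METHOD_OPT choice only illustrative:

```sql
-- Create a column group on the correlated columns, then re-gather
-- statistics so the optimizer learns the combined (Flag1, Flag2)
-- distribution instead of assuming independence.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'MYTABLE', '(FLAG1,FLAG2)')
FROM   DUAL;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(USER, 'MYTABLE',
    METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY');
END;
/
```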
So, not all queries benefit from dynamic sampling. Queries with very fast execution times, or ad hoc queries that will not be repeated, are not good candidates. Here are the situations that benefit most from dynamic sampling:
- A query with complex predicates that is generating a poor execution plan.
- Queries with execution times that are significantly longer than the time required to perform dynamic sampling.
- Queries that will be executed repeatedly on the database.
- Queries for which the available statistics are not enough to generate a good execution plan.
Get involved and leave your comments in the box below. The more people get involved, the more we all benefit, so leave your thoughts before you leave the page.
Source: AskTom