Oracle SQL Performance Tuning Cheatsheet


Reference

Common Commands

Flush Shared Pool, Buffer Cache

alter system flush shared_pool;
alter system flush buffer_cache;

SQL Developer

-- run with a hint to turn on monitoring for short (quick-running) SQL
select /*+MONITOR*/ * FROM apps.oe_order_lines_all;

-- turn off monitoring for SQL
select /*+NO_MONITOR*/ * FROM apps.oe_order_lines_all;

General Tips

  • CARDINALITY * LAST_STARTS ~ LAST_OUTPUT_ROWS
    • if these are roughly the same magnitude, the optimizer most likely chose a good plan
  • column correlation and data skew can cause the optimizer to choose a wrong plan
    • column correlation ==> create extended statistics
    • data skew
      • 2 forms: range skew and value skew
      • create a histogram
        • Frequency
        • Height-balanced
        • Hybrid
        • Top-Frequency
  • Make sure table stats are collected, especially after you truncate/delete and re-import data for the table
  • If there is a correlation among columns, create EXTENDED STATS to help the optimizer estimate the # of rows
  • if data is skewed (uneven distribution), create a histogram
  • Inefficient access
    • create an index
  • Repeating work
    • rewrite the SQL
  • Reading many rows but returning few
    • create a materialized view (enable query rewrite)
  • Index tips
    • keep the number of indexes small
    • when creating multi-column indexes
      • the leading column of the index should be the one most likely to be used in WHERE clauses and also the most selective column of the set
      • compressible columns first
  • Use hints as a temporary solution or last option (if you run out of ideas); see the sketch after this list
    • Some common hints:
      • INDEX, ORDERED, USE_HASH, PARALLEL, ALL_ROWS, FIRST_ROWS, APPEND, CACHE, RESULT_CACHE
  • Triggers on tables can cause performance issues for DML (insert/update/delete)
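
A minimal sketch of hint syntax, using hypothetical table and index names (emp, dept, emp_dept_idx); the hints only request an access path or join strategy, they do not guarantee it:

-- force an index access path (emp_dept_idx is a hypothetical index)
SELECT /*+ INDEX(e emp_dept_idx) */ *
FROM   emp e
WHERE  e.dept_id = 10;

-- force a join order and join method
SELECT /*+ LEADING(d e) USE_HASH(e) */ *
FROM   dept d, emp e
WHERE  e.dept_id = d.dept_id;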

General Concept

What is a good plan for the Optimizer

The optimizer has 2 different goals

  • Serial execution: it is all about cost
    • the cheaper, the better
  • Parallel execution: it is all about performance
    • the faster, the better
| Term | Description |
|------|-------------|
| Cost | The optimizer's cost model accounts for the I/O, CPU and network resources that will be used by the query. It is an estimate of the amount of CPU and disk I/O used to perform an operation. The cost is an internal unit, displayed only to allow plan comparisons; it is used to compare different plans for the same query. |
| Performance | The fastest possible response time for a query; the goal is to complete the query as quickly as possible. The optimizer does not focus on the resources needed to execute the plan. |
| AWR | Automatic Workload Repository |
| STS | SQL Tuning Set |
| SPM | SQL Plan Management (SQL plan baselines) |
| ROWID | Datafile ==> Data block ==> location of row |
| ASSM | Automatic Segment Space Management tablespace type, designed to avoid contention caused by concurrent inserts from different sessions |

What to Look For in Execution Plan

  • Full table scan (FTS)
  • Join Methods, Join Order
  • Index Access Methods
  • Filters
  • Parallel Operations
  • Partition Processing
  • Dynamic Statistics
  • Cost

Execution Plan Components

| Component | Description |
|-----------|-------------|
| Cardinality | Estimate of the number of rows coming out of each operation. Cardinality for a single-column equality predicate = (total # of rows) / (# of distinct values). It influences everything: access method, join type, join order, etc. |
| Access method | The way in which the data is being accessed, via either a table or index access |
| Join method | The method (e.g. hash, sort-merge, etc.) used to join tables with each other |
| Join type | The type of join (e.g. outer, anti, semi, etc.) |
| Join order | The order in which the tables are joined to each other |
| Partition pruning | Are only the necessary partitions being accessed to answer the query? |
| Parallel execution | In case of parallel execution, is each operation in the plan being conducted in parallel? Is the right data redistribution method being used? |

Data Access Method

Ref: https://techiedba.wordpress.com/2011/08/12/data-access-methods-in-oracle/

  • Full Table Scan (FTS)
  • Table Access by ROWID
  • Index Unique Scan
  • Index Range Scan
  • Index Full Scan
    • In certain circumstances it is possible for the whole index to be scanned when no constraining predicate is supplied
    • it performs single-block I/Os, so it may prove to be inefficient
  • Index Fast Full Scan
    • Scans all the blocks in the index; rows are not returned in sorted order
    • Uses multiblock I/O and can be executed in parallel
  • Index Skip Scan
    • The optimizer can perform skip scans to retrieve rowids for queries that do not use the leading (prefix) column of a concatenated index
    • Initiated by probing the index for distinct values of the prefix column. Each of these distinct values is then used as a starting point for a regular index search
  • Index Joins
  • Hash Access
  • Cluster Access
  • Bit Map Index

Join Method

Ref: https://www.youtube.com/watch?v=pJWCwfv983Q

  • Hash join
  • Nested Loop
  • (Sort) Merge join
  • Cartesian join

Nested loop joins are better for joining small result sets where the join columns are indexed. Hash joins are better for joining larger result sets where one or more indexes are missing on the join columns. (Sort) merge joins are better than nested loop joins for joining larger result sets, but typically not as good as hash joins.

| Join Method | Description |
|-------------|-------------|
| Hash join | Used for joining large data sets. The optimizer uses the smaller of the two tables or data sources to build a hash table in memory, based on the join key. It then scans the larger table and performs the same hashing algorithm on the join column(s). It then probes the previously built hash table for each value and, if they match, returns a row. |
| Nested loop | Useful when small subsets of data are being joined and there is an efficient way of accessing the second table (for example, an index lookup). For every row in the first table (the outer table), Oracle accesses the matching rows in the second table (the inner table). |
| Sort merge | Useful when the join condition between two tables is an inequality condition such as <, <=, > or >=. Sort merge joins can perform better than nested loop joins for large data sets. A sort merge join is more likely to be chosen if there is an index on one of the tables that will eliminate one of the sorts. |
| Cartesian join | The optimizer joins every row from one data source with every row from the other data source. Typically this is only chosen if the tables involved are small, or if one of the tables has no join condition to any other table in the statement. |

Nested Loop

  • First we return all the rows from row source 1 (Driving Table), typically the smaller of the two row sources
  • Then we probe row source 2 (Inner Table) once for each row returned from row source 1
  • Good for joining smaller row sources
  • Best used with indexed foreign key columns

The optimizer may choose nested loops when:

  • Joining small data sets
  • Fetching a small fraction of large data sets
  • The query of the inner table is efficient (e.g. there is an index on its join columns)
  • All the join criteria are inequalities (not equals)
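
A hedged sketch of a query likely to get nested loops; orders/order_items are hypothetical names, and an index on order_items.order_id is assumed. The hints only request the method:

-- the outer table orders drives; each of its rows probes the index on order_items
SELECT /*+ LEADING(o) USE_NL(i) */ o.order_id, i.line_id
FROM   orders o
JOIN   order_items i ON i.order_id = o.order_id
WHERE  o.order_id = 42;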

Hash Join

The steps to do a hash join are:

  1. Return all the rows from the smaller data set
  2. Build a hash table using the join columns for these rows
  3. Read the rows in the larger table
  4. Probe the hash table built at step 2 by applying the same hash function to the join columns of the second table
  5. If this finds a matching entry in the hash table, the database passes the joined rows to the next step of the plan
  • Smallest row source is chosen and used to build a hash table (in memory) and a bitmap
  • The second row source is hashed and checked against the hash table looking for joins. The bitmap is used as a quick lookup to check if rows are in the hash table
  • Good for joining larger row sources
  • Needs PGA memory

There must be equality "=" conditions in the join for the optimizer to choose a hash join. When there is, it may choose this join method when joining either:

  • Large data sets OR
  • Most of the rows in small tables OR
  • There are no indexes on the join columns for either table
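
A hedged sketch (small_tab/big_tab are hypothetical names): the equality join condition makes a hash join possible, and USE_HASH requests it:

-- the optimizer builds the hash table on the smaller input, then probes it with big_tab
SELECT /*+ USE_HASH(b) */ a.id, b.val
FROM   small_tab a
JOIN   big_tab b ON b.id = a.id;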

Hash Join vs. Nested Loop

This highlights a key difference between hash joins and nested loops. A hash join must read all the rows in the first data set to build the hash table before it can start reading the second table.

Nested loops can read rows in the inner table after reading just one row from the outer table. Provided the lookup of the inner table is fast, this means it can start returning rows faster than a hash join.

This makes nested loop operations the best way to get a small fraction of large data sets. For example, when doing top-N queries or master-detail joins, such as orders to order items.

Merge Join

  • Sort the rows in the first data set
  • Sort the rows in the second data set
  • For each row in the first data set, find a starting row in the second data set
  • Read the second data set until you find a row with a value greater than the current row in the first data set
  • Read the next row in the first data set and repeat

The optimizer may choose a merge join when:

  • The join uses range comparisons (<, >=, between, etc.)
  • The join has an equality (=) and one of the data sets is already ordered, enabling the database to avoid a sort

Sorting is slow, so it's rare for the optimizer to use a merge join.

Indexes are ordered data structures. So if there is an index on the join columns for one table, the optimizer can use this to avoid a sort. Oracle Database will always sort the second data set, even if there are indexes on the second table's join columns.
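
A hedged sketch (t1/t2 are hypothetical names): a range join condition, mirroring the conditions above, with USE_MERGE requesting the method:

-- both inputs are sorted on the join column, then merged
SELECT /*+ USE_MERGE(t2) */ t1.id, t2.id
FROM   t1
JOIN   t2 ON t2.created_date >= t1.created_date;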

Merge Join Cartesian

The optimizer considers a merge join Cartesian if:

  • It's joining every row in one table to every row in another (a cross join)
  • It expects to get exactly one row from the first table and join it to every row in the second table
  • It may generate the Cartesian product of two tiny tables before joining these to a huge table

It's rare to see this join method. And it's often a sign of a problem. For example missing join criteria or out-of-date statistics.

Adaptive Plan

  • It can be tough to choose between nested loops and hash joins. Provided there is an index on the join columns, nested loops are fast when you fetch few rows from the first table.

  • But nested loops query the inner table once for each row from the outer table. As you fetch more rows from the first table, you quickly reach a point where it's faster to use a hash join. This fetches all the rows from the second table once.

  • To handle this, Oracle Database 12c added adaptive plans. This is one plan which considers two different join methods. You can see this query uses an adaptive plan by looking at the Note section:

  • Full dynamic plan displayed when format parameter '+all_dyn_plan' is used

  • New adaptive optimization section shown when format parameter '+adaptive' is set

Note
-----
  - this is an adaptive plan

select * from table(dbms_xplan.display_cursor(format => '+all_dyn_plan +adaptive'));

The database can then choose the best join method at runtime. It does this by looking at the actual number of rows the query processes.

To see which joins the optimizer considered, get the plan with the +ADAPTIVE format:

select /*+ gather_plan_statistics */*
from   card_deck d1
join   card_deck d2
on     d1.val = d2.val
where  d1.notes = 'SQL is awesome!';

select * 
from   table(dbms_xplan.display_cursor(:LIVESQL_LAST_SQL_ID, format => 'IOSTATS LAST +ADAPTIVE'));

This has some key differences to a standard plan:

  • The plan includes both NESTED LOOPS and HASH JOIN operations
  • There is a STATISTICS COLLECTOR operation above the first table in the plan
  • A minus ('-') appears before some of the steps in the plan

Statistics Collector

  • This operation watches the number of rows flowing out of the table below it. Provided this number stays under a threshold, the optimizer will use nested loops. But if the number of rows exceeds this threshold, it'll switch to a hash join.

Join Type

  • Inner join
  • Left/right outer join
  • Full outer join
  • Anti join
  • Semi join
  • Grouped outer join
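
For illustration, semi and anti joins usually arise from EXISTS/IN and NOT EXISTS/NOT IN predicates; orders/order_lines are hypothetical names:

-- semi join: each order is returned at most once, however many lines match
SELECT o.order_id
FROM   orders o
WHERE  EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.order_id);

-- anti join: orders with no matching line
SELECT o.order_id
FROM   orders o
WHERE  NOT EXISTS (SELECT 1 FROM order_lines l WHERE l.order_id = o.order_id);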

Filters

  • Restrictions in the WHERE clause
  • The optimizer will try to filter rows before performing joins; this is the goal

Parallel Operations

  • Some SQL processing can be broken into separate parallel processing steps
  • Parallel processing can be hinted
  • Some operations that might be seen in the execution plan:
    • PX COORDINATOR

    • PX BLOCK ITERATOR

    • PX SEND

    • PX RECEIVE

      PX: Parallel eXecution
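
A hedged sketch of forcing parallelism and viewing the PX steps, assuming the statement-level PARALLEL hint (11.2+) and the PARALLEL format option mentioned later in this document:

SELECT /*+ MONITOR PARALLEL(4) */ COUNT(*) FROM apps.oe_order_lines_all;

SELECT * FROM TABLE(dbms_xplan.display_cursor(FORMAT => 'TYPICAL +PARALLEL'));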

Partition Processing

  • Common practice to partition tables that are expected to contain a large volume of rows
  • The optimizer is partition-smart, and its plans should reflect this (Partition Pruning)
  • Partition access in the execution plan
    • PARTITION LIST SINGLE
    • PARTITION LIST

Dynamic Statistics

  • indicates object statistics are missing
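
A hedged sketch: you can also request dynamic sampling explicitly for a table without statistics (stage_import is a hypothetical name):

-- sample the table at parse time, at level 4
SELECT /*+ dynamic_sampling(t 4) */ COUNT(*)
FROM   stage_import t;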

Important Views

Oracle Database Reference 19c

| Name | Description |
|------|-------------|
| V$SQL_PLAN | A dynamic performance view, introduced in Oracle 9i, that shows the execution plan for a SQL statement that has been compiled into a cursor and stored in the cursor cache. Its definition is similar to the PLAN_TABLE. |
| V$SQL_PLAN_STATISTICS_ALL | Contains memory usage statistics for row sources that use SQL memory (sort or hash join). This view concatenates information in V$SQL_PLAN with execution statistics from V$SQL_PLAN_STATISTICS and V$SQL_WORKAREA. |
| V$SQL | Lists statistics on shared SQL areas without the GROUP BY clause and contains one row for each child of the original SQL text entered. Statistics displayed in V$SQL are normally updated at the end of query execution. However, for long-running queries, they are updated every 5 seconds. This makes it easy to see the impact of long-running SQL statements while they are still in progress. |
| V$SQLAREA | Displays statistics on shared SQL areas and contains one row per SQL string. It provides statistics on SQL statements that are in memory, parsed, and ready for execution. |
| V$SQL_SHARED_CURSOR | Explains why a particular child cursor is not shared with existing child cursors, which caused more than one child cursor to be created for the same parent cursor. Each column identifies a specific reason why the cursor cannot be shared; the columns hold "Y" or "N" values. Focus on the columns with value 'Y'. |
| V$SESSION | |
| V$SESSTAT | |
| V$STATNAME | |
| V$MYSTAT | |
| ALL_TAB_STATS_HISTORY | |
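
For example, V$MYSTAT joined to V$STATNAME gives the statistics for your own session (a minimal sketch; the statistic names shown are standard):

SELECT n.name, s.value
FROM   v$mystat s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name IN ('session logical reads', 'physical reads');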

Logical I/Os

  • Consistent (read) gets

    • These access blocks as they existed at some point in the past, usually the time the query started. This ensures that the query only returns values that existed when the query started. Any changes made after this are ignored
  • Current mode gets

    • These get the block as it exists right now. You will see these gets when changing data in INSERT, UPDATE and DELETE statements.
  • When reading a block, whether in consistent or current mode, if it is not cached in memory the database will also do a physical read. This fetches the block from disk.

  • You can view the I/O for a statement with the IOSTATS format in DBMS_XPLAN. This adds a Buffers column to the plan, which sums the consistent and current mode gets for the operations

    • The IOSTATS format reports the logical I/O in the Buffers column for each operation in the plan. These figures are cumulative.
select /*+ gather_plan_statistics */count(*) from bricks;

select *  
from   table(dbms_xplan.display_cursor(format => 'IOSTATS LAST'));

Memory

  • Sorting and joining operations need to read and write data too. Often they are able to process the rows in memory. You can see how much memory an operation needed by getting the plan with the MEMSTATS format.
    • this adds OMem, 1Mem and Used-Mem columns to the plan
    • Unlike the I/O figures (which are cumulative), these values are specific to each operation in the plan
    • The OMem and 1Mem figures are estimates; Used-Mem reports how much memory the operation actually used
    • If memory is limited, or when processing huge data sets, all the rows may not fit in memory. In this case the database needs to write the data to disk to complete the operation. Oracle Database uses the temporary tablespace to do this.
  • DBMS_XPLAN only returns metrics which have non-zero values for at least one operation in the plan. If a column is missing (e.g. Used-Mem) this means the database needed no extra memory to process the query!
select /*+ gather_plan_statistics */* from bricks
order by brick_id
fetch first 1 rows only;

select *  
from   table(dbms_xplan.display_cursor(format => 'MEMSTATS'));

Disk Reads and Writes

  • When the database is unable to sort rows in memory it writes them to temporary disk. You can see the amount of disk needed in the Used-Tmp column with the MEMSTATS format.

  • To view the number of disk reads and writes, check the Reads and Writes columns with the IOSTATS format.

  • An easy way to display all this information is with the ALLSTATS format. This is shorthand for MEMSTATS IOSTATS ROWSTATS.

-- The alter session statements in this section limit the amount of memory available for joins and sorts. 
-- So these operations will need to write to disk
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 25000;

select /*+ gather_plan_statistics */ row_count from complex_query;

select *  
from   table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));

The following statistics in the execution plan are cumulative:

  • buffer/consistent gets
  • disk reads
  • disk writes
  • time

The following statistics in the execution plan are specific to each operation (NOT cumulative):

  • Rows
  • Memory Used

Clustering Factor

  • By default, tables are heap organized in Oracle Database. This means that the database is free to place rows wherever it wants
  • Indexes are ordered data structures
  • The closer the physical order of rows matches the logical order of an index, the more effective that index will be
  • The smallest unit of I/O in Oracle Database is a block. The more consecutive index entries there are that point to the same block, the more rows you can fetch in one I/O
  • When determining how effective an index is, it is the number of I/O operations that matters, not how many rows it accesses
  • The clustering factor is a measure of how closely the logical index order matches the physical table order of rows. The database calculates this when gathering stats.
    • Is the row for the current index entry in the same block as the previous entry or a different one?
    • Each time consecutive index entries are in different blocks, the optimizer increments a counter.
    • The clustering factor's value ranges from the number of blocks in the table to the number of rows
    • The default calculation for the clustering factor is pessimistic. This means that the optimizer is unlikely to use indexes on mostly clustered values, even when they are faster than a full table scan.
    • The lower this value, the better clustered the rows are and the more likely the database is to use the index
  • You can view the clustering factor by querying the *_INDEXES views
  • Increasing the value of the TABLE_CACHED_BLOCKS preference helps the optimizer spot mostly clustered values.
  • The TABLE_CACHED_BLOCKS preference tells the optimizer to keep track of how recently it saw a value.
    • default value is 1

    • Use below to change it

      select dbms_stats.get_prefs ( 'table_cached_blocks', null, 'bricks' )
      from   dual;
      
      begin 
      dbms_stats.set_table_prefs ( 
          null, 'bricks', 'table_cached_blocks', 16
      );
      end;
      /
      • You need to regather stats for this change (TABLE_CACHED_BLOCKS) to take effect
      select index_name, clustering_factor 
      from   user_indexes
      where  table_name = 'BRICKS';
      
      exec dbms_stats.gather_table_stats ( null, 'bricks' ) ;
      
      select index_name, clustering_factor
      from   user_indexes
      where  table_name = 'BRICKS';
    • It's likely many tables in your database will benefit from increasing TABLE_CACHED_BLOCKS. You can set a default for all tables in a schema or database with:

      dbms_stats.set_schema_prefs ( '<schema name>', 'table_cached_blocks', <value> );
      dbms_stats.set_database_prefs ( 'table_cached_blocks', <value> );
    • Values you set at the schema level override database settings. And table level settings take priority over both. So you can set the default for the current schema to 16 with:

      exec dbms_stats.set_schema_prefs ( user, 'table_cached_blocks', 16 ); 
    • And increase or decrease this for specific tables as needed with DBMS_stats.set_table_prefs.

    • Avoid setting TABLE_CACHED_BLOCKS too high; sometimes you have to physically re-order the rows in the table instead

Change Physical Data Model

Re-ordering rows in a table is impractical in most applications. Luckily, Oracle Database has several data structures you can use to force a physical order on data:

  • Index-organized tables
  • Partitioning
  • Table clusters

These all have various pros and cons, and they require a rebuild to add to existing tables, making them impractical for most existing tables. There is another, more lightweight technique available from 12.2:

  • Attribute Clustering
    • Attribute clustering defines a physical order for rows. Using a linear order sorts the values in the same way as ORDER BY; it works through the columns left-to-right, only using the next column to resolve ties.
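
A minimal sketch of adding attribute clustering to the bricks table used in other examples here; existing rows are only re-ordered when the table is moved:

ALTER TABLE bricks
  ADD CLUSTERING BY LINEAR ORDER (colour);

-- re-organize the existing rows to match the clustering definition
ALTER TABLE bricks MOVE;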

Important DB parameters

>show parameters optimizer
| Name | Description |
|------|-------------|
| DB_FILE_MULTIBLOCK_READ_COUNT | Specifies the maximum number of blocks read in one I/O operation during a sequential scan. It is one of the parameters you can use to minimize I/O during table scans. The total number of I/Os needed to perform a full table scan depends on factors such as the size of the table, the multiblock read count, and whether parallel execution is being used for the operation. |
| TABLE_CACHED_BLOCKS | DB 12.1+; the average number of blocks we can assume are cached in the buffer cache for any table when gathering the index clustering factor (a DBMS_STATS preference rather than an init parameter) |
| OPTIMIZER_INDEX_CACHING | |
| OPTIMIZER_INDEX_COST_ADJ | |
| OPTIMIZER_MODE | |
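
A hedged sketch for checking the initialization parameters above from SQL instead of the SQL*Plus SHOW command:

SELECT name, value
FROM   v$parameter
WHERE  name IN ('db_file_multiblock_read_count',
                'optimizer_mode',
                'optimizer_index_caching',
                'optimizer_index_cost_adj');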

Related package

| Procedure | Arguments |
|-----------|-----------|
| dbms_xplan.display() | plan table name (default 'PLAN_TABLE'); statement_id (default NULL, meaning the last statement inserted into the plan table); format: controls the amount of information displayed (default 'TYPICAL') |
| dbms_xplan.display_cursor() | sql_id (default NULL, meaning the last SQL statement executed in this session); child_number (default 0); format: controls the amount of information displayed (default 'TYPICAL') |

Formatting the execution plan: there are three pre-defined formats available:

  • BASIC: the plan includes only the ID, operation, and object name columns.
  • TYPICAL: includes the information shown in BASIC plus additional optimizer-related internal information such as cost, cardinality estimates, etc. This information is shown for every operation in the plan and represents what the optimizer thinks is the operation cost, the number of rows produced, etc. It also shows the predicates evaluated by each operation. There are two types of predicates: ACCESS and FILTER. The ACCESS predicates for an index are used to fetch the relevant blocks by applying search criteria to the appropriate columns. The FILTER predicates are evaluated after the blocks have been fetched.
  • ALL: includes the information shown in TYPICAL plus the lists of expressions (columns) produced by every operation, the hint alias and query block names where the operation belongs (the outline information). The last two pieces of information can be used as arguments to add hints to the statement.

DBMS_ROWID lets you create ROWIDs and obtain information about ROWIDs from PL/SQL programs and SQL statements. You can find the data block number, the object number, and other ROWID components without writing code to interpret the base-64 character external ROWID, e.g. dbms_rowid.rowid_block_number().
  • DBMS_XPLAN.DISPLAY takes 3 parameters

    • plan table name (default 'PLAN_TABLE')
    • statement id (default null)
    • format (default 'TYPICAL')
  • DBMS_XPLAN.DISPLAY_CURSOR takes 3 parameters

    • SQL_ID (default last statement executed in this session)
    • Child number (default 0)
    • format (default 'TYPICAL')
  • Format is highly customizable - Basic, Typical, All
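
For DBMS_ROWID (listed above), a minimal sketch mapping the rows of the bricks table used elsewhere in these notes to their file and block numbers:

SELECT dbms_rowid.rowid_relative_fno(rowid) AS file#,
       dbms_rowid.rowid_block_number(rowid) AS block#,
       COUNT(*)                             AS rows_in_block
FROM   bricks
GROUP BY dbms_rowid.rowid_relative_fno(rowid),
         dbms_rowid.rowid_block_number(rowid);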

How to generate an execution plan

  1. EXPLAIN PLAN command
    • Displays an execution plan for a SQL statement without actually executing the statement
  2. V$SQL_PLAN
    • A dictionary view introduced in Oracle 9i that shows the execution plan for a SQL statement that has been compiled into a cursor in the cursor cache

Under certain conditions, the plan shown by EXPLAIN PLAN can be different from the plan in V$SQL_PLAN:

  • bind values
    • EXPLAIN PLAN doesn't do bind peeking and assumes an even distribution of values
    • V$SQL_PLAN reflects bind peeking (a sketch follows)
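
A hedged sketch of the difference, reusing the employee view from the examples below; with a bind variable, EXPLAIN PLAN cannot peek, while the cursor's plan can:

VARIABLE v_site VARCHAR2(10)
EXEC :v_site := 'KEI'

-- EXPLAIN PLAN: no bind peeking, assumes uniform distribution
EXPLAIN PLAN FOR
SELECT * FROM apps.ke_hr_employee_info_v WHERE site = :v_site;
SELECT * FROM TABLE(dbms_xplan.display);

-- actual cursor: the peeked bind value is visible with +PEEKED_BINDS
SELECT * FROM apps.ke_hr_employee_info_v WHERE site = :v_site;
SELECT * FROM TABLE(dbms_xplan.display_cursor(FORMAT => 'TYPICAL +PEEKED_BINDS'));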

Explain Plan

  • Using the EXPLAIN PLAN command & the dbms_xplan.display function
EXPLAIN PLAN FOR
SELECT
    *
FROM
    apps.ke_hr_employee_info_v
WHERE
    site = 'KEI';
    
SELECT * FROM TABLE(dbms_xplan.display('plan_table',NULL, 'basic'));
SELECT * FROM TABLE(dbms_xplan.display('plan_table',NULL, 'typical'));
SELECT * FROM TABLE(dbms_xplan.display('plan_table',NULL, 'all'));

Execution Plan

-- basic info
SELECT
    *
FROM
    apps.ke_hr_employee_info_v
WHERE
    site = 'KEI';

-- MAKE SURE 'set serveroutput off' before run below    
SELECT * FROM TABLE(dbms_xplan.display_cursor(NULL,NULL, 'basic'));

-- advanced info

-- ====================================================================
-- EITHER
-- ====================================================================
--set statistics_level to all in the session
alter session set statistics_level = all;               -- equivalent to using gather_plan_statistics hint (in below)
alter session set statistics_level = typical;
alter session set statistics_level = basic;

>show parameter statistics_level

-- ====================================================================
-- OR
-- ====================================================================
-- you can add below hint to your query
-- Moral of the story: Turn off serveroutput before using GATHER_PLAN_STATISTICS in SQL Plus.
-- https://connor-mcdonald.com/2016/01/29/common-gather_plan_statistic-confusion/
SELECT /*+ gather_plan_statistics  */
    *
FROM
    apps.ke_hr_employee_info_v
WHERE
    site = 'KEI';

-- The GATHER_PLAN_STATISTICS hint allows for the collection of extra metrics during the execution of the query. 
-- Specifically, it shows us the Optimizer's estimated number of rows (E-Rows) and the actual number of rows (A-Rows) for each row source. 
-- If the estimates are vastly different from the actual, one probably needs to investigate why.

/*
- ALLSTATS - Shorthand for IOSTATS MEMSTATS ROWSTATS. This includes details such as how many rows and consistent gets each step used.
- LAST - Only display the stats for the last execution. Otherwise this defaults to giving you the information for every execution.
- PARTITION - Include the Pstart and Pstop columns. This tells you which partitions Oracle accessed. Only needed if you're querying partitioned tables!
- PARALLEL - Again, only relevant for parallel statements!
- NOTE - includes the note section. This provides extra information, such as whether Oracle used features like dynamic statistics, adaptive plans, etc.
*/

-- outline: https://oracle-base.com/articles/misc/outlines

-- the "leading" hint in outline result tells you the JOIN ORDER

SELECT *
FROM TABLE(dbms_xplan.display_cursor(FORMAT=>'allstats last all +outline'));

SELECT *
FROM TABLE(dbms_xplan.display_cursor(sql_id => '5gr39rg66857y', FORMAT=>'allstats last all +outline'));

SELECT *
FROM TABLE(dbms_xplan.display_cursor(FORMAT=>'allstats last all +cost +bytes'));            

SELECT *
FROM TABLE(dbms_xplan.display_cursor(FORMAT=>'allstats last all +peeked_binds +cost +bytes'));

-- also possible to use the low level arguments to exclude information from the plan
-- customized TYPICAL plan with suppressed options COST and BYTES
SELECT *
FROM TABLE(dbms_xplan.display_cursor(FORMAT=>'typical -cost -bytes'));

Find SQL ID

SELECT /* KEVIN_EXAMPLE */
    *
FROM
    apps.ke_hr_employee_info_v
WHERE
    site = 'KEI';

SELECT 
  sql_id, 
  sql_text, 
  sql_fulltext,
  hash_value,
  plan_hash_value
FROM   
    v$SQL
WHERE  
    sql_text LIKE '%KEVIN_EXAMPLE%'
AND sql_text NOT LIKE '%v$sql%';

Gather Table Stats

BEGIN
    dbms_stats.gather_table_stats(
            ownname => 'repositoryapex',
            tabname => 'ke_cart_header');
END;

/*
Gathering stats on a table may change the optimizer's row estimates for queries against it. If you have lots of queries accessing a table, updating stats may cause the database to have to create a new plan for them all. This adds significant workload to your database!

So the optimizer decides when to invalidate the cursors. This means you may not see plan changes immediately after gathering stats on a table.

To force the optimizer to check queries for new plans immediately, set NO_INVALIDATE to FALSE when calling gather stats:
*/
BEGIN
    dbms_stats.gather_table_stats(
        ownname         => 'repositoryapex',
        tabname         => 'ke_cart_header',
        no_invalidate   => false
    );
END;

/*
When getting plans in these tutorials you may find DBMS_XPlan reporting the old, "wrong" plan. In these cases making small changes to the query (e.g. adding column aliases) forces the optimizer to reparse the query.

Remember: invalidating cursors when gathering stats will cause the optimizer to reparse all queries on this table. Use with caution in production environments!
*/

-- manually set numrows stat for table
BEGIN
    dbms_stats.set_table_stats(
    ownname => 'repositoryapex',
    tabname => 'ke_cart_header',
    numrows => 100
    );
END;

-- manually set stats for table column
BEGIN
    dbms_stats.set_column_stats(
        ownname => 'repositoryapex',
        tabname => 'ke_cart_header',
        colname => 'capital_type',
        distcnt => 200
    );
END;

-- set table stats preference
BEGIN
    dbms_stats.set_table_prefs(
        ownname => 'repositoryapex',
        tabname => 'ke_cart_header',
        pname   => 'stale_percent',
        pvalue  => 5                        -- 5%
    );
END;

-- get table stats preference
SELECT
    dbms_stats.get_prefs(
        ownname => 'repositoryapex',
        tabname => 'ke_cart_header',
        pname   => 'stale_percent'
    )
FROM dual;

Histogram

http://expertoracle.com/2017/11/28/db-tuning-basics-8-histograms/

Type of Histogram

  • Frequency
  • Height-balanced
  • Hybrid
  • Top-frequency

KEY DATABASE PARAMETERS RELATED TO HISTOGRAMS

  • METHOD_OPT

    This is the parameter which tells the Oracle engine to create histograms while collecting statistics.

    The default value of METHOD_OPT from 10g onwards is 'FOR ALL COLUMNS SIZE AUTO'. Below are the possible values for the SIZE clause:

    • AUTO: Oracle determines the columns on which to collect histograms based on the data distribution and the workload of the columns.
    • REPEAT: Collects histograms only on the columns that already have histograms.
    • SKEWONLY: Oracle determines the columns on which to collect histograms based on the data distribution of the columns.
    • INTEGER VALUE: Number of histogram buckets; any value between 1 and 254. Note that setting 1 disables histogram collection.

    The default basically implies that Oracle will automatically create histograms on those columns which have a skewed data distribution and are referenced by SQL statements.

When to create histogram

  • Create histograms in the following situations:

    1. The column is included in WHERE clauses
    2. AND the column data is highly skewed

==> Note that histograms are not just for indexed columns but also for non-indexed columns

  • Do not create histograms in the following situations:

    1. Data in the column is evenly distributed.
    2. The column is not used in a WHERE clause.
    3. Do not create them on every column of the table, as this will increase stats collection time and parsing time, and can also cause the optimizer to generate bad plans.
    4. Do not create histograms on primary keys.
    5. Do not create a histogram when the column is unique and is used only with equality predicates.

Tips for Histograms

  • Frequency histograms are more precise than height-balanced histograms.
  • Histograms are stored in the dictionary and carry maintenance and space costs, so create histograms only when required.
  • Histograms are not gathered on columns for which there is no predicate information in the col_usage$ table.
  • col_usage$ is only populated when columns are referenced in predicates and hard parsing occurs.
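
To see which histogram (if any) a column ended up with, query *_TAB_COL_STATISTICS (a minimal sketch, reusing a table name from the examples below):

SELECT column_name, num_distinct, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'KE_SPT_PROJECT';
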
-- verify your current setting
select dbms_stats.get_prefs ('METHOD_OPT') from dual;

-- to change your global setting to "REPEAT"
exec dbms_stats.set_global_prefs ('METHOD_OPT','FOR ALL COLUMNS SIZE REPEAT');

-- disable histogram collection globally
exec dbms_stats.set_global_prefs ('METHOD_OPT','FOR ALL COLUMNS SIZE 1');

-- delete histogram for a column
BEGIN
    dbms_stats.delete_column_stats(
    ownname         => 'repositoryapex',
    tabname         => 'ke_spt_project',
    colname         => 'project_type',
    col_stat_type   => 'HISTOGRAM'
    );
END;

-- disable histogram for a column
BEGIN
    dbms_stats.set_table_prefs(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for columns size 1 project_type'
    );
END;

-- enable histogram for a column
BEGIN
    dbms_stats.set_table_prefs(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for columns size auto project_type'
    );
END;

-- enable histogram for multiple columns
BEGIN
    dbms_stats.set_table_prefs(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for columns project_type, project_status'
    );
END;

-- disable histogram for all columns for a table
BEGIN
    dbms_stats.set_table_prefs(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for all columns size 1'
    );
END;

-- delete/drop histogram for a column
BEGIN
    dbms_stats.delete_column_stats(
        ownname         => 'repositoryapex',
        tabname         => 'ke_spt_project',
        colname         => 'project_status',
        col_stat_type   => 'HISTOGRAM'
    );
END;

-- continue collecting histograms for all columns except project_type column
BEGIN
    dbms_stats.set_table_prefs(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for all columns size auto, for columns size 1 project_type'
    );
END;

BEGIN
    dbms_stats.gather_table_stats(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for columns (division, project_type)'
    );
END;

BEGIN
    dbms_stats.gather_table_stats(
    ownname => 'repositoryapex',
    tabname => 'ke_spt_project',
    method_opt => 'for all columns size skewonly for columns (division, project_type)'
    ); 
END;

Extended statistics

-- for example, some division is always creating certain project type
SELECT
    dbms_stats.create_extended_stats(
        ownname => 'repositoryapex',
        tabname => 'ke_spt_project',
        extension => '(division, project_type)'
    )
FROM dual;

-- create extended stats for a function-based expression
select count(*) from bricks where upper(colour) = 'RED';

BEGIN
    dbms_stats.gather_table_stats(
    ownname     => 'xxx',
    tabname     => 'bricks',
    method_opt  => 'for columns (upper(colour))'
    ); 
END;

-- create extended stats on an arbitrary expression
select 
    dbms_stats.create_extended_stats ( null, 'bricks', '(weight + 10)' )
from dual;

select /*+ gather_plan_statistics */count (*) from bricks
where  weight + 2 = 3;


-- DB 12.1+ : change TABLE_CACHED_BLOCKS value
BEGIN
    dbms_stats.set_table_prefs(
        ownname => 'xxx',
        tabname => 'bricks',
        pname   => 'TABLE_CACHED_BLOCKS',
        pvalue  => 10
    );
END;

Autotrace

| Autotrace Setting | Result |
|-------------------|--------|
| SET AUTOTRACE OFF | No AUTOTRACE report is generated. This is the default. |
| SET AUTOTRACE ON EXPLAIN | The AUTOTRACE report shows only the optimizer execution path. |
| SET AUTOTRACE ON STATISTICS | The AUTOTRACE report shows only the SQL statement execution statistics. |
| SET AUTOTRACE ON | The AUTOTRACE report includes both the optimizer execution path and the SQL statement execution statistics. |
| SET AUTOTRACE TRACEONLY | Like SET AUTOTRACE ON, but suppresses the printing of the user's query output, if any. If STATISTICS is enabled, query data is still fetched, but not printed. |
SET autotrace traceonly explain
-- SET autotrace traceonly stat
-- SET autotrace on explain
SELECT
    *
FROM
    apps.ke_hr_employee_info_v
WHERE
    site = 'KEI';

SET autotrace OFF    

SQL Monitor

Youtube - Get the Plan in SQL Developer

  • Not all queries appear by default. Oracle only captures those lasting longer than five seconds (5s) or those running in parallel
    • to get around this, one way is to add the MONITOR hint:
          SELECT /*+ monitor */ ... FROM ..
  • You need the Diagnostics and Tuning pack licenses to use it
-- generate text output for SQL Monitor of a given sql_id
SELECT
    dbms_sqltune.report_sql_monitor(
        sql_id      => '&your_sql_id',
        type        => 'TEXT',
        report_level=> 'ALL'
    ) as sql_monitor_output
FROM
    DUAL;

TKPROF

  • TKPROF is a command line utility that analyzes trace files and turns them into readable form.
  • It gives you the execution stats for all the SQL in the file
  • Tracing adds overhead. So avoid enabling it for the whole database. And remember to disable it when you've finished!

1. Generate session trace

Method A: SQL Trace

-- A SQL trace captures run time statistics for individual SQL statements.
-- It doesn't give a breakdown of PL/SQL execution times.

-- turn on session SQL tracing
alter session set sql_trace = true;

-- turn off session SQL tracing
alter session set sql_trace = false;

-- In 11g and higher, sql_trace is also an event and can be set with event syntax. Example:

alter session set events 'sql_trace bind=true';
alter session set events 'sql_trace bind=true, wait=true';

-- 10046 is special EVENT code. The 10046 trace is the equivalent of setting SQL_TRACE=TRUE. 
-- The advantage of using the event is that extra details may be output to the trace file depending on the level specified with the event.

ALTER SESSION SET EVENTS='10046 trace name context forever, level 1';
...your statement here...
ALTER SESSION SET EVENTS '10046 trace name context off';

-- In 11g and higher, trace can be restricted to a specific SQL ID
alter system set events 'sql_trace [sql: sql_id=4k1jlmn567cr7] bind=true, wait=true';

-- multiple SQL IDs can be specified using the pipe symbol as a separator
ALTER SYSTEM SET EVENTS 'sql_trace [sql: sql_id=5t6ygtsa3d356|6fa43fgg0rrtp] bind=true, wait=true';


alter session set events '10046 trace name context forever';    -- LEVEL 1 BASIC TRACE
alter session set events '10046 trace name context forever, level 8'; -- LEVEL 8 (WAITS) trace
alter session set events '10046 trace name context forever,level 12'; -- LEVEL 12 (BINDS & WAITS) trace
alter session set events '10046 trace name context off';     -- SWITCH OFF THE TRACE

Method B: Tracing using DBMS_MONITOR/DBMS_SESSION

-- ref: https://docs.oracle.com/database/121/ARPLS/d_monitor.htm#ARPLS67178

DBMS_MONITOR
alter session set tracefile_identifier = 'KZHANG';

BEGIN
    dbms_monitor.session_trace_enable(
        session_id  => 7,
        serial_num  => 4634,
        waits       => TRUE,
        binds       => TRUE
    );
END;
/

BEGIN
    dbms_monitor.session_trace_disable(
        session_id  => 7,
        serial_num  => 4634
    );
END;
/

-- quick version to trace your own session

exec DBMS_monitor.session_trace_enable ( null, null, true, true );
***your code here***
exec DBMS_monitor.session_trace_disable;
DBMS_SESSION
  • DBMS_SESSION can also be used to trace any database session easily.

  • Whereas the DBMS_MONITOR package can only be invoked by a user with the DBA role

  • users can also enable SQL tracing for their own session by invoking the DBMS_SESSION.SESSION_TRACE_ENABLE procedure

  1. Log on to the database SQL*Plus as system or sys user.

  2. Find out the serial#, session id from v$session table for the user and run below command to start tracing

    execute dbms_session.set_sql_trace_in_session(sid, serial#, TRUE);

  3. Let user perform required actions which need to be traced.

  4. Disable the trace using below command:

    execute dbms_session.set_sql_trace_in_session(sid, serial#, FALSE);

  5. This will generate a .trc file in the database’s trace directory.

2. Get the session trace file

The trace file lives on the database server. 3 steps to get it:

  • 2a: figure out the database server host name you are connecting to
  • 2b: figure out which folder/directory the trace file is stored in
  • 2c: figure out which file

2a: figure out the database server host name

-- either query will give you the answer
SELECT host_name from v$instance;
SELECT sys_context('USERENV', 'SERVER_HOST') from dual;

2b: figure out folder/directory

SELECT
    name,
    value
FROM
    v$diag_info;

2c: figure out which file

  • The trace file name is normally: <instance>_ora_<ospid>_<identifier>.trc

  • To make it easier to identify your trace file (optional step, but recommended), you can do below and this identifier will be appended to the actual trace file name (e.g. ora_1234_KZHANG.trc)

alter session set tracefile_identifier = 'KZHANG';
-- this only gives the trace file name for **your own** current session
select 
    name,
    value
from 
    v$diag_info
where 
    name = 'Default Trace File';

-- to get the trace file for another session
SELECT
    p.tracefile
FROM
    v$session s,
    v$process p
WHERE
    s.paddr = p.addr
AND s.sid   = 1234;
  • Then you need to ask a DBA to fetch the trace file for you from the DB server
  • Starting 12.2, you can spool the output of the trace file from v$diag_trace_file_contents so that you can get a local copy of a trace file for a session
select * from v$diag_trace_file_contents
where trace_filename = '&p_your_trace_filename';

Format trace file

  • Once you have the trace file (*.trc), you need to parse it with TKPROF so that you can read it
  • TKPROF also includes an explain option. This shows you explain plans, not execution plans. Use it with caution
  • The output includes all the SQL statements run between you starting and stopping tracing. This includes recursive SQL, i.e. statements inside triggers, functions, etc.
  • It breaks down the execution time into parse, execution and fetch times
tkprof <trace-file-name> <output-file-name>
  • This is a great benefit if you have a slow transaction that includes several statements: you can sort them in the output file from slowest to fastest, which helps you spot which takes the longest. To do this, use the sort option
tkprof <trace_file_name> <output_file_name> sort=prsela,exeela,fchela

Materialized View

Query Rewrite

  • To allow existing queries to leverage the MV, you need to enable query rewrite for the MV (a sketch of creating such an MV appears below)

    alter materialized view brick_colours_mv
    enable query rewrite;
    • A "MAT_VIEW REWRITE ACCESS FULL" operation in the execution plan will tell you if query rewrite is happening
  • For query rewrite, by default there must be an exact match between the rows in the table and the data in the MV. If there are any changes to the base table, there is a mismatch and the optimizer will no longer do a rewrite.

  • To check if an MV is stale, you can query *_MVIEWS

    select mview_name, staleness 
    from   user_mviews;
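
A minimal sketch of such an MV; the aggregate query is illustrative, reusing the bricks table from earlier examples:

create materialized view brick_colours_mv
enable query rewrite
as
select colour, count(*) as colour_count
from   bricks
group by colour;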

Using Stale Data

  • There may be cases where you still want to use a stale MV. It's possible to let the optimizer use stale materialized views for query rewrite by changing the QUERY_REWRITE_INTEGRITY parameter. This supports these values:

    • ENFORCED - there must be an exact match between table and MV data for a rewrite to happen. This is the default.
    • TRUSTED - the database can do rewrites based on declared constraints that are disabled. This setting rarely affects MV rewrites.
    • STALE_TOLERATED - the optimizer can do a query rewrite using stale MVs
    alter session set query_rewrite_integrity = stale_tolerated;

Keeping MVs In Sync with the Table

exec dbms_mview.refresh('brick_colours_mv', 'C');

Incremental Materialized View Refresh

Instead of re-running the whole query in an MV, you can apply the changes since the last refresh. This is called a fast refresh. To do this, first you must create a materialized view log on the table

create materialized view log 
  on bricks 
  with primary key, rowid, sequence, commit scn ( 
    colour, shape, weight, insert_date 
  )
  including new values;

With the log in place, the database can bring an MV up-to-date by applying the changes in the log to the MV itself.

When defining an MV the refresh methods are:

  • COMPLETE - refresh the MV by running its query
  • FAST- apply changes in the MV log to bring it up-to-date
  • FORCE - use FAST refresh if possible and COMPLETE if not

The default refresh clause is:

REFRESH FORCE ON DEMAND

- This means you have to refresh the MV manually. When you do so, the database will pick fast refresh if it can. To ensure the database always applies changes from the MV log, set the MV to REFRESH FAST.
- You can get the database to do this automatically at the end of every transaction. Do this by setting the refresh to ON COMMIT. When you commit, the database will apply any changes in the log to the MV. To set this property the log must be empty, so first do a final complete refresh

    exec dbms_mview.refresh ( 'brick_colours_mv', 'C' );

    alter materialized view brick_colours_mv
    refresh fast on commit;

When you create ENABLE QUERY REWRITE REFRESH FAST ON COMMIT MVs, the optimizer automatically uses them where possible. So these can give huge performance gains with no code changes!

  • Sadly there are many restrictions on the types of query you can use in a fast refresh on commit MV. For example, you can't use COUNT ( DISTINCT )
  • A FAST REFRESH ON COMMIT MV also adds some overhead to every transaction using the tables in the MV. In write heavy applications this can put too much stress on the database.

Real-Time Materialized View (DB 12.2+)

A real-time MV allows a query to use its stale data. At query run time, the database applies any changes in the MV logs to get current results. To do this, the MV must be:

  • FAST REFRESH ON DEMAND
  • Use the ENABLE ON QUERY COMPUTATION clause
alter materialized view brick_colours_mv
  refresh fast on demand;
  
alter materialized view brick_colours_mv
  enable on query computation;
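
A hedged usage note: querying such an MV returns current results even when the MV is stale. The FRESH_MV hint (12.2+) can also request on-query computation when querying the MV directly; the column names below follow the MV sketch earlier in this section:

SELECT /*+ FRESH_MV */ colour, colour_count
FROM   brick_colours_mv;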