This chapter describes the recommended methodology for SQL tuning. This chapter contains the following topics:
This section contains the following topics:
Data modeling is important to successful application design. You must perform this modeling in a way that accurately represents the business practices. Heated debates may occur about the correct data model. The important thing is to apply the greatest modeling effort to those entities affected by the most frequent business transactions.
In the modeling phase, there is a great temptation to spend too much time modeling the non-core data elements, which results in increased development lead times. Modeling tools can rapidly generate schema definitions and are useful when a fast prototype is required.
In the design and architecture phase of any system development, care should be taken to ensure that the application developers understand SQL execution efficiency. To achieve this goal, the development environment must support the following characteristics:
Good database connection management
Connecting to the database is an expensive operation that does not scale. Therefore, a best practice is to minimize the number of concurrent connections to the database. A simple system, where a user connects at application initialization, is ideal. However, in a web-based or multitiered application in which application servers multiplex database connections to users, this approach can be difficult. Design these types of applications to pool database connections rather than reestablishing a connection for each user request.
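For example, on the database side, Database Resident Connection Pooling (DRCP) provides a shared pool of server processes. The following is a minimal sketch only, run as an administrative user; whether DRCP suits a given deployment is an assumption:

-- Start the default Database Resident Connection Pool (DRCP).
-- Clients then request pooled servers by adding (SERVER=POOLED) to the
-- connect descriptor, or :POOLED to an EZ Connect string.
EXEC DBMS_CONNECTION_POOL.START_POOL;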
Good cursor usage and management
Just as important as maintaining user connections is minimizing the parsing activity on the system. Parsing is the process of interpreting a SQL statement and creating an execution plan for it. This process has many phases, including syntax checking, security checking, execution plan generation, and loading shared structures into the shared pool. There are two types of parse operations:
Hard parsing: A SQL statement is submitted for the first time, and no match is found in the shared pool. Hard parses are the most resource-intensive and unscalable, because they perform all the operations involved in a parse.
Soft parsing: A SQL statement is submitted for the first time, and a match is found in the shared pool. The match can be the result of a previous execution by another user. The SQL statement is shared, which is optimal for performance. However, soft parses are not ideal, because they still require syntax and security checking, which consume system resources.
Because parsing should be minimized as much as possible, application developers should design their applications to parse SQL statements once and execute them many times. This is done through cursors. Experienced SQL programmers should be familiar with the concept of opening and re-executing cursors.
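The following PL/SQL sketch shows the parse-once, execute-many pattern using the DBMS_SQL package; the table, column, and bind names are illustrative:

DECLARE
  c      INTEGER := DBMS_SQL.OPEN_CURSOR;
  n_rows INTEGER;
BEGIN
  -- Parse once...
  DBMS_SQL.PARSE(c,
    'UPDATE employees SET salary = salary * 1.02 WHERE employee_id = :emp_id',
    DBMS_SQL.NATIVE);
  -- ...then bind and execute many times, avoiding further parse calls
  FOR id IN 100 .. 105 LOOP
    DBMS_SQL.BIND_VARIABLE(c, ':emp_id', id);
    n_rows := DBMS_SQL.EXECUTE(c);
  END LOOP;
  DBMS_SQL.CLOSE_CURSOR(c);
END;
/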
Effective use of bind variables
Application developers must also ensure that SQL statements are shared within the shared pool. To achieve this goal, use bind variables to represent the parts of the query that change from execution to execution. If this is not done, then the SQL statement is likely to be parsed once and never re-used by other users. To ensure that SQL is shared, use bind variables and do not use string literals with SQL statements. For example:
Statement with string literals:
SELECT * FROM employees WHERE last_name LIKE 'KING';
Statement with bind variables:
SELECT * FROM employees WHERE last_name LIKE :1;
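In PL/SQL, for example, native dynamic SQL supplies the bind value through the USING clause, so repeated executions share one cursor. A minimal sketch (variable names are illustrative):

DECLARE
  l_name  employees.last_name%TYPE := 'KING';
  l_count INTEGER;
BEGIN
  -- The statement text is constant, so all executions share one cursor;
  -- only the bind value changes between executions.
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM employees WHERE last_name LIKE :1'
    INTO l_count USING l_name;
END;
/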
The following example shows the results of some tests on a simple OLTP application:
Test | #Users Supported |
---|---|
No Parsing all statements | 270 |
Soft Parsing all statements | 150 |
Hard Parsing all statements | 60 |
Re-Connecting for each Transaction | 30 |
These tests were performed on a four-CPU computer. The differences increase as the number of CPUs on the system increases.
This section contains the following topics:
The testing process mainly consists of functional and stability testing. At some point in the process, performance testing is performed.
The following list describes some simple rules for performance testing an application. If correctly documented, then this list provides important information for the production application and the capacity planning process after the application has gone live.
Use the Automatic Database Diagnostic Monitor (ADDM) and SQL Tuning Advisor for design validation
Test with realistic data volumes and distributions
All testing must be done with fully populated tables. The test database should contain data representative of the production system in terms of data volume and cardinality between tables. All the production indexes should be built and the schema statistics should be populated correctly.
Use the correct optimizer mode
Perform all testing with the optimizer mode that you plan to use in production.
Test single-user performance
Test a single user on an idle or lightly-used database for acceptable performance. If a single user cannot achieve acceptable performance under ideal conditions, then multiple users cannot achieve acceptable performance under real conditions.
Obtain and document plans for all SQL statements
Obtain an execution plan for each SQL statement. Use this process to verify that the optimizer is obtaining an optimal execution plan, and that the relative cost of the SQL statement is understood in terms of CPU time and physical I/Os. This process assists in identifying the heavy use transactions that require the most tuning and performance work in the future.
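For example, a minimal sketch assuming the hr sample schema is installed:

EXPLAIN PLAN FOR
  SELECT e.last_name, d.department_name
  FROM   hr.employees e, hr.departments d
  WHERE  e.department_id = d.department_id;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);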
Attempt multiuser testing
This process is difficult to perform accurately, because user workload and profiles might not be fully quantified. However, transactions performing DML statements should be tested to ensure that there are no locking conflicts or serialization problems.
Test with the correct hardware configuration
Test with a configuration as close to the production system as possible. Using a realistic system is particularly important for network latencies, I/O subsystem bandwidth, and processor type and speed. Failing to use this approach may result in an incorrect analysis of potential performance problems.
Measure steady state performance
When benchmarking, it is important to measure the performance under steady state conditions. Each benchmark run should have a ramp-up phase, where users are connected to the application and gradually start performing work on the application. This process allows for frequently cached data to be initialized into the cache and single execution operations—such as parsing—to be completed before the steady state condition. Likewise, after a benchmark run, a ramp-down period is useful so that the system frees resources, and users cease work and disconnect.
When new applications are rolled out, two strategies are commonly adopted:
Big Bang approach - all users migrate to the new system at once
Trickle approach - users slowly migrate from existing systems to the new one
Both approaches have merits and disadvantages. The Big Bang approach relies on reliable testing of the application at the required scale, but has the advantage of minimal data conversion and synchronization with the old system, because it is simply switched off. The Trickle approach allows debugging of scalability issues as the workload increases, but might mean that data must be migrated to and from legacy systems as the transition takes place.
It is difficult to recommend one approach over the other, because each technique has associated risks that could lead to system outages as the transition takes place. Certainly, the Trickle approach allows profiling of real users as they are introduced to the new application, and allows the system to be reconfigured while only affecting the migrated users. This approach affects the work of the early adopters, but limits the load on support services. Thus, unscheduled outages only affect a small percentage of the user population.
The decision on how to roll out a new application is specific to each business. Any adopted approach has its own unique pressures and stresses. The more testing and knowledge that you derive from the testing process, the more you realize what is best for the rollout.
A SQL test case is a set of information that enables a developer to reproduce the execution plan for a specific SQL statement that has encountered a performance problem. SQL Test Case Builder is a tool that automatically gathers information needed to reproduce the problem in a different database instance.
This chapter contains the following topics:
In many cases, a reproducible test case makes it easier to resolve SQL-related problems. SQL Test Case Builder automates the sometimes difficult and time-consuming process of gathering and reproducing as much information as possible about a problem and the environment in which it occurred.
The output of SQL Test Case Builder is a set of scripts in a predefined directory. These scripts contain the commands required to re-create all the necessary objects and the environment. After the test case is ready, you can create a zip file of the directory and move it to another database, or upload the file to Oracle Support.
This section contains the following topics:
In the fault diagnosability infrastructure of Oracle Database, an incident is a single occurrence of a problem. A SQL incident is a SQL-related problem. When a problem (critical error) occurs multiple times, the database creates an incident for each occurrence. Incidents are timestamped and tracked in the Automatic Diagnostic Repository (ADR). Each incident is identified by a numeric incident ID, which is unique within the ADR.
SQL Test Case Builder is accessible any time on the command line. In Oracle Enterprise Manager Cloud Control (Cloud Control), the SQL Test Case pages are only available after a SQL incident is found.
SQL Test Case Builder captures permanent information such as the query being executed, table and index definitions (but not the actual data), PL/SQL packages and program units, optimizer statistics, SQL plan baselines, and initialization parameter settings. Starting in Oracle Database 12c, SQL Test Case Builder also captures and replays transient information, including information only available as part of statement execution.
SQL Test Case Builder supports the following:
Adaptive plans
SQL Test Case Builder captures inputs to the decisions made regarding adaptive plans, and replays them at each decision point (see "Adaptive Plans"). For adaptive plans, the final statistics value at each buffering statistics collector is sufficient to decide on the final plan.
Automatic memory management
The database automatically handles the memory requested for each SQL operation. Actions such as sorting can affect performance significantly. SQL Test Case Builder keeps track of the memory activities, for example, where the database allocated memory and how much it allocated.
Dynamic statistics
Regathering dynamic statistics on a different database does not always generate the same results, for example, when data is missing (see "Dynamic Statistics"). To reproduce the problem, SQL Test Case Builder exports the dynamic statistics result from the source database. In the testing database, SQL Test Case Builder reuses the same values captured from the source database instead of regathering dynamic statistics.
Multiple execution support
SQL Test Case Builder can capture dynamic information accumulated during multiple executions of the query. This capability is important for automatic reoptimization (see "Automatic Reoptimization").
Compilation environment and bind values replay
The compilation environment setting is an important part of the query optimization context. SQL Test Case Builder captures nondefault settings altered by the user when running the problem query in the source database. If any nondefault parameter values are used, SQL Test Case Builder re-establishes the same values before running the query.
Object statistics history
The statistics history for objects is helpful to determine whether a plan change was caused by a change in statistics values. DBMS_STATS stores the history in the data dictionary. SQL Test Case Builder stores this statistics data into a staging table during export. During import, SQL Test Case Builder automatically reloads the statistics history data into the target database from the staging table.
Statement history
The statement history is important for diagnosing problems related to adaptive cursor sharing, statistics feedback, and cursor sharing bugs. The history includes execution plans and compilation and execution statistics.
The output of the SQL Test Case Builder is a set of files that contains the commands required to re-create all the necessary objects and the environment. By default, SQL Test Case Builder stores the files in the following location, where incnum refers to the incident number and runnum refers to the run number:
$ADR_HOME/incident/incdir_incnum/SQLTCB_runnum
For example, a valid output location could be as follows:
$ORACLE_HOME/log/diag/rdbms/dbsa/dbsa/incident/incdir_2657/SQLTCB_1
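To locate the ADR on a given system, you can query the standard V$DIAG_INFO view:

SELECT name, value
FROM   v$diag_info
WHERE  name IN ('ADR Base', 'ADR Home');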
You can specify a nondefault location by creating an Oracle directory object and invoking DBMS_SQLDIAG.EXPORT_SQL_TESTCASE, as in the following example:
CREATE OR REPLACE DIRECTORY my_tcb_dir_exp AS '/tmp';

DECLARE
  tco CLOB;  -- receives the test case metadata (OUT parameter)
BEGIN
  DBMS_SQLDIAG.EXPORT_SQL_TESTCASE (
    directory => 'MY_TCB_DIR_EXP'
  , sql_text  => 'SELECT COUNT(*) FROM sales'
  , testcase  => tco );
END;
/
You can access SQL Test Case Builder either through Cloud Control or using PL/SQL on the command line.
Within Cloud Control, you can access SQL Test Case Builder from the Incident Manager page or the Support Workbench page.
This task explains how to navigate to the Incident Manager from the Incidents and Problems section on the Database Home page.
To access the Incident Manager:
Access the Database Home page, as described in "Accessing the Database Home Page in Cloud Control."
In the Incidents and Problems section, locate the SQL incident to be investigated.
In the following example, the ORA-600 error is a SQL incident.
Click the summary of the incident.
The Problem Details page of the Incident Manager appears.
This task explains how to navigate to the Incident Manager from the Oracle Database menu.
To access the Support Workbench:
Access the Database Home page, as described in "Accessing the Database Home Page in Cloud Control."
From the Oracle Database menu, select Diagnostics, then Support Workbench.
The Support Workbench page appears, with the incidents listed in a table.
See Also: Online help for Cloud Control
You can use the DBMS_SQLDIAG package to perform tasks relating to SQL Test Case Builder. This package consists of various subprograms for the SQL Test Case Builder, some of which are listed in Table 17-1.
Table 17-1 SQL Test Case Functions in DBMS_SQLDIAG
Procedure | Description |
---|---|
EXPORT_SQL_TESTCASE | Exports a SQL test case to a user-specified directory |
EXPORT_SQL_TESTCASE_DIR_BY_INC | Exports a SQL test case corresponding to the incident ID passed as an argument |
EXPORT_SQL_TESTCASE_DIR_BY_TXT | Exports a SQL test case corresponding to the SQL text passed as an argument |
IMPORT_SQL_TESTCASE | Imports a SQL test case into a schema |
See Also: Oracle Database PL/SQL Packages and Types Reference to learn more about the DBMS_SQLDIAG package
This tutorial explains how to run SQL Test Case Builder using Cloud Control.
Assumptions
This tutorial assumes the following:
You ran the following EXPLAIN PLAN statement as user sh, which causes an internal error:
EXPLAIN PLAN FOR
  SELECT unit_cost, sold
  FROM   costs c,
         ( SELECT /*+ merge */ p.prod_id, SUM(quantity_sold) AS sold
           FROM   products p, sales s
           WHERE  p.prod_id = s.prod_id
           GROUP BY p.prod_id ) v
  WHERE  c.prod_id = v.prod_id;
In the Incidents and Problems section on the Database Home page, a SQL incident generated by the internal error appears.
To run SQL Test Case Builder:
Access the Incident Details page, as explained in "Accessing the Incident Manager".
Click the Incidents tab.
The Problem Details page appears.
Click the summary for the incident.
The Incident Details page appears.
In Guided Resolution, click View Diagnostic Data.
The Incident Details: incident_number page appears.
In the Application Information section, click Additional Diagnostics.
The Additional Diagnostics subpage appears.
Select SQL Test Case Builder, and then click Run.
The Run User Action page appears.
Select a sampling percentage (optional), and then click Submit.
After processing completes, the Confirmation page appears.
Access the SQL Test Case files in the location described in "Output of SQL Test Case Builder".
See Also: Online help for Cloud Control
This chapter contains the following topics:
A SQL profile is a database object that contains auxiliary statistics specific to a SQL statement. Conceptually, a SQL profile is to a SQL statement what object-level statistics are to a table or index. SQL profiles are created when a DBA invokes SQL Tuning Advisor (see "About SQL Tuning Advisor").
This section contains the following topics:
When profiling a SQL statement, SQL Tuning Advisor uses a specific set of bind values as input, and then compares the optimizer estimate with values obtained by executing fragments of the statement on a data sample. When significant variances are found, SQL Tuning Advisor bundles corrective actions together in a SQL profile, and then recommends its acceptance.
The corrected statistics in a SQL profile can improve optimizer cardinality estimates, which in turn leads the optimizer to select better plans. SQL profiles provide the following benefits over other techniques for improving plans:
Unlike hints and stored outlines, SQL profiles do not tie the optimizer to a specific plan or subplan. SQL profiles fix incorrect estimates while giving the optimizer the flexibility to pick the best plan in different situations.
Unlike hints, no changes to application source code are necessary when using SQL profiles. The use of SQL profiles by the database is transparent to the user.
A SQL profile is a collection of auxiliary statistics on a query, including all tables and columns referenced in the query. The profile stores this information in the data dictionary. The optimizer uses this information at optimization time to determine the correct plan.
Note: The SQL profile contains supplemental statistics for the entire statement, not individual plans. The profile does not itself determine a specific plan.
A SQL profile contains, among other statistics, a set of cardinality adjustments. The cardinality measure is based on sampling the WHERE clause rather than on statistical projection. A profile uses parts of the query to determine whether the estimated cardinalities are close to the actual cardinalities and, if a mismatch exists, uses the corrected cardinalities. For example, if a SQL profile exists for SELECT * FROM t WHERE x=5 AND y=10, then the profile stores the actual number of rows returned.
When choosing plans, the optimizer has the following sources of information:
The environment, which contains the database configuration, bind variable values, optimizer statistics, data set, and so on
The supplemental statistics in the SQL profile
Figure 22-1 shows the relationship between a SQL statement and the SQL profile for this statement. The optimizer uses the SQL profile and the environment to generate an execution plan. In this example, the plan is in the SQL plan baseline for the statement.
If either the optimizer environment or the SQL profile changes, then the optimizer can create a new plan. As tables grow, or as indexes are created or dropped, the plan for a statement with a SQL profile can change. The profile continues to be relevant even if the data distribution or access path of the corresponding statement changes. In general, you do not need to refresh SQL profiles.
Over time, profile content can become outdated, and the performance of the SQL statement may degrade. The statement may appear as high-load or top SQL, in which case the Automatic SQL Tuning task again captures it as high-load SQL. You can then implement a new SQL profile for the statement.
Internally, a SQL profile is implemented using hints that address different types of problems. These hints do not specify any particular plan. Rather, the hints correct errors in the optimizer estimation algorithm that lead to suboptimal plans. For example, a profile may use the TABLE_STATS hint to set object statistics for tables when the statistics are missing or stale.
As explained in "SQL Profiling", SQL Tuning Advisor invokes Automatic Tuning Optimizer to generate SQL profile recommendations. Recommendations to implement SQL profiles occur in a finding, which appears in a separate section of the SQL Tuning Advisor report.
When you implement (or accept) a SQL profile, the database creates the profile and stores it persistently in the data dictionary. However, the SQL profile information is not exposed through regular dictionary views.
Example 22-1 SQL Profile Recommendation
In this example, the database found a better plan for a SELECT statement that uses several expensive joins. The database recommends running DBMS_SQLTUNE.ACCEPT_SQL_PROFILE to implement the profile, which enables the statement to run 98.53% faster.
-------------------------------------------------------------------------------
FINDINGS SECTION (2 findings)
-------------------------------------------------------------------------------

1- SQL Profile Finding (see explain plans section below)
--------------------------------------------------------
  A potentially better execution plan was found for this statement. Choose
  one of the following SQL profiles to implement.

  Recommendation (estimated benefit: 99.45%)
  ------------------------------------------
  - Consider accepting the recommended SQL profile.
    execute dbms_sqltune.accept_sql_profile(task_name => 'my_task',
            object_id => 3, task_owner => 'SH', replace => TRUE);

  Validation results
  ------------------
  The SQL profile was tested by executing both its plan and the original plan
  and measuring their respective execution statistics. A plan may have been
  only partially executed if the other could be run to completion in less time.

                         Original Plan  With SQL Profile  % Improved
                         -------------  ----------------  ----------
  Completion Status:           PARTIAL          COMPLETE
  Elapsed Time(us):           15467783            226902     98.53 %
  CPU Time(us):               15336668            226965     98.52 %
  User I/O Time(us):                 0                 0
  Buffer Gets:                 3375243             18227     99.45 %
  Disk Reads:                        0                 0
  Direct Writes:                     0                 0
  Rows Processed:                    0               109
  Fetches:                           0               109
  Executions:                        0                 1

  Notes
  -----
  1. The SQL profile plan was first executed to warm the buffer cache.
  2. Statistics for the SQL profile plan were averaged over next 3 executions.
Sometimes SQL Tuning Advisor may recommend implementing a profile that uses the Automatic Degree of Parallelism (Auto DOP) feature. A parallel query profile is only recommended when the original plan is serial and when parallel execution can significantly reduce the elapsed time for a long-running query.
When it recommends a profile that uses Auto DOP, SQL Tuning Advisor gives details about the performance overhead of using parallel execution for the SQL statement in the report. For parallel execution recommendations, SQL Tuning Advisor may provide two SQL profile recommendations, one using serial execution and one using parallel.
The following example shows a parallel query recommendation. In this example, a degree of parallelism of 7 improves response time significantly at the cost of increasing resource consumption by almost 25%. You must decide whether the reduction in database throughput is worth the increase in response time.
Recommendation (estimated benefit: 99.99%)
------------------------------------------
- Consider accepting the recommended SQL profile to use parallel execution
  for this statement.
    execute dbms_sqltune.accept_sql_profile(task_name => 'gfk_task',
            object_id => 3, task_owner => 'SH', replace => TRUE,
            profile_type => DBMS_SQLTUNE.PX_PROFILE);

  Executing this query parallel with DOP 7 will improve its response time
  82.22% over the SQL profile plan. However, there is some cost in enabling
  parallel execution. It will increase the statement's resource consumption by
  an estimated 24.43% which may result in a reduction of system throughput.
  Also, because these resources are consumed over a much smaller duration, the
  response time of concurrent statements might be negatively impacted if
  sufficient hardware capacity is not available.

  The following data shows some sampled statistics for this SQL from the past
  week and projected weekly values when parallel execution is enabled.

  Past week sampled statistics for this SQL
  -----------------------------------------
  Number of executions                                0
  Percent of total activity                           .29
  Percent of samples with #Active Sessions > 2*CPU    0
  Weekly DB time (in sec)                             76.51

  Projected statistics with Parallel Execution
  --------------------------------------------
  Weekly DB time (in sec)                             95.21
You can use SQL profiles with or without SQL plan management. No strict relationship exists between the SQL profile and the plan baseline. If a statement has multiple plans in a SQL plan baseline, then a SQL profile is useful because it enables the optimizer to choose the lowest-cost plan in the baseline.
Oracle Enterprise Manager Cloud Control (Cloud Control) usually handles SQL profiles as part of automatic SQL tuning.
On the command line, you can manage SQL profiles with the DBMS_SQLTUNE package. To use the APIs, you must have the ADMINISTER SQL MANAGEMENT OBJECT privilege.
This section explains the basic tasks involved in managing SQL profiles. Figure 22-2 shows the basic workflow for implementing, altering, and dropping SQL profiles.
Typically, you manage SQL profiles in the following sequence:
Implement a recommended SQL profile.
"Implementing a SQL Profile" describes this task.
Obtain information about SQL profiles stored in the database.
"Listing SQL Profiles" describes this task.
Optionally, modify the implemented SQL profile.
"Altering a SQL Profile" describes this task.
Drop the implemented SQL profile when it is no longer needed.
"Dropping a SQL Profile" describes this task.
To tune SQL statements on another database, you can transport both a SQL tuning set and a SQL profile to a separate database. "Transporting a SQL Profile" describes this task.
See Also: Oracle Database PL/SQL Packages and Types Reference for information about the DBMS_SQLTUNE package
Implementing (also known as accepting) a SQL profile means storing it persistently in the database. A profile must be implemented before the optimizer can use it as input when generating plans.
As a rule of thumb, implement a SQL profile recommended by SQL Tuning Advisor. If the database recommends both an index and a SQL profile, then either use both or use the SQL profile only. If you create an index, then the optimizer may need the profile to pick the new index.
In some situations, SQL Tuning Advisor may find an improved serial plan in addition to an even better parallel plan. In this case, the advisor recommends both a standard and a parallel SQL profile, enabling you to choose between the best serial and best parallel plan for the statement. Implement a parallel plan only if the increase in response time is worth the decrease in throughput.
To implement a SQL profile, execute the DBMS_SQLTUNE.ACCEPT_SQL_PROFILE procedure. Some important parameters are as follows:
profile_type

Set this parameter to REGULAR_PROFILE for a SQL profile without a change to parallel execution, or PX_PROFILE for a SQL profile with a change to parallel execution.
force_match

This parameter controls statement matching. Typically, an accepted SQL profile is associated with the SQL statement through a SQL signature that is generated using a hash function. This hash function changes the SQL statement to uppercase and removes all extra white space before generating the signature. Thus, the same SQL profile works for all SQL statements in which the only differences are case and white space.

By setting force_match to true, the SQL profile additionally targets all SQL statements that have the same text after the literal values in the WHERE clause have been replaced by bind variables. This setting may be useful for applications that use only literal values, because it enables SQL statements that differ only in their literal values to share a SQL profile. If both literal values and bind variables are in the SQL text, or if force_match is set to false (the default), then the literal values in the WHERE clause are not replaced by bind variables.
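For example, the following two statements (sh sample schema; the literals are illustrative) differ only in a WHERE clause literal, so with force_match set to true a single profile could apply to both:

-- With force_match => TRUE, both statements normalize to the same signature
-- after literal replacement, so one SQL profile covers both.
SELECT promo_name FROM sh.promotions WHERE promo_id = 33;
SELECT promo_name FROM sh.promotions WHERE promo_id = 350;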
See Also: Oracle Database PL/SQL Packages and Types Reference for information about the ACCEPT_SQL_PROFILE procedure
This section shows how to use the ACCEPT_SQL_PROFILE procedure to implement a SQL profile.
Assumptions
This tutorial assumes the following:
The SQL Tuning Advisor task STA_SPECIFIC_EMP_TASK includes a recommendation to create a SQL profile.

The name of the SQL profile is my_sql_profile.

The PL/SQL block accepts a profile that uses parallel execution (profile_type => DBMS_SQLTUNE.PX_PROFILE).

The profile uses force matching.
To implement a SQL profile:
Connect SQL*Plus to the database with the appropriate privileges, and then execute the ACCEPT_SQL_PROFILE function.
For example, execute the following PL/SQL:
DECLARE
  my_sqlprofile_name VARCHAR2(30);
BEGIN
  my_sqlprofile_name := DBMS_SQLTUNE.ACCEPT_SQL_PROFILE (
    task_name    => 'STA_SPECIFIC_EMP_TASK'
  , name         => 'my_sql_profile'
  , profile_type => DBMS_SQLTUNE.PX_PROFILE
  , force_match  => true );
END;
/
See Also: Oracle Database PL/SQL Packages and Types Reference to learn about the DBMS_SQLTUNE.ACCEPT_SQL_PROFILE procedure
The data dictionary view DBA_SQL_PROFILES stores SQL profiles persistently in the database. The statistics are in an Oracle internal format, so you cannot query profiles directly. However, you can list profiles.
To list SQL profiles:
Connect SQL*Plus to the database with the appropriate privileges, and then query the DBA_SQL_PROFILES view.
For example, execute the following query:
COLUMN category FORMAT a10
COLUMN sql_text FORMAT a20

SELECT NAME, SQL_TEXT, CATEGORY, STATUS
FROM   DBA_SQL_PROFILES;
Sample output appears below:
NAME                           SQL_TEXT             CATEGORY   STATUS
------------------------------ -------------------- ---------- --------
SYS_SQLPROF_01285f6d18eb0000   select promo_name, c DEFAULT    ENABLED
                               ount(*) c from promo
                               tions p, sales s whe
                               re s.promo_id = p.pr
                               omo_id and p.promo_c
                               ategory = 'internet'
                               group by p.promo_na
                               me order by c desc
You can alter attributes of an existing SQL profile using the attribute_name parameter of the ALTER_SQL_PROFILE procedure.
The CATEGORY attribute determines which sessions can apply a profile. View the CATEGORY attribute by querying DBA_SQL_PROFILES.CATEGORY. By default, all profiles are in the DEFAULT category, which means that all sessions in which the SQLTUNE_CATEGORY initialization parameter is set to DEFAULT can use the profile.
By altering the category of a SQL profile, you determine which sessions are affected by profile creation. For example, by setting the category to DEV, only sessions in which the SQLTUNE_CATEGORY initialization parameter is set to DEV can use the profile. Other sessions do not have access to the SQL profile, and execution plans for their SQL statements are not impacted by it. This technique enables you to test a profile in a restricted environment before making it available to other sessions.
The example in this section assumes that you want to change the category of the SQL profile so that it is used only by sessions with the SQL profile category set to TEST, run the SQL statement, and then change the profile category back to DEFAULT.
To alter a SQL profile:
Connect SQL*Plus to the database with the appropriate privileges, and then use the ALTER_SQL_PROFILE procedure to set the attribute_name.

For example, execute the following code to set the attribute CATEGORY to TEST:
VARIABLE pname VARCHAR2(30)
EXEC :pname := 'my_sql_profile'

BEGIN
  DBMS_SQLTUNE.ALTER_SQL_PROFILE (
    name           => :pname
  , attribute_name => 'CATEGORY'
  , value          => 'TEST' );
END;
/
Change the initialization parameter setting in the current database session.
For example, execute the following SQL:
ALTER SESSION SET SQLTUNE_CATEGORY = 'TEST';
Test the profiled SQL statement.
Use the ALTER_SQL_PROFILE procedure to set the attribute_name.

For example, execute the following code to set the attribute CATEGORY back to DEFAULT:
VARIABLE pname VARCHAR2(30)
EXEC :pname := 'my_sql_profile'

BEGIN
  DBMS_SQLTUNE.ALTER_SQL_PROFILE (
    name           => :pname
  , attribute_name => 'CATEGORY'
  , value          => 'DEFAULT' );
END;
/
You can drop a SQL profile with the DROP_SQL_PROFILE procedure.
Assumptions
This section assumes the following:
You want to drop my_sql_profile.
You want to ignore errors raised if the name does not exist.
To drop a SQL profile:
Connect SQL*Plus to the database with the appropriate privileges, and then call the DBMS_SQLTUNE.DROP_SQL_PROFILE procedure.

The following example drops the profile named my_sql_profile:
BEGIN
  DBMS_SQLTUNE.DROP_SQL_PROFILE (
    name => 'my_sql_profile' );
END;
/
You can transport SQL profiles. This operation involves exporting the SQL profile from the SYS schema in one database to a staging table, and then importing the SQL profile from the staging table into another database. You can transport a SQL profile to any Oracle database created in the same release or later.
Table 22-1 shows the main procedures and functions for managing SQL profiles.
Table 22-1 APIs for Transporting SQL Profiles
Procedure or Function | Description |
---|---|
CREATE_STGTAB_SQLPROF | Creates the staging table used for copying SQL profiles from one system to another. |
PACK_STGTAB_SQLPROF | Moves profile data out of the SYS schema into the staging table. |
UNPACK_STGTAB_SQLPROF | Uses the profile data stored in the staging table to create profiles on this system. |
The following graphic shows the basic workflow of transporting SQL profiles:
Assumptions
This tutorial assumes the following:
You want to transport my_profile from a production database to a test database.

You want to create the staging table in the dba1 schema.
To transport a SQL profile:
Connect SQL*Plus to the database with the appropriate privileges, and then use the CREATE_STGTAB_SQLPROF procedure to create a staging table to hold the SQL profiles.

The following example creates my_staging_table in the dba1 schema:
BEGIN
  DBMS_SQLTUNE.CREATE_STGTAB_SQLPROF (
    table_name  => 'my_staging_table'
  , schema_name => 'dba1' );
END;
/
Use the PACK_STGTAB_SQLPROF procedure to export SQL profiles into the staging table.

The following example populates dba1.my_staging_table with the SQL profile my_profile:
BEGIN
  DBMS_SQLTUNE.PACK_STGTAB_SQLPROF (
    profile_name         => 'my_profile'
  , staging_table_name   => 'my_staging_table'
  , staging_schema_owner => 'dba1' );
END;
/
Move the staging table to the database where you plan to unpack the SQL profiles.
Move the table using your utility of choice. For example, use Oracle Data Pump or a database link.
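For example, a minimal sketch using a database link; prod_db_link is a hypothetical link name from the test database to the production database:

-- Copy the staging table over a database link.
CREATE TABLE dba1.my_staging_table AS
  SELECT * FROM dba1.my_staging_table@prod_db_link;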
On the database where you plan to import the SQL profiles, use UNPACK_STGTAB_SQLPROF to unpack SQL profiles from the staging table.

The following example shows how to unpack SQL profiles in the staging table:
BEGIN
  DBMS_SQLTUNE.UNPACK_STGTAB_SQLPROF (
    replace            => true
  , staging_table_name => 'my_staging_table' );
END;
/
This chapter describes the most important concepts relating to the query optimizer. This chapter contains the following topics:
The query optimizer (called simply the optimizer) is built-in database software that determines the most efficient method for a SQL statement to access requested data.
This section contains the following topics:
The optimizer attempts to generate the best execution plan for a SQL statement. The best execution plan is defined as the plan with the lowest cost among all considered candidate plans. The cost computation accounts for factors of query execution such as I/O, CPU, and communication.
The best method of execution depends on myriad conditions including how the query is written, the size of the data set, the layout of the data, and which access structures exist. The optimizer determines the best plan for a SQL statement by examining multiple access methods, such as full table scan or index scans, and different join methods such as nested loops and hash joins.
Because the database has many internal statistics and tools at its disposal, the optimizer is usually in a better position than the user to determine the best method of statement execution. For this reason, all SQL statements use the optimizer.
Consider a user who queries records for employees who are managers. If the database statistics indicate that 80% of employees are managers, then the optimizer may decide that a full table scan is most efficient. However, if statistics indicate that few employees are managers, then reading an index followed by a table access by rowid may be more efficient than a full table scan.
Query optimization is the overall process of choosing the most efficient means of executing a SQL statement. SQL is a nonprocedural language, so the optimizer is free to merge, reorganize, and process a query in any order.
The database optimizes each SQL statement based on statistics collected about the accessed data. When generating execution plans, the optimizer considers different access paths and join methods. Factors considered by the optimizer include:
System resources, which includes I/O, CPU, and memory
Number of rows returned
Size of the initial data sets
The cost is a number that represents the estimated resource usage for an execution plan. The optimizer assigns a cost to each possible plan, and then chooses the plan with the lowest cost. For this reason, the optimizer is sometimes called the cost-based optimizer (CBO) to contrast it with the legacy rule-based optimizer (RBO).
Note: The optimizer may not make the same decisions from one version of Oracle Database to the next. In recent versions, the optimizer might make different decisions because better information is available and more optimizer transformations are possible.
An execution plan describes a recommended method of execution for a SQL statement. The plan shows the combination of steps Oracle Database uses to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them for the user issuing the statement.
In Figure 4-1, the optimizer generates two possible execution plans for an input SQL statement, uses statistics to calculate their costs, compares their costs, and chooses the plan with the lowest cost.
As shown in Figure 4-1, the input to the optimizer is a parsed representation of a SQL statement. Each SELECT block in the original SQL statement is represented internally by a query block. A query block can be a top-level statement, subquery, or unmerged view (see "View Merging").
In Example 4-1, the SQL statement consists of two query blocks. The subquery in parentheses is the inner query block. The outer query block, which is the rest of the SQL statement, retrieves names of employees in the departments whose IDs were supplied by the subquery.
Example 4-1 Query Blocks
SELECT first_name, last_name
FROM   hr.employees
WHERE  department_id IN
       (SELECT department_id
        FROM   hr.departments
        WHERE  location_id = 1800);
The query form determines how query blocks are interrelated.
For each query block, the optimizer generates a query subplan. The database optimizes query blocks separately from the bottom up. Thus, the database optimizes the innermost query block first and generates a subplan for it, and then generates the outer query block representing the entire query.
The number of possible plans for a query block is proportional to the number of objects in the FROM clause. This number rises exponentially with the number of objects. For example, considering join orders alone, two tables can be joined in 2! = 2 orders, whereas five tables can be joined in 5! = 120 orders, and each order must then be combined with the available access paths and join methods. Consequently, the possible plans for a join of five tables are significantly more numerous than the possible plans for a join of two tables.
One analogy for the optimizer is an online trip advisor. A cyclist wants to know the most efficient bicycle route from point A to point B. A query is like the directive "I need the most efficient route from point A to point B" or "I need the most efficient route from point A to point B by way of point C." The trip advisor uses an internal algorithm, which relies on factors such as speed and difficulty, to determine the most efficient route. The cyclist can influence the trip advisor's decision by using directives such as "I want to arrive as fast as possible" or "I want the easiest ride possible."
In this analogy, an execution plan is a possible route generated by the trip advisor. Internally, the advisor may divide the overall route into several subroutes (subplans), and calculate the efficiency for each subroute separately. For example, the trip advisor may estimate one subroute at 15 minutes with medium difficulty, an alternative subroute at 22 minutes with minimal difficulty, and so on.
The advisor picks the most efficient (lowest cost) overall route based on user-specified goals and the available statistics about roads and traffic conditions. The more accurate the statistics, the better the advice. For example, if the advisor is not frequently notified of traffic jams, road closures, and poor road conditions, then the recommended route may turn out to be inefficient (high cost).
The optimizer contains three main components, which are shown in Figure 4-2.
A set of query blocks represents a parsed query, which is the input to the optimizer. The optimizer performs the following operations:
Query transformer
The optimizer determines whether it is helpful to change the form of the query so that the optimizer can generate a better execution plan. See "Query Transformer".
Estimator
The optimizer estimates the cost of each plan based on statistics in the data dictionary. See "Estimator".
Plan Generator
The optimizer compares the costs of plans and chooses the lowest-cost plan, known as the execution plan, to pass to the row source generator. See "Plan Generator".
For some statements, the query transformer determines whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement with a lower cost. When a viable alternative exists, the database calculates the cost of the alternatives separately and chooses the lowest-cost alternative. Chapter 5, "Query Transformations" describes the different types of optimizer transformations.
Figure 4-3 shows the query transformer rewriting an input query that uses OR into an output query that uses UNION ALL.
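The following sketch, using the hr sample schema, shows the general shape of such a rewrite; the LNNVL predicate in the second branch is how duplicate rows that already satisfied the first branch are avoided:

-- Input query using OR
SELECT * FROM hr.employees
WHERE  department_id = 50 OR job_id = 'SH_CLERK';

-- A semantically equivalent UNION ALL form the transformer may consider
SELECT * FROM hr.employees
WHERE  department_id = 50
UNION ALL
SELECT * FROM hr.employees
WHERE  job_id = 'SH_CLERK'
AND    LNNVL(department_id = 50);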
The estimator is the component of the optimizer that determines the overall cost of a given execution plan. The estimator uses three different types of measures to achieve this goal:
Selectivity

The percentage of rows in the row set that the query selects, with 0 meaning no rows and 1 meaning all rows. Selectivity is tied to a query predicate, such as WHERE last_name LIKE 'A%', or a combination of predicates. A predicate becomes more selective as the selectivity value approaches 0 and less selective (or more unselective) as the value approaches 1.
Note: Selectivity is an internal calculation that is not visible in the execution plans.
Cardinality

The cardinality is the estimated number of rows returned by each operation in an execution plan. This input, which is crucial to obtaining an optimal plan, is common to all cost functions. Cardinality can be derived from the table statistics collected by DBMS_STATS, or derived after accounting for effects from predicates (filter, join, and so on), DISTINCT or GROUP BY operations, and so on.
Cost

This measure represents units of work or resource used. The query optimizer uses disk I/O, CPU usage, and memory usage as units of work.
As shown in Figure 4-4, if statistics are available, then the estimator uses them to compute the measures. The statistics improve the degree of accuracy of the measures.
For the query shown in Example 4-1, the estimator uses selectivity, cardinality, and cost measures to produce its total cost estimate of 3:
----------------------------------------------------------------------------------
| Id| Operation |Name |Rows|Bytes|Cost(%CPU)| Time|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10| 250| 3 (0)| 00:00:01|
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 10| 250| 3 (0)| 00:00:01|
|*3 | TABLE ACCESS FULL |DEPARTMENTS | 1| 7| 2 (0)| 00:00:01|
|*4 | INDEX RANGE SCAN |EMP_DEPARTMENT_IX| 10| | 0 (0)| 00:00:01|
| 5 | TABLE ACCESS BY INDEX ROWID|EMPLOYEES | 10| 180| 1 (0)| 00:00:01|
----------------------------------------------------------------------------------
The selectivity represents a fraction of rows from a row set. The row set can be a base table, a view, or the result of a join. The selectivity is tied to a query predicate, such as last_name = 'Smith', or a combination of predicates, such as last_name = 'Smith' AND job_id = 'SH_CLERK'.
Note: Selectivity is an internal calculation that is not visible in execution plans.
A predicate filters a specific number of rows from a row set. Thus, the selectivity of a predicate indicates how many rows pass the predicate test. Selectivity ranges from 0.0 to 1.0. A selectivity of 0.0 means that no rows are selected from a row set, whereas a selectivity of 1.0 means that all rows are selected. A predicate becomes more selective as the value approaches 0.0 and less selective (or more unselective) as the value approaches 1.0.
The optimizer estimates selectivity depending on whether statistics are available:
Statistics not available
Depending on the value of the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter, the optimizer either uses dynamic statistics or an internal default value. The database uses different internal defaults depending on the predicate type. For example, the internal default for an equality predicate (last_name = 'Smith') is lower than for a range predicate (last_name > 'Smith') because an equality predicate is expected to return a smaller fraction of rows.
Statistics available
When statistics are available, the estimator uses them to estimate selectivity. Assume there are 150 distinct employee last names. For an equality predicate last_name = 'Smith', selectivity is the reciprocal of the number n of distinct values of last_name, which in this example is 1/150 (approximately .0067) because the query selects rows that contain 1 out of 150 distinct values.
If a histogram exists on the last_name column, then the estimator uses the histogram instead of the number of distinct values. The histogram captures the distribution of different values in a column, so it yields better selectivity estimates, especially for columns that have data skew. See Chapter 11, "Histograms."
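You can inspect the inputs to this estimate in the data dictionary. A minimal sketch, assuming the hr sample schema and current statistics:

-- NUM_DISTINCT drives the 1/n selectivity estimate for an equality predicate
SELECT column_name,
       num_distinct,
       ROUND(1 / num_distinct, 4) AS equality_selectivity
FROM   all_tab_col_statistics
WHERE  owner = 'HR'
AND    table_name = 'EMPLOYEES'
AND    column_name = 'LAST_NAME';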
The cardinality is the estimated number of rows returned by each operation in an execution plan. For example, if the optimizer estimate for the number of rows returned by a full table scan is 100, then the cardinality for this operation is 100. The cardinality value appears in the Rows column of the execution plan.
The optimizer determines the cardinality for each operation based on a complex set of formulas that use both table and column level statistics, or dynamic statistics, as input. The optimizer uses one of the simplest formulas when a single equality predicate appears in a single-table query, with no histogram. In this case, the optimizer assumes a uniform distribution and calculates the cardinality for the query by dividing the total number of rows in the table by the number of distinct values in the column used in the WHERE clause predicate.
For example, user hr queries the employees table as follows:
SELECT first_name, last_name FROM employees WHERE salary='10200';
The employees table contains 107 rows. The current database statistics indicate that the number of distinct values in the salary column is 58. Using the formula 107/58 = 1.84, and rounding up, the optimizer estimates the cardinality of the result set as 2.
Cardinality estimates must be as accurate as possible because they influence all aspects of the execution plan. Cardinality is important when the optimizer determines the cost of a join. For example, in a nested loops join of the employees and departments tables, the number of rows in employees determines how often the database must probe the departments table. Cardinality is also important for determining the cost of sorts.
The optimizer cost model accounts for the I/O, CPU, and network resources that a query is predicted to use. The cost is an internal numeric measure that represents the estimated resource usage for a plan. The lower the cost, the more efficient the plan.
The execution plan displays the cost of the entire plan, which is indicated on line 0, and of each individual operation. For example, the following plan shows an overall cost of 14.
EXPLAINED SQL STATEMENT:
------------------------
SELECT prod_category, AVG(amount_sold) FROM sales s, products p
WHERE p.prod_id = s.prod_id GROUP BY prod_category

Plan hash value: 4073170114

----------------------------------------------------------------------
| Id  | Operation                 | Name                 | Cost (%CPU)|
----------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                      |    14 (100)|
|   1 |  HASH GROUP BY            |                      |    14  (22)|
|   2 |   HASH JOIN               |                      |    13  (16)|
|   3 |    VIEW                   | index$_join$_002     |     7  (15)|
|   4 |     HASH JOIN             |                      |            |
|   5 |      INDEX FAST FULL SCAN | PRODUCTS_PK          |     4   (0)|
|   6 |      INDEX FAST FULL SCAN | PRODUCTS_PROD_CAT_IX |     4   (0)|
|   7 |    PARTITION RANGE ALL    |                      |     5   (0)|
|   8 |     TABLE ACCESS FULL     | SALES                |     5   (0)|
----------------------------------------------------------------------
The cost is an internal unit that you can use for plan comparisons. You cannot tune or change it.
The access path determines the number of units of work required to get data from a base table. The access path can be a table scan, a fast full index scan, or an index scan.
Table scan or fast full index scan
During a table scan or fast full index scan, the database reads multiple blocks from disk in a single I/O. Therefore, the cost of the scan depends on the number of blocks to be scanned and the multiblock read count value.
Index scan
The cost of an index scan depends on the levels in the B-tree, the number of index leaf blocks to be scanned, and the number of rows to be fetched using the rowid in the index keys. The cost of fetching rows using rowids depends on the index clustering factor.
The join cost represents the combination of the individual access costs of the two row sets being joined, plus the cost of the join operation.
The plan generator explores various plans for a query block by trying out different access paths, join methods, and join orders. Many plans are possible because of the various combinations that the database can use to produce the same result. The optimizer picks the plan with the lowest cost.
Figure 4-5 shows the optimizer testing different plans for an input query.
The following snippet from an optimizer trace file shows some computations that the optimizer performs:
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]:  DEPARTMENTS[D]#0  EMPLOYEES[E]#1

***************
Now joining: EMPLOYEES[E]#1
***************
NL Join
  Outer table: Card: 27.00  Cost: 2.01  Resp: 2.01  Degree: 1  Bytes: 16
Access path analysis for EMPLOYEES
. . .
  Best NL cost: 13.17
. . .
SM Join
  SM cost: 6.08
     resc: 6.08  resc_io: 4.00  resc_cpu: 2501688
     resp: 6.08  resp_io: 4.00  resp_cpu: 2501688
. . .
SM Join (with index on outer)
  Access Path: index (FullScan)
. . .
HA Join
  HA cost: 4.57
     resc: 4.57  resc_io: 4.00  resc_cpu: 678154
     resp: 4.57  resp_io: 4.00  resp_cpu: 678154
Best:: JoinMethod: Hash
       Cost: 4.57  Degree: 1  Resp: 4.57  Card: 106.00  Bytes: 27
. . .
***********************
Join order[2]:  EMPLOYEES[E]#1  DEPARTMENTS[D]#0
. . .
***************
Now joining: DEPARTMENTS[D]#0
***************
. . .
HA Join
  HA cost: 4.58
     resc: 4.58  resc_io: 4.00  resc_cpu: 690054
     resp: 4.58  resp_io: 4.00  resp_cpu: 690054
Join order aborted: cost > best plan cost
***********************
The trace file shows the optimizer first trying the departments table as the outer table in the join. The optimizer calculates the cost for three different join methods: nested loops join (NL), sort merge join (SM), and hash join (HA). The optimizer picks the hash join as the most efficient method:
Best:: JoinMethod: Hash
Cost: 4.57 Degree: 1 Resp: 4.57 Card: 106.00 Bytes: 27
The optimizer then tries a different join order, using employees as the outer table. This join order costs more than the previous join order, so it is abandoned.
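A trace like the one above can be produced with the 10053 diagnostic event; the following is a sketch for a test session, with the hr schema query purely illustrative (the trace file appears in the ADR trace directory):

ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

-- Run the statement to be analyzed; the trace is written only when the
-- statement is hard parsed, so use new SQL text or flush the cursor first.
SELECT e.last_name, d.department_name
FROM   hr.employees e, hr.departments d
WHERE  e.department_id = d.department_id;

ALTER SESSION SET EVENTS '10053 trace name context off';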
The optimizer uses an internal cutoff to reduce the number of plans it tries when finding the lowest-cost plan. The cutoff is based on the cost of the current best plan. If the current best cost is large, then the optimizer explores alternative plans to find a lower cost plan. If the current best cost is small, then the optimizer ends the search swiftly because further cost improvement is not significant.
The optimizer performs different operations depending on how it is invoked. The database provides the following types of optimization:
Normal optimization
The optimizer compiles the SQL and generates an execution plan. The normal mode generates a reasonable plan for most SQL statements. Under normal mode, the optimizer operates with strict time constraints, usually a fraction of a second, during which it must find an optimal plan.
SQL Tuning Advisor optimization
When SQL Tuning Advisor invokes the optimizer, the optimizer is known as Automatic Tuning Optimizer. In this case, the optimizer performs additional analysis to further improve the plan produced in normal mode. The optimizer output is not an execution plan, but a series of actions, along with their rationale and expected benefit for producing a significantly better plan.
In Oracle Database, adaptive query optimization is a set of capabilities that enables the optimizer to make run-time adjustments to execution plans and discover additional information that can lead to better statistics. Adaptive optimization is helpful when existing statistics are not sufficient to generate an optimal plan.
The following graphic shows the feature set for adaptive query optimization:
An adaptive plan enables the optimizer to defer the final plan decision for a statement until execution time. The ability of the optimizer to adapt a plan, based on information learned during execution, can greatly improve query performance.
Adaptive plans are useful because the optimizer occasionally picks a suboptimal default plan because of a cardinality misestimate. The ability to adapt the plan at run time based on actual execution statistics results in a more optimal final plan. After choosing the final plan, the optimizer uses it for subsequent executions, thus ensuring that the suboptimal plan is not reused.
An adaptive plan contains multiple predetermined subplans, and an optimizer statistics collector. A subplan is a portion of a plan that the optimizer can switch to as an alternative at run time. For example, a nested loops join could be switched to a hash join during execution. An optimizer statistics collector is a row source inserted into a plan at key points to collect run-time statistics. These statistics help the optimizer make a final decision between multiple subplans.
During statement execution, the statistics collector gathers information about the execution, and buffers some rows received by the subplan. Based on the information observed by the collector, the optimizer chooses a subplan. At this point, the collector stops collecting statistics and buffering rows, and permits rows to pass through instead. On subsequent executions of the child cursor, the optimizer continues to use the same plan unless the plan ages out of the cache, or a different optimizer feature (for example, adaptive cursor sharing or statistics feedback) invalidates the plan.
The database uses adaptive plans when OPTIMIZER_FEATURES_ENABLE is 12.1.0.1 or later, and the OPTIMIZER_ADAPTIVE_REPORTING_ONLY initialization parameter is set to its default of false (see "Controlling Adaptive Optimization").
Example 4-2 shows a join of the order_items and product_information tables. An adaptive plan for this statement shows two possible plans, one with a nested loops join and the other with a hash join.
Example 4-2 Join of order_items and product_information
SELECT product_name
FROM   order_items o, product_information p
WHERE  o.unit_price = 15
AND    quantity > 1
AND    p.product_id = o.product_id
A nested loops join is preferable if the database can avoid scanning a significant portion of product_information because its rows are filtered by the join predicate. If few rows are filtered, however, then scanning the right table in a hash join is preferable.
The following graphic shows the adaptive process. For the query in Example 4-2, the adaptive portion of the default plan contains two subplans, each of which uses a different join method. The optimizer automatically determines when each join method is optimal, depending on the cardinality of the left side of the join.
The statistics collector buffers enough rows coming from the order_items table to determine which join method to use. If the row count is below the threshold determined by the optimizer, then the optimizer chooses the nested loops join; otherwise, the optimizer chooses the hash join. In this case, the row count coming from the order_items table is above the threshold, so the optimizer chooses a hash join for the final plan, and disables buffering.
After the optimizer determines the final plan, DBMS_XPLAN.DISPLAY_CURSOR displays the hash join. The Note section of the execution plan indicates whether the plan is adaptive, as shown in the following sample plan:
----------------------------------------------------------------------------------------------------------------
|Id | Operation          | Name                |Starts|E-Rows|A-Rows| A-Time     |Buffers|Reads| OMem | 1Mem |O/1/M|
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT   |                     |  1   |      |  13  |00:00:00.10 |  21   | 17  |      |      |     |
|*1 | HASH JOIN          |                     |  1   |  4   |  13  |00:00:00.10 |  21   | 17  | 2061K| 2061K|1/0/0|
|*2 | TABLE ACCESS FULL  | ORDER_ITEMS         |  1   |  4   |  13  |00:00:00.07 |   5   |  4  |      |      |     |
| 3 | TABLE ACCESS FULL  | PRODUCT_INFORMATION |  1   |  1   | 288  |00:00:00.03 |  16   | 13  |      |      |     |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
   2 - filter(("O"."UNIT_PRICE"=15 AND "QUANTITY">1))

Note
-----
- this is an adaptive plan
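To display the full adaptive plan, including the subplan operations that were not chosen, you can pass the ADAPTIVE format modifier to DBMS_XPLAN.DISPLAY_CURSOR, as in the following illustrative sketch:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT => '+ADAPTIVE'));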
Typically, parallel execution requires data redistribution to perform operations such as parallel sorts, aggregations, and joins. Oracle Database can use many different data distribution methods. The database chooses the method based on the number of rows to be distributed and the number of parallel server processes in the operation.
For example, consider the following alternative cases:
Many parallel server processes distribute few rows.
The database may choose the broadcast distribution method. In this case, the entire result set is sent to all of the parallel server processes.
Few parallel server processes distribute many rows.
The database is more likely to pick a hash distribution to ensure that each parallel server process receives an equal number of rows. However, if data skew is encountered during the data redistribution, then it can adversely affect the performance of the statement.
The hybrid hash distribution technique is an adaptive parallel data distribution that does not decide the final data distribution method until execution time. The optimizer inserts statistic collectors in front of the parallel server processes on the producer side of the operation. If the actual number of rows is less than a threshold, defined as twice the degree of parallelism chosen for the operation, then the data distribution method switches from hash to broadcast. Otherwise, the data distribution method is a hash.
The following graphic shows a hybrid hash join between the departments and employees tables. A statistics collector is inserted in front of the parallel server processes scanning the departments table. The distribution method is based on the run-time statistics. In this example, the number of rows is less than the threshold of twice the degree of parallelism, so the optimizer chooses a broadcast technique for the departments table.
In the following alternative example, the threshold is 16, or twice the specified DOP of 8. Because the number of rows (27) is greater than the threshold (16), the optimizer chooses a hash rather than a broadcast distribution. Note the statistics collector in step 10 of the plan.
EXPLAIN PLAN FOR
  SELECT /*+ parallel(8) full(e) full(d) */ department_name, sum(salary)
  FROM   employees e, departments d
  WHERE  d.department_id=e.department_id
  GROUP BY department_name;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
Plan hash value: 3213496516
----------------------------------------------------------------------------------------------------------------
|Id | Operation                          | Name      |Rows|Bytes| Cost  |   Time   |  TQ   |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT                   |           | 27 | 621 | 6 (34)| 00:00:01 |       |      |            |
| 1 |  PX COORDINATOR                    |           |    |     |       |          |       |      |            |
| 2 |   PX SEND QC (RANDOM)              | :TQ10003  | 27 | 621 | 6 (34)| 00:00:01 | Q1,03 | P->S | QC (RAND)  |
| 3 |    HASH GROUP BY                   |           | 27 | 621 | 6 (34)| 00:00:01 | Q1,03 | PCWP |            |
| 4 |     PX RECEIVE                     |           | 27 | 621 | 6 (34)| 00:00:01 | Q1,03 | PCWP |            |
| 5 |      PX SEND HASH                  | :TQ10002  | 27 | 621 | 6 (34)| 00:00:01 | Q1,02 | P->P | HASH       |
| 6 |       HASH GROUP BY                |           | 27 | 621 | 6 (34)| 00:00:01 | Q1,02 | PCWP |            |
|*7 |        HASH JOIN                   |           |106 |2438 | 5 (20)| 00:00:01 | Q1,02 | PCWP |            |
| 8 |         PX RECEIVE                 |           | 27 | 432 | 2 (0) | 00:00:01 | Q1,02 | PCWP |            |
| 9 |          PX SEND HYBRID HASH       | :TQ10000  | 27 | 432 | 2 (0) | 00:00:01 | Q1,00 | P->P | HYBRID HASH|
|10 |           STATISTICS COLLECTOR     |           |    |     |       |          | Q1,00 | PCWC |            |
|11 |            PX BLOCK ITERATOR       |           | 27 | 432 | 2 (0) | 00:00:01 | Q1,00 | PCWC |            |
|12 |             TABLE ACCESS FULL      |DEPARTMENTS| 27 | 432 | 2 (0) | 00:00:01 | Q1,00 | PCWP |            |
|13 |         PX RECEIVE                 |           |107 | 749 | 2 (0) | 00:00:01 | Q1,02 | PCWP |            |
|14 |          PX SEND HYBRID HASH (SKEW)| :TQ10001  |107 | 749 | 2 (0) | 00:00:01 | Q1,01 | P->P | HYBRID HASH|
|15 |           PX BLOCK ITERATOR        |           |107 | 749 | 2 (0) | 00:00:01 | Q1,01 | PCWC |            |
|16 |            TABLE ACCESS FULL       | EMPLOYEES |107 | 749 | 2 (0) | 00:00:01 | Q1,01 | PCWP |            |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")

Note
-----
- Degree of Parallelism is 8 because of hint

32 rows selected.
See Also: Oracle Database VLDB and Partitioning Guide to learn more about parallel data redistribution techniques
The quality of the plans that the optimizer generates depends on the quality of the statistics. Some query predicates become too complex to rely on base table statistics alone, so the optimizer augments these statistics with adaptive statistics.
The following topics describe types of adaptive statistics:
During the compilation of a SQL statement, the optimizer decides whether to use dynamic statistics by considering whether the available statistics are sufficient to generate an optimal execution plan. If the available statistics are insufficient, then the optimizer uses dynamic statistics to augment the statistics. One type of dynamic statistics is the information gathered by dynamic sampling. The optimizer can use dynamic statistics for table scans, index access, joins, and GROUP BY operations, thus improving the quality of optimizer decisions.
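For example, you can raise the dynamic statistics level for a session so that the optimizer automatically decides when dynamic statistics are worthwhile (a minimal sketch; level 11 enables automatic dynamic statistics in this release):

ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 11;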
See Also: "Dynamic Statistics" to learn more about dynamic statistics and optimizer statistics in general |
Whereas adaptive plans help decide between multiple subplans, they are not feasible for all kinds of plan changes. For example, a query with an inefficient join order might perform suboptimally, but adaptive plans do not support adapting the join order during execution. In these cases, the optimizer considers automatic reoptimization. In contrast to adaptive plans, automatic reoptimization changes a plan on subsequent executions after the initial execution.
At the end of the first execution of a SQL statement, the optimizer uses the information gathered during execution to determine whether automatic reoptimization is worthwhile. If the information gathered during execution differs significantly from optimizer estimates, then the optimizer looks for a replacement plan on the next execution. The optimizer uses the information gathered during the previous execution to help determine an alternative plan. The optimizer can reoptimize a query several times, each time learning more and further improving the plan.
A form of reoptimization known as statistics feedback (formerly known as cardinality feedback) automatically improves plans for repeated queries that have cardinality misestimates. The optimizer can estimate cardinalities incorrectly for many reasons, such as missing statistics, inaccurate statistics, or complex predicates.
The basic process of reoptimization using statistics feedback is as follows:
During the first execution of a SQL statement, the optimizer generates an execution plan.
The optimizer may enable monitoring for statistics feedback for the shared SQL area in the following cases:
Tables with no statistics
Multiple conjunctive or disjunctive filter predicates on a table
Predicates containing complex operators for which the optimizer cannot accurately compute selectivity estimates
At the end of execution, the optimizer compares its initial cardinality estimates to the actual number of rows returned by each operation in the plan during execution. If estimates differ significantly from actual cardinalities, then the optimizer stores the correct estimates for subsequent use. The optimizer also creates a SQL plan directive so that other SQL statements can benefit from the information obtained during this initial execution.
After the first execution, the optimizer disables monitoring for statistics feedback.
If the query executes again, then the optimizer uses the corrected cardinality estimates instead of its usual estimates.
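You can check whether the optimizer has marked a cursor for reoptimization by querying V$SQL. The following query is an illustrative sketch; the IS_REOPTIMIZABLE column shows Y when statistics feedback will be used on the next execution:

SELECT sql_id, child_number, is_reoptimizable
FROM   v$sql
WHERE  sql_text LIKE 'SELECT o.order_id%';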
Example 4-3 Statistics Feedback
This example shows how the database uses statistics feedback to adjust incorrect estimates.
The user oe runs the following query of the orders, order_items, and product_information tables:
SELECT o.order_id, v.product_name
FROM   orders o,
       ( SELECT order_id, product_name
         FROM   order_items o, product_information p
         WHERE  p.product_id = o.product_id
         AND    list_price < 50
         AND    min_price < 40 ) v
WHERE  o.order_id = v.order_id
Querying the plan in the cursor shows that the estimated number of rows (E-Rows) is far lower than the actual number of rows (A-Rows).
Example 4-4 Actual Rows and Estimated Rows
------------------------------------------------------------------------------------------------------------
| Id| Operation | Name |Starts|E-Rows|A-Rows| A-Time |Buffers|OMem|1Mem|O/1/M|
------------------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1| | 269 |00:00:00.10| 1338| | | |
| 1| NESTED LOOPS | | 1| 1| 269 |00:00:00.10| 1338| | | |
| 2| MERGE JOIN CARTESIAN| | 1| 4| 9135 |00:00:00.04| 33| | | |
|* 3| TABLE ACCESS FULL | PRODUCT_INFORMATION | 1| 1| 87 |00:00:00.01| 32| | | |
| 4| BUFFER SORT | | 87| 105| 9135 |00:00:00.01| 1|4096|4096|1/0/0|
| 5| INDEX FULL SCAN | ORDER_PK | 1| 105| 105 |00:00:00.01| 1| | | |
|* 6| INDEX UNIQUE SCAN | ORDER_ITEMS_UK | 9135| 1| 269 |00:00:00.03| 1305| | | |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
6 - access("O"."ORDER_ID"="ORDER_ID" AND "P"."PRODUCT_ID"="O"."PRODUCT_ID")
The user oe reruns the following query of the orders, order_items, and product_information tables:
SELECT o.order_id, v.product_name
FROM   orders o,
       ( SELECT order_id, product_name
         FROM   order_items o, product_information p
         WHERE  p.product_id = o.product_id
         AND    list_price < 50
         AND    min_price < 40 ) v
WHERE  o.order_id = v.order_id;
Querying the plan in the cursor shows that the optimizer used statistics feedback (as shown in the Note section) for the second execution, and also chose a different plan.
Example 4-5 Actual Rows and Estimated Rows
----------------------------------------------------------------------------------------------------------------
| Id| Operation |Name |Starts|E-Rows|A-Rows| A-Time |Buffers|Reads|OMem |1Mem |O/1/M|
----------------------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1| | 269|00:00:00.03| 60 | 1 | | | |
| 1| NESTED LOOPS | | 1| 269| 269|00:00:00.03| 60 | 1 | | | |
|* 2| HASH JOIN | | 1| 313| 269|00:00:00.03| 39 | 1 |1321K|1321K|1/0/0|
|* 3| TABLE ACCESS FULL |PRODUCT_INFORMATION| 1| 87| 87|00:00:00.01| 15 | 0 | | | |
| 4| INDEX FAST FULL SCAN|ORDER_ITEMS_UK | 1| 665| 665|00:00:00.02| 24 | 1 | | | |
|* 5| INDEX UNIQUE SCAN |ORDER_PK |269| 1| 269|00:00:00.01| 21 | 0 | | | |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
5 - access("O"."ORDER_ID"="ORDER_ID")
Note
-----
- statistics feedback used for this statement
In the preceding output, the estimated number of rows (269) matches the actual number of rows.
Another form of reoptimization is performance feedback. This reoptimization helps improve the degree of parallelism automatically chosen for repeated SQL statements when PARALLEL_DEGREE_POLICY is set to ADAPTIVE.
The basic process of reoptimization using performance feedback is as follows:
During the first execution of a SQL statement, when PARALLEL_DEGREE_POLICY is set to ADAPTIVE, the optimizer determines whether to execute the statement in parallel, and if so, which degree of parallelism to use.
The optimizer chooses the degree of parallelism based on the estimated performance of the statement. Additional performance monitoring is enabled for all statements.
At the end of the initial execution, the optimizer compares the following:
The degree of parallelism chosen by the optimizer
The degree of parallelism computed based on the performance statistics (for example, the CPU time) gathered during the actual execution of the statement
If the two values vary significantly, then the database marks the statement for reparsing, and stores the initial execution statistics as feedback. This feedback helps better compute the degree of parallelism for subsequent executions.
If the query executes again, then the optimizer uses the performance statistics gathered during the initial execution to better determine a degree of parallelism for the statement.
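For example, the following statement enables the adaptive policy so that performance feedback can influence the degree of parallelism (a minimal sketch; changing this parameter requires appropriate privileges):

ALTER SYSTEM SET PARALLEL_DEGREE_POLICY = ADAPTIVE;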
Note: Even if PARALLEL_DEGREE_POLICY is not set to ADAPTIVE, statistics feedback may influence the degree of parallelism chosen for a statement.
A SQL plan directive is additional information that the optimizer uses to generate a more optimal plan. For example, during query optimization, when deciding whether the table is a candidate for dynamic statistics, the database queries the statistics repository for directives on a table. If the query joins two tables that have a data skew in their join columns, a SQL plan directive can direct the optimizer to use dynamic statistics to obtain an accurate cardinality estimate. The optimizer collects SQL plan directives on query expressions rather than at the statement level. In this way, the optimizer can apply directives to multiple SQL statements. The database automatically maintains directives, and stores them in the SYSAUX tablespace. You can manage directives using the package DBMS_SPD.
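For example, you can view the directives that the database has created by querying the DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS views. The following query is an illustrative sketch; the filter on the OE schema is an assumption:

SELECT d.directive_id, o.owner, o.object_name, d.type, d.state, d.reason
FROM   dba_sql_plan_directives d, dba_sql_plan_dir_objects o
WHERE  d.directive_id = o.directive_id
AND    o.owner = 'OE';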
SQL plan management is a mechanism that enables the optimizer to automatically manage execution plans, ensuring that the database uses only known or verified plans (see Chapter 23, "Managing SQL Plan Baselines"). This mechanism can build a SQL plan baseline, which contains one or more accepted plans for each SQL statement.
The optimizer can access and manage the plan history and SQL plan baselines of SQL statements. This capability is central to the SQL plan management architecture. In SQL plan management, the optimizer has the following main objectives:
Identify repeatable SQL statements
Maintain plan history, and possibly SQL plan baselines, for a set of SQL statements
Detect plans that are not in the plan history
Detect potentially better plans that are not in the SQL plan baseline
To achieve these objectives, the optimizer uses the normal cost-based search method.
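For example, you can manually load plans from the shared SQL area into a SQL plan baseline by using the DBMS_SPM package. The following PL/SQL block is an illustrative sketch; the SQL ID is a placeholder that you must replace with a real value from V$SQL:

DECLARE
  v_plans PLS_INTEGER;
BEGIN
  -- Load the plan for the specified cursor; the SQL ID below is a placeholder.
  v_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(sql_id => 'sample_sql_id');
  DBMS_OUTPUT.PUT_LINE('Plans loaded: ' || v_plans);
END;
/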
accepted plan
In the context of SQL plan management, a plan that is in a SQL plan baseline for a SQL statement and thus available for use by the optimizer. An accepted plan contains a set of hints, a plan hash value, and other plan-related information.
access path
The means by which the database retrieves data from a database. For example, a query using an index and a query using a full table scan use different access paths.
adaptive cursor sharing
A feature that enables a single statement that contains bind variables to use multiple execution plans. Cursor sharing is "adaptive" because the cursor adapts its behavior so that the database does not always use the same plan for each execution or bind variable value.
adaptive optimizer
A feature of the optimizer that enables it to adapt plans based on run-time statistics.
adaptive plan
An execution plan that changes after optimization because run-time conditions indicate that optimizer estimates are inaccurate. An adaptive plan has different built-in plan options. During the first execution, before a specific subplan becomes active, the optimizer makes a final decision about which option to use. The optimizer bases its choice on observations made during the execution up to this point. Thus, an adaptive plan enables the final plan for a statement to differ from the default plan.
adaptive query optimization
A set of capabilities that enables the adaptive optimizer to make run-time adjustments to execution plans and discover additional information that can lead to better statistics. Adaptive optimization is helpful when existing statistics are not sufficient to generate an optimal plan.
antijoin
A join that returns rows that fail to match the subquery on the right side. For example, an antijoin can list departments with no employees. Antijoins use the NOT EXISTS or NOT IN constructs.
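For example, the following illustrative query uses NOT EXISTS to list departments with no employees, assuming the sample employees and departments tables:

SELECT department_id, department_name
FROM   departments d
WHERE  NOT EXISTS
  (SELECT NULL FROM employees e WHERE e.department_id = d.department_id);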
Automatic Database Diagnostic Monitor (ADDM)
ADDM is self-diagnostic software built into Oracle Database. ADDM examines and analyzes data captured in Automatic Workload Repository (AWR) to determine possible database performance problems.
automatic optimizer statistics collection
The automatic running of the DBMS_STATS package to collect optimizer statistics for all schema objects for which statistics are missing or stale.
automatic initial plan capture
The mechanism by which the database automatically creates a SQL plan baseline for any repeatable SQL statement executed on the database. Enable automatic initial plan capture by setting the OPTIMIZER_CAPTURE_SQL_PLAN_BASELINES initialization parameter to true (the default is false).
automatic reoptimization
The ability of the optimizer to automatically change a plan on subsequent executions of a SQL statement. Automatic reoptimization can fix any suboptimal plan chosen due to incorrect optimizer estimates, from a suboptimal distribution method to an incorrect choice of degree of parallelism.
automatic SQL tuning
The work performed by Automatic SQL Tuning Advisor when it runs as an automated task within system maintenance windows.
Automatic SQL Tuning Advisor
SQL Tuning Advisor when run as an automated maintenance task. Automatic SQL Tuning runs during system maintenance windows as an automated maintenance task, searching for ways to improve the execution plans of high-load SQL statements.
See SQL Tuning Advisor.
Automatic Tuning Optimizer
The optimizer when invoked by SQL Tuning Advisor. In SQL tuning mode, the optimizer performs additional analysis to check whether it can further improve the plan produced in normal mode. The optimizer output is not an execution plan, but a series of actions, along with their rationale and expected benefit for producing a significantly better plan.
Automatic Workload Repository (AWR)
The infrastructure that provides services to Oracle Database components to collect, maintain, and use statistics for problem detection and self-tuning.
AWR snapshot
A set of data for a specific time that is used for performance comparisons. The delta values captured by the snapshot represent the changes for each statistic over the time period. Statistics gathered by AWR are queried from memory. You can display the gathered data in both reports and views.
baseline
In the context of AWR, the interval between two AWR snapshots that represent the database operating at an optimal level.
bind-aware cursor
A bind-sensitive cursor that is eligible to use different plans for different bind values. After a cursor has been made bind-aware, the optimizer chooses plans for future executions based on the bind value and its cardinality estimate.
bind-sensitive cursor
A cursor whose optimal plan may depend on the value of a bind variable. The database monitors the behavior of a bind-sensitive cursor that uses different bind values to determine whether a different plan is beneficial.
bind variable
A placeholder in a SQL statement that must be replaced with a valid value or value address for the statement to execute successfully. By using bind variables, you can write a SQL statement that accepts inputs or parameters at run time. The following query uses v_empid as a bind variable:
SELECT * FROM employees WHERE employee_id = :v_empid;
bind variable peeking
The ability of the optimizer to look at the value in a bind variable during a hard parse. By peeking at bind values, the optimizer can determine the selectivity of a WHERE clause condition as if literals had been used, thereby improving the plan.
bitmap piece
A subcomponent of a single bitmap index entry. Each indexed column value may have one or more bitmap pieces. The database uses bitmap pieces to break up an index entry that is large in relation to the size of a block.
B-tree index
An index organized like an upside-down tree. A B-tree index has two types of blocks: branch blocks for searching and leaf blocks that store values. The leaf blocks contain every indexed data value and a corresponding rowid used to locate the actual row. The "B" stands for "balanced" because all leaf blocks automatically stay at the same depth.
cardinality
The number of rows that is expected to be or actually is returned by an operation in an execution plan. Data has low cardinality when the number of distinct values in a column is low in relation to the total number of rows.
Cartesian join
A join in which one or more of the tables does not have any join conditions to any other tables in the statement. The optimizer joins every row from one data source with every row from the other data source, creating the Cartesian product of the two sets.
child cursor
The cursor containing the plan, compilation environment, and other information for a statement whose text is stored in a parent cursor. The parent cursor is number 0, the first child is number 1, and so on. Child cursors reference the same SQL text as the parent cursor, but are different. For example, two queries with the text SELECT * FROM t use different cursors when they reference two different tables named t.
cluster scan
An access path for a table cluster. In an indexed table cluster, Oracle Database first obtains the rowid of one of the selected rows by scanning the cluster index. Oracle Database then locates the rows based on this rowid.
column statistics
Statistics about columns that the optimizer uses to determine optimal execution plans. Column statistics include the number of distinct column values, low value, high value, and number of nulls.
composite database operation
In a database operation, the activity between two points in time in a database session, with each session defining its own beginning and end points. A session can participate in at most one composite database operation at a time.
concurrency
Simultaneous access of the same data by many users. A multiuser database management system must provide adequate concurrency controls so that data cannot be updated or changed improperly, compromising data integrity.
concurrent statistics gathering mode
A mode that enables the database to simultaneously gather optimizer statistics for multiple tables in a schema, or multiple partitions or subpartitions in a table. Concurrency can reduce the overall time required to gather statistics by enabling the database to fully use multiple CPUs.
condition
A combination of one or more expressions and logical operators that returns a value of TRUE, FALSE, or UNKNOWN.
cost
A numeric internal measure that represents the estimated resource usage for an execution plan. The lower the cost, the more efficient the plan.
cost-based optimizer (CBO)
The legacy name for the optimizer. In earlier releases, the cost-based optimizer was an alternative to the rule-based optimizer (RBO).
cost model
The internal optimizer model that accounts for the cost of the I/O, CPU, and network resources that a query is predicted to use.
cumulative statistics
A count such as the number of block reads. Oracle Database generates many types of cumulative statistics for the system, sessions, and individual SQL statements.
cursor
A handle or name for a private SQL area in the PGA. Because cursors are closely associated with private SQL areas, the terms are sometimes used interchangeably.
cursor merging
Combining cursors to save space in the shared SQL area. If the optimizer creates a plan for a bind-aware cursor, and if this plan is the same as an existing cursor, then the optimizer can merge the cursors.
database operation
A set of database tasks defined by end users or application code, for example, a batch job or ETL processing.
default plan
For an adaptive plan, the execution plan initially chosen by the optimizer using the statistics from the data dictionary. The default plan can differ from the final plan.
disabled plan
A plan that a database administrator has manually marked as ineligible for use by the optimizer.
degree of parallelism
The number of parallel execution servers associated with a single operation. Parallel execution is designed to effectively use multiple CPUs. Oracle Database parallel execution framework enables you to either explicitly choose a specific degree of parallelism or to rely on Oracle Database to automatically control it.
dense grouping key
A key that represents all grouping keys whose grouping columns come from a particular fact table or dimension.
dense join key
A key that represents all join keys whose join columns come from a particular fact table or dimension.
density
A decimal number between 0 and 1 that measures the selectivity of a column. Values close to 1 indicate that the column is unselective, whereas values close to 0 indicate that this column is more selective.
driving table
The table to which other tables are joined. An analogy from programming is a for loop that contains another for loop. The outer for loop is the analog of a driving table, which is also called an outer table.
dynamic performance view
A view created on dynamic performance tables, which are virtual tables that record current database activity. The dynamic performance views are called fixed views because they cannot be altered or removed by the database administrator. They are also called V$ views because they begin with the string V$ (GV$ in Oracle RAC).
dynamic statistics
An optimization technique in which the database executes a recursive SQL statement to scan a small random sample of a table's blocks to estimate predicate selectivities.
dynamic statistics level
The level that controls both when the database gathers dynamic statistics, and the size of the sample that the optimizer uses to gather the statistics. Set the dynamic statistics level using either the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter or a statement hint.
endpoint number
A number that uniquely identifies a bucket in a histogram. In frequency and hybrid histograms, the endpoint number is the cumulative frequency of endpoints. In height-balanced histograms, the endpoint number is the bucket number.
endpoint repeat count
In a hybrid histogram, the number of times the endpoint value is repeated, for each endpoint (bucket) in the histogram. By using the repeat count, the optimizer can obtain accurate estimates for almost popular values.
estimator
The component of the optimizer that determines the overall cost of a given execution plan.
execution plan
The combination of steps used by the database to execute a SQL statement. Each step either retrieves rows of data physically from the database or prepares them for the user issuing the statement. You can override execution plans by using hints.
execution tree
A tree diagram that shows the flow of row sources from one step to another in an execution plan.
expression
A combination of one or more values, operators, and SQL functions that evaluates to a value. For example, the expression 2*2 evaluates to 4. In general, expressions assume the data type of their components.
expression statistics
A type of extended statistics that improves optimizer estimates when a WHERE clause has predicates that use expressions.
extended statistics
A type of optimizer statistics that improves estimates for cardinality when multiple predicates exist or when predicates contain expressions.
extensible optimizer
An optimizer capability that enables authors of user-defined functions and indexes to create statistics collection, selectivity, and cost functions that the optimizer uses when choosing an execution plan. The optimizer cost model is extended to integrate information supplied by the user to assess CPU and I/O cost.
extension
A column group or an expression. The statistics collected for column groups and expressions are called extended statistics.
external table
A read-only table whose metadata is stored in the database but whose data is stored in files outside the database. The database uses the metadata describing external tables to expose their data as if they were relational tables.
filter condition
A WHERE clause component that eliminates rows from a single object referenced in a SQL statement.
final plan
In an adaptive plan, the plan that executes to completion. The default plan can differ from the final plan.
fixed object
A dynamic performance table or its index. The fixed objects are owned by SYS. Fixed object tables have names beginning with X$ and are the base tables for the V$ views.
fixed plan
An accepted plan that is marked as preferred, so that the optimizer considers only the fixed plans in the SQL plan baseline. You can use fixed plans to influence the plan selection process of the optimizer.
frequency histogram
A type of histogram in which each distinct column value corresponds to a single bucket. An analogy is sorting coins: all pennies go in bucket 1, all nickels go in bucket 2, and so on.
full outer join
A combination of a left and right outer join. In addition to the inner join, the database uses nulls to preserve rows from both tables that have not been returned in the result of the inner join. In other words, full outer joins join tables together, yet show rows with no corresponding rows in the joined tables.
full table scan
A scan of table data in which the database sequentially reads all rows from a table and filters out those that do not meet the selection criteria. All data blocks under the high water mark are scanned.
global temporary table
A special temporary table that stores intermediate session-private data for a specific duration.
hard parse
The steps performed by the database to build a new executable version of application code. The database must perform a hard parse instead of a soft parse if the parsed representation of a submitted statement does not exist in the shared SQL area.
hash cluster
A type of table cluster that is similar to an indexed cluster, except the index key is replaced with a hash function. No separate cluster index exists. In a hash cluster, the data is the index.
hash function
A function that operates on an arbitrary-length input value and returns a fixed-length hash value.
hash join
A method for joining large data sets. The database uses the smaller of two data sets to build a hash table on the join key in memory. It then scans the larger data set, probing the hash table to find the joined rows.
hash scan
An access path for a table cluster. The database uses a hash scan to locate rows in a hash cluster based on a hash value. In a hash cluster, all rows with the same hash value are stored in the same data block. To perform a hash scan, Oracle Database first obtains the hash value by applying a hash function to a cluster key value specified by the statement, and then scans the data blocks containing rows with that hash value.
hash table
An in-memory data structure that associates join keys with rows in a hash join. For example, in a join of the employees and departments tables, the join key might be the department ID. A hash function uses the join key to generate a hash value. This hash value is an index in an array, which is the hash table.
hash value
In a hash cluster, a unique numeric ID that identifies a bucket. Oracle Database uses a hash function that accepts an infinite number of hash key values as input and sorts them into a finite number of buckets. Each hash value maps to the database block address for the block that stores the rows corresponding to the hash key value (department 10, 20, 30, and so on).
hashing
A mathematical technique in which an infinite set of input values is mapped to a finite set of output values, called hash values. Hashing is useful for rapid lookups of data in a hash table.
heap-organized table
A table in which the data rows are stored in no particular order on disk. By default, CREATE TABLE creates a heap-organized table.
height-balanced histogram
A histogram in which column values are divided into buckets so that each bucket contains approximately the same number of rows.
hint
An instruction passed to the optimizer through comments in a SQL statement. The optimizer uses hints to choose an execution plan for the statement.
histogram
A special type of column statistic that provides more detailed information about the data distribution in a table column.
hybrid hash distribution technique
An adaptive parallel data distribution that does not decide the final data distribution method until execution time.
hybrid histogram
An enhanced height-based histogram that stores the exact frequency of each endpoint in the sample, and ensures that a value is never stored in multiple buckets.
implicit query
A component of a DML statement that retrieves data without a subquery. An UPDATE, DELETE, or MERGE statement that does not explicitly include a SELECT statement uses an implicit query to retrieve the rows to be modified.
incremental statistics maintenance
The ability of the database to generate global statistics for a partitioned table by aggregating partition-level statistics.
index
Optional schema object associated with a nonclustered table, table partition, or table cluster. In some cases indexes speed data access.
index cluster
A table cluster that uses an index to locate data. The cluster index is a B-tree index on the cluster key.
index clustering factor
A measure of row order in relation to an indexed value such as employee last name. The more scattered the rows among the data blocks, the lower the clustering factor.
index fast full scan
A scan of the index blocks in unsorted order, as they exist on disk. This scan reads the index instead of the table.
index-organized table
A table whose storage organization is a variant of a primary B-tree index. Unlike a heap-organized table, data is stored in primary key order.
index range scan
An index range scan is an ordered scan of an index that has the following characteristics:
One or more leading columns of an index are specified in conditions.
0, 1, or more values are possible for an index key.
index range scan descending
An index range scan in which the database returns rows in descending order.
index skip scan
An index scan in which the initial column of a composite index is "skipped" or not specified in the query. For example, if the composite index key is (cust_gender,cust_email), then the query predicate does not reference the cust_gender column.
index statistics
Statistics about indexes that the optimizer uses to determine whether to perform a full table scan or an index scan. Index statistics include B-tree levels, leaf block counts, the index clustering factor, distinct keys, and number of rows in the index.
inner join
A join of two or more tables that returns only those rows that satisfy the join condition.
inner table
In a nested loops join, the table that is not the outer table (driving table). For every row in the outer table, the database accesses all rows in the inner table. The outer loop is for every row in the outer table and the inner loop is for every row in the inner table.
join
A statement that retrieves data from multiple tables specified in the FROM clause of a SQL statement. Join types include inner joins, outer joins, and Cartesian joins.
join condition
A condition that compares two row sources using an expression. The database combines pairs of rows, each containing one row from each row source, for which the join condition evaluates to true.
join elimination
The removal of redundant tables from a query. A table is redundant when its columns are only referenced in join predicates, and those joins are guaranteed to neither filter nor expand the resulting rows.
join factorization
A cost-based transformation that can factorize common computations from branches of a UNION ALL query. Without join factorization, the optimizer evaluates each branch of a UNION ALL query independently, which leads to repetitive processing, including data access and joins. Avoiding an extra scan of a large base table can lead to a huge performance improvement.
join method
A method of joining a pair of row sources. The possible join methods are nested loops, sort merge, and hash joins. A Cartesian join requires one of the preceding join methods.
join order
The order in which multiple tables are joined together. For example, for each row in the employees table, the database can read each row in the departments table. In an alternative join order, for each row in the departments table, the database reads each row in the employees table.
To execute a statement that joins more than two tables, Oracle Database joins two of the tables and then joins the resulting row source to the next table. This process continues until all tables are joined into the result.
join predicate
A predicate in a WHERE or JOIN clause that combines the columns of two tables in a join.
latch
A low-level serialization control mechanism used to protect shared data structures in the SGA from simultaneous access.
left table
In an outer join, the table specified on the left side of the OUTER JOIN keywords (in ANSI SQL syntax).
library cache
An area of memory in the shared pool. This cache includes the shared SQL areas, private SQL areas (in a shared server configuration), PL/SQL procedures and packages, and control structures such as locks and library cache handles.
library cache miss
During SQL processing, the act of searching for a usable plan in the library cache and not finding it.
maintenance window
A contiguous time interval during which automated maintenance tasks run. The maintenance windows are Oracle Scheduler windows that belong to the window group named MAINTENANCE_WINDOW_GROUP.
materialized view
A schema object that stores a query result. All materialized views are either read-only or updatable.
multiblock read
An I/O call that reads multiple database blocks. Multiblock reads can significantly speed up full table scans.
nested loops join
A type of join method. A nested loops join determines the outer table that drives the join, and for every row in the outer table, probes each row in the inner table. The outer loop is for each row in the outer table and the inner loop is for each row in the inner table. An analogy from programming is a for loop inside of another for loop.
nonpopular value
In a histogram, any value that does not span two or more endpoints. Any value that is not nonpopular is a popular value.
on-demand SQL tuning
The manual invocation of SQL Tuning Advisor. Any invocation of SQL Tuning Advisor that is not the result of an Automatic SQL Tuning task is on-demand tuning.
optimizer
Built-in database software that determines the most efficient way to execute a SQL statement by considering factors related to the objects referenced and the conditions specified in the statement.
optimizer cost model
The model that the optimizer uses to select an execution plan. The optimizer selects the execution plan with the lowest cost, where cost represents the estimated resource usage for that plan. The optimizer cost model accounts for the I/O, CPU, and network resources that the query will use.
optimizer environment
The totality of session settings that can affect execution plan generation, such as the work area size or optimizer settings (for example, the optimizer mode).
optimizer goal
The prioritization of resource usage by the optimizer. Using the OPTIMIZER_MODE initialization parameter, you can set the optimizer goal to best throughput or best response time.
optimizer statistics
Details about the database and its objects that the optimizer uses to select the best execution plan for each SQL statement. Categories include table statistics such as numbers of rows, index statistics such as B-tree levels, system statistics such as CPU and I/O performance, and column statistics such as number of nulls.
optimizer statistics collection
The gathering of optimizer statistics for database objects. The database can collect these statistics automatically, or you can collect them manually by using the system-supplied DBMS_STATS package.
optimizer statistics collector
A row source inserted into an execution plan at key points to collect run-time statistics for use in adaptive plans.
optimizer statistics preferences
The default values of the parameters used by automatic statistics collection and the DBMS_STATS statistics gathering procedures.
outer join
A join condition using the outer join operator (+) with one or more columns of one of the tables. The database returns all rows that meet the join condition. The database also returns all rows from the table without the outer join operator for which there are no matching rows in the table with the outer join operator.
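For example, the following illustrative query preserves every department, including departments without employees, because the outer join operator appears on the employees side of the join condition:

SELECT d.department_name, e.last_name
FROM   departments d, employees e
WHERE  d.department_id = e.department_id(+);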
parallel execution
The application of multiple CPU and I/O resources to the execution of a single database operation.
parallel query
A query in which multiple processes work together simultaneously to run a single SQL query. By dividing the work among multiple processes, Oracle Database can run the statement more quickly. For example, four processes retrieve rows for four different quarters in a year instead of one process handling all four quarters by itself.
parent cursor
The cursor that stores the SQL text and other minimal information for a SQL statement. The child cursor contains the plan, compilation environment, and other information. When a statement first executes, the database creates both a parent and child cursor in the shared pool.
parse call
A call to Oracle to prepare a SQL statement for execution. The call includes syntactically checking the SQL statement, optimizing it, and then building or locating an executable form of that statement.
parsing
The stage of SQL processing that involves separating the pieces of a SQL statement into a data structure that can be processed by other routines.
A hard parse occurs when the SQL statement to be executed is either not in the shared pool, or it is in the shared pool but it cannot be shared. A soft parse occurs when a session attempts to execute a SQL statement, and the statement is already in the shared pool, and it can be used.
partition maintenance operation
A partition-related operation such as adding, exchanging, merging, or splitting table partitions.
partition-wise join
A join optimization that divides a large join of two tables, one of which must be partitioned on the join key, into several smaller joins.
pending statistics
Unpublished optimizer statistics. By default, the optimizer uses published statistics but does not use pending statistics.
performance feedback
This form of automatic reoptimization helps improve the degree of parallelism automatically chosen for repeated SQL statements when PARALLEL_DEGREE_POLICY is set to ADAPTIVE.
pipelined table function
A PL/SQL function that accepts a collection of rows as input. You invoke the table function as the operand of the table operator in the FROM list of a SELECT statement.
plan evolution
The manual change of an unaccepted plan in the SQL plan history into an accepted plan in the SQL plan baseline.
plan generator
The part of the optimizer that tries different access paths, join methods, and join orders for a given query block to find the plan with the lowest cost.
plan selection
The optimizer's attempt to find a matching plan in the SQL plan baseline for a statement after performing a hard parse.
plan verification
Comparing the performance of an unaccepted plan to a plan in a SQL plan baseline and ensuring that it performs better.
popular value
In a histogram, any value that spans two or more endpoints. Any value that is not popular is a nonpopular value.
predicate pushing
A transformation technique in which the optimizer "pushes" the relevant predicates from the containing query block into the view query block. For views that are not merged, this technique improves the subplan of the unmerged view because the database can use the pushed-in predicates to access indexes or to use as filters.
private SQL area
An area in memory that holds a parsed statement and other information for processing. The private SQL area contains data such as bind variable values, query execution state information, and query execution work areas.
proactive SQL tuning
Using SQL tuning tools to identify SQL statements that are candidates for tuning before users have complained about a performance problem.
projection view
An optimizer-generated view that appears in queries in which a DISTINCT view has been merged, or a GROUP BY view is merged into an outer query block that also contains GROUP BY, HAVING, or aggregates.
query
An operation that retrieves data from tables or views. For example, SELECT * FROM employees is a query.
reactive SQL tuning
Diagnosing and fixing SQL-related performance problems after users have complained about them.
recursive SQL
Additional SQL statements that the database must issue to execute a SQL statement issued by a user. The generation of recursive SQL is known as a recursive call. For example, the database generates recursive calls when data dictionary information is not available in memory and so must be retrieved from disk.
repeatable SQL statement
A statement that the database parses or executes after recognizing that it is tracked in the SQL statement log.
right join tree
A join tree in which the right input of every join is the result of a previous join.
right table
In an outer join, the table specified on the right side of the OUTER JOIN keywords (in ANSI SQL syntax).
row source
An iterative control structure that processes a set of rows in an iterated manner and produces a row set.
row source generator
Software that receives the optimal plan from the optimizer and outputs the execution plan for the SQL statement.
row source tree
A collection of row sources produced by the row source generator. The row source tree for a SQL statement shows information such as table order, access methods, join methods, and data operations such as filters and sorts.
sample table scan
A scan that retrieves a random sample of data from a simple table or a complex SELECT statement, such as a statement involving joins and views.
selectivity
A value indicating the proportion of a row set retrieved by a predicate or combination of predicates, for example, WHERE last_name = 'Smith'. A selectivity of 0 means that no rows pass the predicate test, whereas a value of 1 means that all rows pass the test.

The adjective selective means roughly "choosy." Thus, a highly selective query returns a low proportion of rows (selectivity close to 0), whereas an unselective query returns a high proportion of rows (selectivity close to 1).
semijoin
A join that returns rows from the first table when at least one match exists in the second table. For example, you list departments with at least one employee. The difference between a semijoin and a conventional join is that rows in the first table are returned at most once. Semijoins use the EXISTS or IN constructs.
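For example, the following illustrative query uses EXISTS to return each department with at least one employee exactly once:

SELECT department_id, department_name
FROM   departments d
WHERE  EXISTS
  (SELECT NULL FROM employees e WHERE e.department_id = d.department_id);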
shared SQL area
An area in the shared pool that contains the parse tree and execution plan for a SQL statement. Only one shared SQL area exists for a unique statement. The shared SQL area is sometimes referred to as the cursor cache.
simple database operation
A database operation consisting of a single SQL statement or PL/SQL procedure or function.
simple view merging
The merging of select-project-join views. For example, a query joins the employees table to a subquery that joins the departments and locations tables.
soft parse
Any parse that is not a hard parse. If a submitted SQL statement is the same as a reusable SQL statement in the shared pool, then Oracle Database reuses the existing code. This reuse of code is also called a library cache hit.
sort merge join
A type of join method. The join consists of a sort join, in which both inputs are sorted on the join key, followed by a merge join, in which the sorted lists are merged.
SQL Access Advisor
SQL Access Advisor is internal diagnostic software that recommends which materialized views, indexes, and materialized view logs to create, drop, or retain.
SQL compilation
In the context of Oracle SQL processing, this term refers collectively to the phases of parsing, optimization, and plan generation.
SQL handle
A string value derived from the numeric SQL signature. Like the signature, the handle uniquely identifies a SQL statement. It serves as a SQL search key in user APIs.
SQL ID
For a specific SQL statement, the unique identifier of the parent cursor in the library cache. A hash function applied to the text of the SQL statement generates the SQL ID. The V$SQL.SQL_ID column displays the SQL ID.
SQL incident
In the fault diagnosability infrastructure of Oracle Database, a single occurrence of a SQL-related problem. When a problem (critical error) occurs multiple times, the database creates an incident for each occurrence. Incidents are timestamped and tracked in the Automatic Diagnostic Repository (ADR).
SQL management base (SMB)
A logical repository that stores statement logs, plan histories, SQL plan baselines, and SQL profiles. The SMB is part of the data dictionary and resides in the SYSAUX tablespace.
SQL plan baseline
A set of one or more accepted plans for a repeatable SQL statement. Each accepted plan contains a set of hints, a plan hash value, and other plan-related information. SQL plan management uses SQL plan baselines to record and evaluate the execution plans of SQL statements over time.
SQL plan capture
Techniques for capturing and storing relevant information about plans in the SQL management base (SMB) for a set of SQL statements. Capturing a plan means making SQL plan management aware of this plan.
SQL plan directive
Additional information and instructions that the optimizer can use to generate a more optimal plan. For example, a SQL plan directive might instruct the optimizer to collect missing statistics or gather dynamic statistics. The optimizer collects SQL plan directives on query expressions rather than at the statement level. In this way, the directives are usable for multiple SQL statements.
SQL plan history
The set of plans generated for a repeatable SQL statement over time. The history contains both SQL plan baselines and unaccepted plans.
SQL plan management
SQL plan management is a preventative mechanism that records and evaluates the execution plans of SQL statements over time. SQL plan management can prevent SQL plan regressions caused by environmental changes such as a new optimizer version, changes to optimizer statistics, system settings, and so on.
SQL processing
The stages of parsing, optimization, row source generation, and execution of a SQL statement.
SQL profile
A set of auxiliary information built during automatic tuning of a SQL statement. A SQL profile is to a SQL statement what statistics are to a table. The optimizer can use SQL profiles to improve cardinality and selectivity estimates, which in turn leads the optimizer to select better plans.
SQL signature
A numeric hash value computed using a SQL statement text that has been normalized for case insensitivity and white space. It uniquely identifies a SQL statement. The database uses this signature as a key to maintain SQL management objects such as SQL profiles, SQL plan baselines, and SQL patches.
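For example, the DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE function computes the signature for a given statement text (a minimal sketch; the numeric value returned varies):

SELECT DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(
         'SELECT * FROM employees WHERE last_name LIKE :1') AS signature
  FROM DUAL;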
SQL statement log
When automatic SQL plan capture is enabled, a log that contains the SQL ID of SQL statements that the optimizer has evaluated over time. A statement is tracked when it exists in the log.
SQL test case
A problematic SQL statement and related information needed to reproduce the execution plan in a different environment. A SQL test case is stored in an Oracle Data Pump file.
SQL test case builder
A database feature that gathers information related to a SQL statement and packages it so that a user can reproduce the problem on a different database. The DBMS_SQLDIAG package is the interface for SQL test case builder.
SQL trace file
A server-generated file that provides performance information on individual SQL statements. For example, the trace file contains parse, execute, and fetch counts, CPU and elapsed times, physical reads and logical reads, and misses in the library cache.
SQL tuning set (STS)
A database object that includes one or more SQL statements along with their execution statistics and execution context.
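For example, an empty SQL tuning set might be created with the DBMS_SQLTUNE package before statements are loaded into it (a sketch; the set name is hypothetical):

BEGIN
  DBMS_SQLTUNE.CREATE_SQLSET(
    sqlset_name => 'my_sts',
    description => 'Statements captured for tuning experiments');
END;
/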
star schema
A relational schema whose design represents a dimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys.
statistics feedback
A form of automatic reoptimization that automatically improves plans for repeated queries that have cardinality misestimates. The optimizer may estimate cardinalities incorrectly for many reasons, such as missing statistics, inaccurate statistics, or complex predicates.
stored outline
A set of hints for a SQL statement. The hints in stored outlines direct the optimizer to choose a specific plan for the statement.
subplan
A portion of an adaptive plan that the optimizer can switch to as an alternative at run time. A subplan can consist of multiple operations in the plan. For example, the optimizer can treat a join method and the corresponding access path as one unit when determining whether to change the plan at run time.
subquery
A query nested within another SQL statement. Unlike implicit queries, subqueries use a SELECT statement to retrieve data.
subquery unnesting
A transformation technique in which the optimizer transforms a nested query into an equivalent join statement, and then optimizes the join.
synopsis
A set of auxiliary statistics gathered on a partitioned table when the INCREMENTAL value is set to true.
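For example, the INCREMENTAL preference might be set for a partitioned table as follows (a sketch, assuming the sh.sales sample table):

BEGIN
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => 'SH',
    tabname => 'SALES',
    pname   => 'INCREMENTAL',
    pvalue  => 'TRUE');
END;
/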
system statistics
Statistics that enable the optimizer to use CPU and I/O characteristics of the system, such as CPU speed, I/O seek time, and I/O transfer speed.
table cluster
A schema object that contains data from one or more tables, all of which have one or more columns in common. In table clusters, the database stores together all the rows from all tables that share the same cluster key.
table expansion
A transformation technique that enables the optimizer to generate a plan that uses indexes on the read-mostly portion of a partitioned table, but not on the active portion of the table.
table statistics
Statistics about tables that the optimizer uses to determine table access cost, join cardinality, join order, and so on. Table statistics include row counts, block counts, empty blocks, average free space per block, number of chained rows, average row length, and staleness of the statistics on the table.
top frequency histogram
A variation of a frequency histogram that ignores nonpopular values that are statistically insignificant, thus producing a better histogram for highly popular values.
tuning mode
One of the two optimizer modes. When running in tuning mode, the optimizer is known as the Automatic Tuning Optimizer. In tuning mode, the optimizer determines whether it can further improve the plan produced in normal mode. The optimizer output is not an execution plan, but a series of actions, along with their rationale and expected benefit for producing a significantly better plan.
unaccepted plan
A plan for a statement that is in the SQL plan history but has not been added to the SQL plan baseline.
unselective
A relatively large fraction of rows from a row set. A query becomes more unselective as the selectivity approaches 1. For example, a query that returns 999,999 rows from a table with one million rows is unselective. A query of the same table that returns one row is selective.
user response time
The time between when a user submits a command and receives a response.
See throughput.
vector I/O
A type of I/O in which the database obtains a set of rowids and sends them batched in an array to the operating system, which performs the read.
view merging
The merging of a query block representing a view into the query block that contains it. View merging can improve plans by enabling the optimizer to consider additional join orders, access methods, and other transformations.
This chapter explains how Oracle Database processes SQL statements. Specifically, the chapter explains the way in which the database processes DDL statements to create objects, DML to modify data, and queries to retrieve data.
This chapter contains the following topics:
SQL processing is the parsing, optimization, row source generation, and execution of a SQL statement. Depending on the statement, the database may omit some of these stages. Figure 3-1 depicts the general stages of SQL processing.
As shown in Figure 3-1, the first stage of SQL processing is parsing. This stage involves separating the pieces of a SQL statement into a data structure that other routines can process. The database parses a statement when instructed by the application, which means that only the application, and not the database itself, can reduce the number of parses.
When an application issues a SQL statement, the application makes a parse call to the database to prepare the statement for execution. The parse call opens or creates a cursor, which is a handle for the session-specific private SQL area that holds a parsed SQL statement and other processing information. The cursor and private SQL area are in the program global area (PGA).
During the parse call, the database performs the following checks:
Syntax check
Semantic check
Shared pool check
The preceding checks identify the errors that can be found before statement execution. Some errors cannot be caught by parsing. For example, the database can encounter deadlocks or errors in data conversion only during statement execution.
Oracle Database must check each SQL statement for syntactic validity. A statement that breaks a rule for well-formed SQL syntax fails the check. For example, the following statement fails because the keyword FROM is misspelled as FORM:
SQL> SELECT * FORM employees;
SELECT * FORM employees
         *
ERROR at line 1:
ORA-00923: FROM keyword not found where expected
The semantics of a statement are its meaning. Thus, a semantic check determines whether a statement is meaningful, for example, whether the objects and columns in the statement exist. A syntactically correct statement can fail a semantic check, as shown in the following example of a query of a nonexistent table:
SQL> SELECT * FROM nonexistent_table;
SELECT * FROM nonexistent_table
              *
ERROR at line 1:
ORA-00942: table or view does not exist
During the parse, the database performs a shared pool check to determine whether it can skip resource-intensive steps of statement processing. To this end, the database uses a hashing algorithm to generate a hash value for every SQL statement. The statement hash value is the SQL ID shown in V$SQL.SQL_ID. This hash value is deterministic within a version of Oracle Database, so the same statement in a single instance or in different instances has the same SQL ID.
When a user submits a SQL statement, the database searches the shared SQL area to see if an existing parsed statement has the same hash value. The hash value of a SQL statement is distinct from the following values:
Memory address for the statement
Oracle Database uses the SQL ID to perform a keyed read in a lookup table. In this way, the database obtains possible memory addresses of the statement.
Hash value of an execution plan for the statement
A SQL statement can have multiple plans in the shared pool. Typically, each plan has a different hash value. If the same SQL ID has multiple plan hash values, then the database knows that multiple plans exist for this SQL ID.
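For example, a query of V$SQL such as the following (a sketch; the filter text is illustrative) shows the child cursors and plan hash values that share one SQL ID:

SELECT sql_id, child_number, plan_hash_value
  FROM v$sql
 WHERE sql_text LIKE 'SELECT * FROM sh.sales%'
 ORDER BY sql_id, child_number;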
Parse operations fall into the following categories, depending on the type of statement submitted and the result of the hash check:
If Oracle Database cannot reuse existing code, then it must build a new executable version of the application code. This operation is known as a hard parse, or a library cache miss.
Note: The database always performs a hard parse of DDL.
During the hard parse, the database accesses the library cache and data dictionary cache numerous times to check the data dictionary. When the database accesses these areas, it uses a serialization device called a latch on required objects so that their definition does not change. Latch contention increases statement execution time and decreases concurrency.
A soft parse is any parse that is not a hard parse. If the submitted statement is the same as a reusable SQL statement in the shared pool, then Oracle Database reuses the existing code. This reuse of code is also called a library cache hit.
Soft parses can vary in how much work they perform. For example, configuring the session shared SQL area can sometimes reduce the amount of latching in the soft parses, making them "softer."
In general, a soft parse is preferable to a hard parse because the database skips the optimization and row source generation steps, proceeding straight to execution.
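For example, session cursor caching, controlled by the SESSION_CACHED_CURSORS initialization parameter, is one way to make repeated soft parses cheaper (a sketch; the appropriate value depends on the application):

ALTER SESSION SET SESSION_CACHED_CURSORS = 50;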
Figure 3-2 is a simplified representation of a shared pool check of an UPDATE statement in a dedicated server architecture.
If a check determines that a statement in the shared pool has the same hash value, then the database performs semantic and environment checks to determine whether the statements have the same meaning. Identical syntax is not sufficient. For example, suppose two different users log in to the database and issue the following SQL statements:
CREATE TABLE my_table ( some_col INTEGER ); SELECT * FROM my_table;
The SELECT statements for the two users are syntactically identical, but two separate schema objects are named my_table. This semantic difference means that the second statement cannot reuse the code for the first statement.
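In such a case, a query like the following (a sketch; output depends on your environment) would typically show separate child cursors for the same statement text, distinguished by the parsing schema:

SELECT sql_id, child_number, parsing_schema_name
  FROM v$sql
 WHERE sql_text = 'SELECT * FROM my_table';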
Even if two statements are semantically identical, an environmental difference can force a hard parse. In this context, the optimizer environment is the totality of session settings that can affect execution plan generation, such as the work area size or optimizer settings (for example, the optimizer mode). Consider the following series of SQL statements executed by a single user:
ALTER SESSION SET OPTIMIZER_MODE=ALL_ROWS;
ALTER SYSTEM FLUSH SHARED_POOL;              # optimizer environment 1
SELECT * FROM sh.sales;

ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS; # optimizer environment 2
SELECT * FROM sh.sales;

ALTER SESSION SET SQL_TRACE=true;            # optimizer environment 3
SELECT * FROM sh.sales;
In the preceding example, the same SELECT statement is executed in three different optimizer environments. Consequently, the database creates three separate shared SQL areas for these statements and forces a hard parse of each statement.
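The distinct environments are visible in the OPTIMIZER_ENV_HASH_VALUE column of V$SQL (a minimal sketch; the values vary by session and database version):

SELECT child_number, optimizer_env_hash_value, plan_hash_value
  FROM v$sql
 WHERE sql_text = 'SELECT * FROM sh.sales';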
During the optimization stage, Oracle Database must perform a hard parse at least once for every unique DML statement and performs the optimization during this parse. The database never optimizes DDL unless it includes a DML component such as a subquery that requires optimization. Chapter 4, "Query Optimizer Concepts" explains the optimization process in more detail.
The row source generator is software that receives the optimal execution plan from the optimizer and produces an iterative execution plan that is usable by the rest of the database. The iterative plan is a binary program that, when executed by the SQL engine, produces the result set.
The execution plan takes the form of a combination of steps. Each step returns a row set. The next step either uses the rows in this set, or the last step returns the rows to the application issuing the SQL statement.
A row source is a row set returned by a step in the execution plan along with a control structure that can iteratively process the rows. The row source can be a table, view, or result of a join or grouping operation.
The row source generator produces a row source tree, which is a collection of row sources. The row source tree shows the following information:
An ordering of the tables referenced by the statement
An access method for each table mentioned in the statement
A join method for tables affected by join operations in the statement
Data operations such as filter, sort, or aggregation
Example 3-1 shows the execution plan of a SELECT statement when AUTOTRACE is enabled. The statement selects the last name, job title, and department name for all employees whose last names begin with the letter A. The execution plan for this statement is the output of the row source generator.
Example 3-1 Execution Plan
SELECT e.last_name, j.job_title, d.department_name
FROM   hr.employees e, hr.departments d, hr.jobs j
WHERE  e.department_id = d.department_id
AND    e.job_id = j.job_id
AND    e.last_name LIKE 'A%';

Execution Plan
----------------------------------------------------------
Plan hash value: 975837011

---------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |     3 |   189 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN                    |             |     3 |   189 |     7  (15)| 00:00:01 |
|*  2 |   HASH JOIN                   |             |     3 |   141 |     5  (20)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |     3 |    60 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | EMP_NAME_IX |     3 |       |     1   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL          | JOBS        |    19 |   513 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS FULL           | DEPARTMENTS |    27 |   432 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
   2 - access("E"."JOB_ID"="J"."JOB_ID")
   4 - access("E"."LAST_NAME" LIKE 'A%')
       filter("E"."LAST_NAME" LIKE 'A%')
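A plan like the one in Example 3-1 can also be displayed without AUTOTRACE by using EXPLAIN PLAN with the DBMS_XPLAN package (a sketch; the exact plan produced depends on optimizer statistics and database version):

EXPLAIN PLAN FOR
  SELECT e.last_name, j.job_title, d.department_name
    FROM hr.employees e, hr.departments d, hr.jobs j
   WHERE e.department_id = d.department_id
     AND e.job_id = j.job_id
     AND e.last_name LIKE 'A%';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);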
During execution, the SQL engine executes each row source in the tree produced by the row source generator. This step is the only mandatory step in DML processing.
Figure 3-3 is an execution tree, also called a parse tree, that shows the flow of row sources from one step to another in the plan in Example 3-1. In general, the order of the steps in execution is the reverse of the order in the plan, so you read the plan from the bottom up.
Each step in an execution plan has an ID number. The numbers in Figure 3-3 correspond to the Id column in the plan shown in Example 3-1. Initial spaces in the Operation column of the plan indicate hierarchical relationships. For example, if the name of an operation is preceded by two spaces, then this operation is a child of an operation preceded by one space. Operations preceded by one space are children of the SELECT statement itself.
In Figure 3-3, each node of the tree acts as a row source, which means that each step of the execution plan in Example 3-1 either retrieves rows from the database or accepts rows from one or more row sources as input. The SQL engine executes each row source as follows:
Steps indicated by the black boxes physically retrieve data from an object in the database. These steps are the access paths, or techniques for retrieving data from the database.
Step 6 uses a full table scan to retrieve all rows from the departments table.
Step 5 uses a full table scan to retrieve all rows from the jobs table.
Step 4 scans the emp_name_ix index in order, looking for each key that begins with the letter A and retrieving the corresponding rowid. For example, the rowid corresponding to Atkinson is AAAPzRAAFAAAABSAAe.
Step 3 retrieves from the employees table the rows whose rowids were returned by Step 4. For example, the database uses rowid AAAPzRAAFAAAABSAAe to retrieve the row for Atkinson.
Steps indicated by the clear boxes operate on row sources.
Step 2 performs a hash join, accepting row sources from Steps 3 and 5, joining each row from the Step 5 row source to its corresponding row in Step 3, and returning the resulting rows to Step 1.
For example, the row for employee Atkinson is associated with the job name Stock Clerk.
Step 1 performs another hash join, accepting row sources from Steps 2 and 6, joining each row from the Step 6 source to its corresponding row in Step 2, and returning the result to the client.
For example, the row for employee Atkinson is associated with the department named Shipping.
In some execution plans the steps are iterative and in others sequential. The hash join shown in Example 3-1 is sequential. The database completes the steps in their entirety based on the join order. The database starts with the index range scan of emp_name_ix. Using the rowids that it retrieves from the index, the database reads the matching rows in the employees table, and then scans the jobs table. After it retrieves the rows from the jobs table, the database performs the hash join.
During execution, the database reads the data from disk into memory if the data is not in memory. The database also takes out any locks and latches necessary to ensure data integrity and logs any changes made during the SQL execution. The final stage of processing a SQL statement is closing the cursor.
Most DML statements have a query component. In a query, execution of a cursor places the results of the query into a set of rows called the result set.
Result set rows can be fetched either a row at a time or in groups. In the fetch stage, the database selects rows and, if requested by the query, orders the rows. Each successive fetch retrieves another row of the result until the last row has been fetched.
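For example, a PL/SQL block can fetch the result set in groups rather than one row at a time (a minimal sketch against the hr sample schema; the batch size of 100 is arbitrary):

DECLARE
  TYPE name_tab IS TABLE OF hr.employees.last_name%TYPE;
  l_names name_tab;
  CURSOR c IS SELECT last_name FROM hr.employees ORDER BY last_name;
BEGIN
  OPEN c;
  LOOP
    -- Each fetch retrieves up to 100 rows of the ordered result set
    FETCH c BULK COLLECT INTO l_names LIMIT 100;
    EXIT WHEN l_names.COUNT = 0;
    -- Process the current batch of rows here
  END LOOP;
  CLOSE c;
END;
/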
In general, the database cannot determine for certain the number of rows to be retrieved by a query until the last row is fetched. Oracle Database retrieves the data in response to fetch calls, so that the more rows the database reads, the more work it performs. For some queries the database returns the first row as quickly as possible, whereas for others it creates the entire result set before returning the first row.
In general, a query retrieves data by using the Oracle Database read consistency mechanism. This mechanism, which uses undo data to show past versions of data, guarantees that all data blocks read by a query are consistent to a single point in time.
For an example of read consistency, suppose a query must read 100 data blocks in a full table scan. The query processes the first 10 blocks while DML in a different session modifies block 75. When the first session reaches block 75, it realizes the change and uses undo data to retrieve the old, unmodified version of the data and construct a noncurrent version of block 75 in memory.
DML statements that must change data use the read consistency mechanism to retrieve only the data that matched the search criteria when the modification began. Afterward, these statements retrieve the data blocks as they exist in their current state and make the required modifications. The database must perform other actions related to the modification of the data such as generating redo and undo data.
Oracle Database processes DDL differently from DML. For example, when you create a table, the database does not optimize the CREATE TABLE statement. Instead, Oracle Database parses the DDL statement and carries out the command.
The database processes DDL differently because it is a means of defining an object in the data dictionary. Typically, Oracle Database must parse and execute many recursive SQL statements to execute a DDL statement. Suppose you create a table as follows:
CREATE TABLE mytable (mycolumn INTEGER);
Typically, the database would run dozens of recursive statements to execute the preceding statement. The recursive SQL would perform actions such as the following:
Issue a COMMIT before executing the CREATE TABLE statement
Verify that user privileges are sufficient to create the table
Determine which tablespace the table should reside in
Ensure that the tablespace quota has not been exceeded
Ensure that no object in the schema has the same name
Insert rows that define the table into the data dictionary
Issue a COMMIT if the DDL statement succeeded or a ROLLBACK if it did not
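One way to observe this recursive SQL is to trace the session around the DDL and then format the resulting trace file with the tkprof utility, which reports recursive statements separately (a sketch; trace file names and locations vary by configuration):

ALTER SESSION SET SQL_TRACE = TRUE;

CREATE TABLE mytable (mycolumn INTEGER);

ALTER SESSION SET SQL_TRACE = FALSE;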
See Also: Oracle Database Development Guide to learn about processing DDL, transaction control, and other types of statements
This manual explains how to tune Oracle SQL.
This preface contains the following topics:
This document is intended for database administrators and application developers who perform the following tasks:
Generating and interpreting SQL execution plans
Managing optimizer statistics
Influencing the optimizer through initialization parameters or SQL hints
Controlling cursor sharing for SQL statements
Monitoring SQL execution
Performing application tracing
Managing SQL tuning sets
Using SQL Tuning Advisor or SQL Access Advisor
Managing SQL profiles
Managing SQL plan baselines
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Access to Oracle Support
Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
This manual assumes that you are familiar with the following documents:
To learn how to tune data warehouse environments, see Oracle Database Data Warehousing Guide.
Many examples in this book use the sample schemas, which are installed by default when you select the Basic Installation option with an Oracle Database. See Oracle Database Sample Schemas for information on how these schemas were created and how you can use them.
To learn about Oracle Database error messages, see Oracle Database Error Messages Reference. Oracle Database error message documentation is only available in HTML. If you are accessing the error message documentation on the Oracle Documentation CD, then you can browse the error messages by range. After you find the specific range, use your browser's find feature to locate the specific message. When connected to the Internet, you can search for a specific error message using the error message search feature of the Oracle online documentation.
The following text conventions are used in this document:
Convention | Meaning |
---|---|
boldface | Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary. |
italic | Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values. |
monospace | Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter. |