What is Database Tuning?

There are various ways to tune a system, one of which is database tuning. Below is a brief explanation of database tuning.

1. Understanding Database Tuning

According to Ramakrishnan and Gehrke (2005, p650), database tuning is the process of improving the performance of a physical database design, including its relations and views, according to user needs.

2. Purpose of Database Tuning

The goals of database tuning are:

  • Reduce the system's response time to the end user.
  • Reduce the resources required to process data in the database.

General steps for database tuning. There are three general approaches to database tuning, namely:

2.1 Reducing working time through greater efficiency

This is done through query optimization: replacing a query with an equivalent one that returns the same result while using fewer resources and executing faster.
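As a sketch of such a rewrite (the table and column names here are illustrative, not from the source), a DISTINCT join can often be replaced by an equivalent EXISTS query that returns the same rows with less work:

```sql
-- Before: join plus DISTINCT to remove the duplicates the join creates
SELECT DISTINCT c.cust_name
FROM customers c, orders o
WHERE c.customer_id = o.customer_id;

-- After: same result as a semi-join; Oracle can stop at the first
-- matching order per customer instead of joining every row
SELECT c.cust_name
FROM customers c
WHERE EXISTS (SELECT 1
              FROM orders o
              WHERE o.customer_id = c.customer_id);
```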

2.2 Balancing work time

Systems generally work harder during the day than at night: transaction processing runs during the day, while less urgent work (for example, backups) runs at night. It is therefore worth dividing the system's tasks between day and night, which reduces the resources required during the day.
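One way to shift less urgent work to the night in Oracle is the DBMS_SCHEDULER package. A minimal sketch, assuming a backup procedure named admin_pkg.run_backup (hypothetical):

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'NIGHTLY_BACKUP',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'admin_pkg.run_backup',  -- hypothetical backup procedure
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',   -- run daily at 02:00, off-peak
    enabled         => TRUE);
END;
/
```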

2.3 Creating parallel working hours

Queries that access large amounts of data can usually be parallelized. This is very useful for reducing response time on data that is rarely accessed concurrently (as in a data warehouse). In OLTP systems, however, care is needed where data has very high concurrent access, because parallelism there can increase resource usage, which in turn increases the program's response time.
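In Oracle, a large scan can be parallelized with the PARALLEL hint; a minimal sketch, assuming a large sales table (names illustrative):

```sql
-- ask the optimizer to scan sales with a degree of parallelism of 4
SELECT /*+ PARALLEL(s, 4) */ product_id, SUM(amount)
FROM sales s
GROUP BY product_id;
```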

3. Various ways to improve the performance of Oracle Database

3.1 Changing the database design

Slow system performance is often caused by poor database design. Designers usually normalize a schema into 3NF, but 3NF can make some database accesses slower; such accesses can be optimized by denormalizing to improve access performance.

3.2 Performing SQL Tuning

Experience shows that 80% of database performance problems can be solved by writing SQL optimally, so that the server can execute queries quickly without wasting resources.

3.3 Memory Tuning

Memory tuning means setting the right buffer sizes (watching wait times, buffer hit ratios, and system swapping and paging). The database can also be optimized by pinning large, frequently used objects into memory. Pinning prevents such objects from being repeatedly aged out and reloaded; without it, extra resources are consumed each time an object is brought back into memory.
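In Oracle, pinning is done with the DBMS_SHARED_POOL package; a minimal sketch, with a hypothetical owner and package name:

```sql
-- pin a large, frequently called package in the shared pool
-- ('P' marks the object as a package/procedure/function)
EXECUTE DBMS_SHARED_POOL.KEEP('APP_OWNER.BIG_PKG', 'P');
```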

3.4 Disk I/O Tuning

Database files need to be sized and placed properly in the system. The harder a file is to access and the larger it is, the more I/O resources it consumes. The thing to note is how we organize the location of database files so that they are easy to access, which ultimately reduces I/O resource usage.

3.5 Removing Database Contention

Database contention concerns the locks, latches, and wait events that occur in the database. To remove contention, you must study these three areas and determine whether they can be eliminated to improve the performance of your database.

3.6 Optimizing the Operating System

This involves monitoring and optimizing CPU, I/O, and memory usage, for example by overclocking the CPU or giving the Oracle database higher priority so that Oracle can use more CPU or memory for its work.

4. Average Active Session

An active session is a session that is currently making a call to the database (i.e., not idle). Figure 2.14 shows 4 users, each accessing the database at different times. In the first part of Figure 2.14, only user 4 makes a call to the database; in the second part, users 4 and 3 do; and in the third part, users 4, 3, and 1 do. So in the first part there is 1 active session (user 4), in the second part there are 2 active sessions, in the third part there are 3, and so on.

Figure 2.14 Definition of Active Session

A database call session consists of several activities, which can generally be grouped into three categories: CPU, I/O, and wait, as shown in Figure 2.15 below:

Figure 2.15 Division of components of a call session to the database

In general, Oracle samples each session every second, recording the activity the session is performing, as illustrated in Figure 2.16 below:

Figure 2.16 Illustration of activity from 4 sessions during a certain time period taken by Oracle per second.

To present this graphically, Oracle combines the results into a bar chart, as shown in Figure 2.17 below:

Figure 2.17 Bar chart to illustrate the activity of active sessions.

However, if each session is drawn at one-second resolution, the resulting chart is unreadable because the bars are too thin, as in Figure 2.18:

Figure 2.18 Bar chart to illustrate the activity of active sessions (per second)

So, to present this legibly, Oracle takes several samples and averages them to obtain the Average Active Sessions, which is much easier to read, as in Figure 2.19.

Figure 2.19 Bar Chart of Average Active Session (averaged over 15 seconds)

In Oracle, the graph is drawn as in Figure 2.20. This bar chart can be seen in Oracle Enterprise Manager on the Performance tab, which shows the Average Active Sessions (the average number of active sessions, averaged over 15 seconds by default).

Figure 2.20 Image of Average Active Session in Oracle Enterprise Manager

From the Average Active Sessions chart shown by Oracle, we can see how the database performs over a period of time, and we can clearly see what kind of work the sessions spend most of their time on.

SQL Tuning Techniques

1. Understanding SQL Tuning

According to Immanuel Chan (2008, p11-1), SQL tuning is the process of optimizing SQL statements and choosing indexing techniques so that the SQL performs optimally.

2. SQL Tuning Steps

The first step in SQL tuning is to identify high-load SQL statements: queries that burden the server, slow down database performance, and consume large amounts of system resources. Even on a well-optimized database, non-optimal SQL will cause performance to degrade. Identifying these queries is therefore a key activity in the SQL tuning process, and Oracle 10g has automated it with the ADDM (Automatic Database Diagnostic Monitor) feature in Oracle Enterprise Manager. There are 2 ways to identify high-load SQL statements, namely ADDM and Top SQL.

a. Identify High-Load SQL Statements using ADDM

By default, ADDM runs automatically every hour and analyzes data collected from the AWR (Automatic Workload Repository) to identify database performance issues through snapshots. This includes High-Load SQL Statements. When there is a database performance issue, ADDM displays the issue on the ADDM page and provides recommendations for each issue found. For example, when a High-Load SQL Statement is found, ADDM recommends using the SQL Tuning Advisor for that SQL statement.

b. Identify High-Load SQL Statements using Top SQL

In addition to ADDM, Oracle also provides a feature to view High-Load SQL Statements: the Top Activity page in Enterprise Manager.

On this page we can see High-Load SQL Statements, shown in Figure 2.21 below:

Figure 2.21 Top Activity page in Enterprise Manager

In Enterprise Manager, click Top Activity on the Performance tab. The Top Activity page shows the activity running on the database over a 1-hour period. The Top Activity section displays a chart of database performance measured in average active sessions, broken down by activity category, with a color legend for the activities on the right side of the chart. Below the chart, the Top SQL table lists the queries that burden the server across all categories. To see the details of a query, click its SQL ID in the Top SQL table; the full query text for that Top SQL entry will appear. To see the SQL statements in a particular category, click the desired category; an Active Sessions chart for that activity will appear, and the Top SQL list will be updated according to the refresh interval selected by the user.

3. Restructuring SQL Commands

According to Immanuel Chan (2008, p11-7 to 11-17), the following are several techniques for optimizing queries:

Use Equijoin

Use an equijoin whenever possible: an equijoin on untransformed columns improves database performance and simplifies the tuning process.
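A minimal sketch of an equijoin on untransformed columns (table and column names illustrative, not from the source):

```sql
-- plain equality between the raw join columns; indexes on either
-- side remain usable by the optimizer
SELECT o.order_no, c.cust_name
FROM orders o, customers c
WHERE o.customer_id = c.customer_id;
```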

Use untransformed columns in the WHERE clause

Always try to use untransformed columns (not assigned functions such as substr, to_char, to_number, etc.) in the WHERE clause.

Example of where clause with untransformed columns:

WHERE a.order_no = b.order_no

compared to this.

Example of where clause with transformed columns:

WHERE TO_NUMBER (SUBSTR(a.order_no, INSTR(b.order_no, '.') - 1))
    = TO_NUMBER (SUBSTR(b.order_no, INSTR(b.order_no, '.') - 1))

Using SQL functions on a column in the WHERE clause forces the computer to spend extra resources evaluating the function, and it also prevents the optimizer from using any index on that column, so performance drops on both counts. If you must keep a function on a column in the WHERE clause and still want an index to be used, create a function-based index so that the optimizer can continue to use an index during query execution.
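For example, if the transformed comparison above must stay in the WHERE clause, a function-based index on the same expression keeps an index path available (the index name is illustrative):

```sql
-- index the exact expression used in the WHERE clause so the
-- optimizer can still use an index instead of a full table scan
CREATE INDEX ord_no_fbi
ON orders (TO_NUMBER(SUBSTR(order_no, INSTR(order_no, '.') - 1)));
```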

Be careful with data types, both explicit and implicit.

Oracle implicitly performs extra processing when a condition compares values of different data types: Oracle silently converts one side using a conversion function.

Example of a clause where there is an implicit conversion

AND charcol = numexpr

At first glance, the clause above looks correct because the column is not wrapped in a function, but charcol has a VARCHAR data type while numexpr is numeric. In this case, Oracle implicitly rewrites the clause as follows:

Implicit conversion performed by Oracle

AND TO_NUMBER (charcol) = numexpr

This, of course, prevents the optimizer from using the index and requires extra processing to run the conversion function.
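The usual fix is to convert the other side of the comparison explicitly, so the indexed column stays untransformed (a sketch of the same clause):

```sql
-- convert the numeric expression, not the indexed character column;
-- the index on charcol remains usable
AND charcol = TO_CHAR(numexpr)
```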

Don't treat SQL like a procedural language

SQL is not a procedural language. If one SQL statement is written to serve many different tasks, the result will be suboptimal for each particular task, so it is better to use a separate statement for each function than one statement for many functions. Figure 2.26 shows an example of such an over-general query:

A complex query that prevents the index from being used

SELECT info FROM tabels
WHERE ... AND somecolumn BETWEEN
DECODE (:loval, 'ALL', somecolumn, :loval)
AND DECODE (:hival, 'ALL', somecolumn, :hival);

The index on the somecolumn column will not be used by the optimizer in the query in Figure 2.26, because the expression uses the same column on both sides of BETWEEN. To let the index be used, the query can be rewritten with UNION ALL, as in Figure 2.27:

Optimized query using union all so that the index can be used by the optimizer

SELECT /* change this half of UNION ALL if other half changes */ info
FROM tabels
WHERE ... AND somecolumn BETWEEN :loval AND :hival
AND (:hival != 'ALL' AND :loval != 'ALL')
UNION ALL
SELECT /* change this half of UNION ALL if other half changes */ info
FROM tabels
WHERE ... AND (:hival = 'ALL' OR :loval = 'ALL');

Using EXISTS versus IN for subqueries

Use IN when the selective predicate is in the subquery, and use EXISTS when the selective predicate is in the parent query; each has its own advantages.

Here is an example that demonstrates the advantages of IN and EXISTS. Both examples use the same schema with the following characteristics:

  • There is a unique index on employees.employee_id.
  • There is an index on the orders.customer_id column.
  • There is an index on the employees.department_id column.
  • The employees table has 27,000 rows.
  • The orders table has 10,000 rows.
  • The HR and OE schemas, both analyzed with COMPUTE.

Example 1: use of IN -- selective predicate is inside a subquery.

This query aims to find all employees who have handled orders for customer ID 144.

Figure 2.28 shows this search written with the EXISTS command.

SQL example using the EXISTS command

SELECT e.employee_id, e.first_name, e.last_name, e.salary
FROM employees e
WHERE EXISTS (SELECT 1 FROM orders o              /* Note 1 */
              WHERE e.employee_id = o.sales_rep_id /* Note 2 */
              AND o.customer_id = 144);            /* Note 3 */

*) Information:

  • Note 1 shows the use of EXISTS.
  • Note 2 shows that this is a correlated subquery.
  • Note 3 shows that the selective predicate is inside the correlated subquery.

Table 2.1 Execution plan with EXISTS usage

| ID | OPERATION        | OPTIONS        | OBJECT_NAME     | OPT      | COST |
|----|------------------|----------------|-----------------|----------|------|
| 0  | SELECT STATEMENT |                |                 | CHOOSE   |      |
| 1  | FILTER           |                |                 |          |      |
| 2  | TABLE ACCESS     | FULL           | EMPLOYEES       | ANALYZED | 155  |
| 3  | TABLE ACCESS     | BY INDEX ROWID | ORDERS          | ANALYZED | 3    |
| 4  | INDEX            | RANGE SCAN     | ORD_CUSTOMER_IX | ANALYZED | 1    |

Table 2.1 shows the execution plan (from V$SQL_PLAN) for the statement in Figure 2.28. This plan requires a full table scan of the employees table, which produces many rows, each of which is then filtered against the orders table using an index.

Figure 2.29 shows an example of a SQL command using IN:

SQL with the use of the IN command

SELECT e.employee_id, e.first_name, e.last_name, e.salary
FROM employees e
WHERE e.employee_id IN (SELECT o.sales_rep_id   /* Note 1 */
                        FROM orders o
                        WHERE o.customer_id = 144); /* Note 3 */

*) Information:

  • The line marked Note 3 shows that the subquery is highly selective, as indicated by the condition customer_id = 144.
  • The line marked Note 1 shows that IN is used; the subquery is no longer correlated, because IN replaces the join in the subquery.

Table 2.2 Execution plan with the use of the IN command

| ID | OPERATION        | OPTIONS        | OBJECT_NAME   | OPT      | COST |
|----|------------------|----------------|---------------|----------|------|
| 0  | SELECT STATEMENT |                |               | CHOOSE   |      |
| 1  | NESTED LOOPS     |                |               |          | 5    |
| 2  | VIEW             |                |               |          | 3    |
| 3  | SORT             | UNIQUE         |               |          | 3    |
| 4  | TABLE ACCESS     | FULL           | ORDERS        | ANALYZED | 1    |
| 5  | TABLE ACCESS     | BY INDEX ROWID | EMPLOYEES     | ANALYZED | 1    |
| 6  | INDEX            | UNIQUE SCAN    | EMP_EMP_ID_PK | ANALYZED |      |

Table 2.2 shows the execution plan (from V$SQL_PLAN) for the statement in Figure 2.29. The optimizer rewrites the subquery into a view, which is then joined to the employees table through its unique index. Here IN is faster because the subquery contains a selective predicate, so it returns only a few employee_ids, and those employee_ids are used to access the employees table through a unique index.

Example 2: using EXISTS -- the selective predicate is in the parent query

Using EXISTS can also produce an optimal query. For example, consider a query that displays all employees in department 80 with the job SA_REP whose IDs appear as the sales representative on customer orders.

Figure 2.30 shows a query where the use of IN is not optimal:

Query with IN usage

SELECT e.employee_id, e.first_name, e.last_name, e.department_id, e.salary
FROM employees e
WHERE e.department_id = 80              /* Note 5 */
AND e.job_id = 'SA_REP'                 /* Note 6 */
AND e.employee_id IN
(SELECT o.sales_rep_id FROM orders o);  /* Note 4 */

*) Information:

  • The line marked Note 4 shows that the subquery is no longer correlated, because the use of IN replaces a join.
  • The lines marked Notes 5 and 6 contain the selective predicates, which sit in the parent query.

Table 2.3 Execution plan of IN usage

| ID | OPERATION        | OPTIONS        | OBJECT_NAME   | OPT      | COST |
|----|------------------|----------------|---------------|----------|------|
| 0  | SELECT STATEMENT |                |               | CHOOSE   |      |
| 1  | NESTED LOOPS     |                |               |          | 125  |
| 2  | VIEW             |                |               |          | 116  |
| 3  | SORT             | UNIQUE         |               |          | 116  |
| 4  | TABLE ACCESS     | FULL           | ORDERS        | ANALYZED | 40   |
| 5  | TABLE ACCESS     | BY INDEX ROWID | EMPLOYEES     | ANALYZED | 1    |
| 6  | INDEX            | UNIQUE SCAN    | EMP_EMP_ID_PK | ANALYZED |      |

Table 2.3 shows the execution plan (from V$SQL_PLAN) for the statement in Figure 2.30. Here the optimizer creates a view containing many employee_ids, because the subquery has no selective predicate, and each of them is then probed against the employees table through its unique index.

Figure 2.31 shows an example of a query that uses EXISTS:

SELECT e.employee_id, e.first_name, e.last_name, e.salary
FROM employees e
WHERE e.department_id = 80               /* Note 5 */
AND e.job_id = 'SA_REP'                  /* Note 6 */
AND EXISTS
(SELECT 1                                /* Note 1 */
FROM orders o
WHERE e.employee_id = o.sales_rep_id);   /* Note 2 */

*) Information:

  • Note 1 shows the use of EXISTS.
  • Note 2 shows that the subquery is a correlated subquery.
  • Notes 5 and 6 show the selective predicates in the parent query.

Table 2.4 Execution plan of EXISTS usage

| ID | OPERATION        | OPTIONS        | OBJECT_NAME       | OPT      | COST |
|----|------------------|----------------|-------------------|----------|------|
| 0  | SELECT STATEMENT |                |                   | CHOOSE   |      |
| 1  | FILTER           |                |                   |          |      |
| 2  | TABLE ACCESS     | BY INDEX ROWID | EMPLOYEES         | ANALYZED | 98   |
| 3  | AND-EQUAL        |                |                   |          |      |
| 4  | INDEX            | RANGE SCAN     | EMP_JOB_IX        | ANALYZED |      |
| 5  | INDEX            | RANGE SCAN     | EMP_DEPARTMENT_IX | ANALYZED |      |
| 6  | INDEX            | RANGE SCAN     | ORD_SALES_REP_IX  | ANALYZED | 8    |

Table 2.4 shows the execution plan (from V$SQL_PLAN) for the SQL statement in Figure 2.31. The cost with EXISTS is lower because two indexes are used to filter the parent query down to a few employee_ids, and those employee_ids are then used to access the orders table through its index.

Set access and JOIN methods via hints

We can control the join method, the index used, and the optimizer's access path by adding hints to the SQL statement. An example is the /*+ FULL */ hint, which forces the optimizer to use a full table scan even though an index exists, as in the query in Figure 2.32.

Example of using FULL HINTS

SELECT /*+ FULL(e) */ e.ename
FROM emp e
WHERE e.job = 'CLERK';
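Similarly, a specific index can be requested with the INDEX hint; a sketch assuming a hypothetical index emp_job_ix on emp(job):

```sql
-- force use of a particular index instead of letting the
-- optimizer choose the access path
SELECT /*+ INDEX(e emp_job_ix) */ e.ename
FROM emp e
WHERE e.job = 'CLERK';
```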

Be careful when using the JOIN command

In a query, a JOIN can have a significant effect on performance, so joins must be used with great care: an unnecessary JOIN can cause a drastic drop in performance.

To meet performance goals, there are 3 important rules, namely:

  • Avoid FULL TABLE SCAN if rows can be obtained through the use of indexes.
  • Always use indexes that return fewer rows.
  • Adjust the JOIN order so that the table with the fewest rows is executed first by the optimizer.

Figure 2.33 shows how to optimize the JOIN order:

Examples of join commands that can be executed with various conditions

SELECT info
FROM taba a, tabb b, tabc c
WHERE a.acol BETWEEN 100 AND 200
AND b.bcol BETWEEN 10000 AND 20000
AND c.ccol BETWEEN 10000 AND 20000
AND a.key1 = b.key1
AND a.key2 = c.key2;

Suppose the predicate on table a matches 100 rows, while the predicates on tables b and c each match 10,000 rows. Depending on the join order, the same query behaves in two very different ways:

First condition:

If table b is joined with table c first, that join must process 10,000 rows, even with an index. The result is then joined with table a, which matches only 100 rows, so that step takes 100 operations.

Second condition:

If table a is joined with table b first, only 100 operations are needed, and they run quickly through the indexes. The resulting 100 rows are then joined with table c, which again requires only 100 operations and also runs quickly thanks to the indexes.

Be careful when using views

When using a view, make sure that every table joined inside the view is actually needed, because the JOIN work on unused tables reduces performance. If this happens, create a new, narrower view rather than reusing a broad view with poor performance.

Restructuring the index.

Performance can be improved by restructuring the index, which can be done in the following ways:

  • Remove nonselective indexes to speed up DML processing.
  • Index the selective parts of query predicates.
  • Order the columns in a concatenated index carefully.
  • Add columns to an index to increase its selectivity.

Changing and disabling triggers and constraints

The more triggers and constraints used, the lower the system performance will be, so the use of both must be considered carefully.

Performing data restructuring

After restructuring the indexes and SQL commands, data restructuring can be done in the following way:

  • Avoid using GROUP BY in performance-critical statements.
  • Review whether the existing database design is optimal; if it is not, apply optimizations such as denormalization.
  • Use partitioning whenever possible.
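A sketch of range partitioning (the table, columns, and boundary dates are illustrative, not from the source):

```sql
-- each year's rows live in their own partition, so queries that
-- filter on sale_date can skip the partitions they do not need
CREATE TABLE sales_hist (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2007 VALUES LESS THAN (TO_DATE('01-01-2008', 'DD-MM-YYYY')),
  PARTITION p2008 VALUES LESS THAN (TO_DATE('01-01-2009', 'DD-MM-YYYY')),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);
```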

Combining multi-scan with CASE command

Queries very often apply several aggregate functions to a single data set. Computing 10 aggregates with the same shape but different conditions (WHERE clauses) requires 10 scans for 10 queries. These scans can be collapsed into one by moving each WHERE condition into a CASE expression inside the aggregate, which filters the data per column.

For example, suppose we want to count the number of employees earning less than 2000, between 2000 and 4000, and more than 4000 each month. This can be done with 3 separate queries (3 scans), as in Figures 2.34, 2.35, and 2.36:

Query to retrieve the number of employees whose salary is less than 2000

SELECT COUNT (*)
FROM employees
WHERE salary < 2000;

Query to retrieve the number of employees whose salary is between 2000 and 4000

SELECT COUNT (*)
FROM employees
WHERE salary BETWEEN 2000 AND 4000;

Query to retrieve the number of employees whose salary is greater than 4000

SELECT COUNT (*)
FROM employees
WHERE salary > 4000;

These 3 queries can be made more efficient by merging them into a single query, with each desired count computed in its own column using CASE for the condition. Figure 2.37 is an example of using CASE to eliminate the multiple scans:

Use of CASE to combine MULTISCAN

SELECT COUNT (CASE WHEN salary < 2000
              THEN 1 ELSE null END) count1,
       COUNT (CASE WHEN salary BETWEEN 2001 AND 4000
              THEN 1 ELSE null END) count2,
       COUNT (CASE WHEN salary > 4000
              THEN 1 ELSE null END) count3
FROM employees;

Use DML with RETURNING clause

Use INSERT, UPDATE, or DELETE ... RETURNING to modify and retrieve data in a single call. This technique improves performance by reducing the number of calls to the database. Also try combining several statements into one simple statement. A basic example is the DELETE shown in Figure 2.38:

Suboptimal use of delete syntax

BEGIN
  FOR pos_rec IN
    (SELECT * FROM order_positions
     WHERE order_id = :id)
  LOOP
    DELETE FROM order_positions
    WHERE order_id = pos_rec.order_id
    AND order_position = pos_rec.order_position;
  END LOOP;
  DELETE FROM orders WHERE order_id = :id;
END;

The block above performs the deletion in two stages: it deletes rows from the order_positions table one by one, then deletes from the orders table. This consumes resources because many separate statements are submitted to the database.

These statements can be replaced by adding a single constraint, namely a cascade constraint, after which only one DELETE statement is needed, which automatically speeds up the delete process.
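A sketch of this replacement, together with the RETURNING clause mentioned above (the constraint name and bind variables are illustrative):

```sql
-- let the database remove child rows automatically
ALTER TABLE order_positions
  ADD CONSTRAINT fk_pos_order FOREIGN KEY (order_id)
  REFERENCES orders (order_id) ON DELETE CASCADE;

-- one statement, one round trip; RETURNING hands back the deleted key
DELETE FROM orders
WHERE order_id = :id
RETURNING order_id INTO :deleted_id;
```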

4. Use of Bind Variables

Every time a query is sent to the database, its text is checked against the shared pool. If no matching text is found in the shared pool, a hard parse is performed; conversely, if the text already exists in the shared pool, only a soft parse is needed. Different literal values in a query make its text differ, so each variation triggers a hard parse. Using bind variables keeps the query text identical between executions, so the query is read as the same as one entered previously and can be reused; only the value of the bind variable changes. The reasons for using bind variables to supply variable values in a query are:

  • Storing the same query text over and over (differing only in literal values) wastes a large amount of memory.
  • Flooding the shared pool with near-identical query texts causes cached queries to be aged out of the shared pool quickly.
  • Parsing a query is an intensive process that requires a lot of resources; reducing the number of hard parses reduces CPU usage.
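A minimal sketch of the difference (table and bind names illustrative):

```sql
-- literals: each distinct value is a new query text -> hard parse each time
SELECT * FROM orders WHERE customer_id = 144;
SELECT * FROM orders WHERE customer_id = 145;

-- bind variable: one shared query text -> soft parse on every reuse
SELECT * FROM orders WHERE customer_id = :cust_id;
```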

Tuning with SQL Indexing

1. Definition of Index

According to Ramakrishnan and Gehrke (2005, p276), an index is a data structure that organizes data records on disk to optimize various information retrieval operations. An index lets records be retrieved efficiently through search conditions on its search key fields. Additional indexes, each with a different search key, can also be created on the same data set to speed up search operations not supported by the file organization.

2. Indexing Techniques

According to Immanuel Chan (2008, p2-11), there are various types of indexing that can be done, including:

B-Tree Indexes

This is the standard index technique, best suited to primary keys and highly selective predicates. A B-tree index can also be used to return data sorted by the indexed columns.

Bitmap indexes

This technique suits columns with low cardinality. Through data compression, a bitmap index can yield a large number of rowids with minimal I/O usage. Combining bitmap indexes on nonselective columns supports efficient AND and OR operations, again producing many rowids with minimal I/O. The technique is especially effective in queries using the COUNT() function.
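A sketch of a bitmap index on a low-cardinality column (the column name region is illustrative):

```sql
-- a handful of distinct values compresses extremely well as bitmaps
CREATE BITMAP INDEX emp_region_bix ON employees (region);
```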

Function-based Indexes

This technique creates B-tree access on values derived from applying functions to the underlying data. It has limitations around NULLs and requires query optimization to be enabled for such expressions. Function-based indexes are especially useful for querying derived data computed from several columns, or for working around limitations of the data stored in the database.

Partitioned Indexes

Partitioned indexes can be implemented in two ways: global index partitioning and local index partitioning. A global index has a "one-to-many" relationship, with one index partition mapping to many table partitions. Global indexes can only be range-partitioned. A local index has a "one-to-one" mapping between index partitions and table partitions. In general, local indexes allow a "divide and conquer" approach that produces fast SQL statement execution.

Locally partitioned index

By using locally partitioned indexes, the DBA can take a single partition of a table and its index "offline" for maintenance (reorganization) without affecting the other partitions of the table and index. In a locally partitioned index, the key values and the number of index partitions match the partitions of the base table.

Figure 2.39 Example of creating a local partitioned index

CREATE INDEX year_idx ON all_fact (order_date)
LOCAL
(PARTITION name_idx1,
 PARTITION name_idx2,
 PARTITION name_idx3);

Oracle requires the number of index partitions to match the number of partitions on the table in question. For example, if the index in Figure 2.39 were created with 4 partitions while the all_fact table has a different number of partitions, the index creation would fail because the counts do not match. Maintenance is easy with this kind of index partition, because a single partition can be taken "offline" and its index rebuilt without affecting the other partitions of the table.

Globally partitioned index

Globally partitioned indexes are used for all indexes except the one that serves as the key of the table partitions. They are very useful in OLTP (Online Transaction Processing) applications, where fewer index partitions are needed than with locally partitioned indexes. The drawback of a global index is that it is difficult to maintain, because it must change whenever the partitions of the underlying table change. For example, if one table partition is dropped for reorganization, all global indexes on that table are affected; when defining a global index, it must therefore be ensured that the DBA is free to choose the index partitions to match the desired index.

Figure 2.40 shows an example of creating a global index on a table partition:

CREATE INDEX item_idx ON all_fact (item_nbr)
GLOBAL
(PARTITION city_idx1 VALUES LESS THAN (100),
 PARTITION city_idx2 VALUES LESS THAN (200),
 PARTITION city_idx3 VALUES LESS THAN (300),
 PARTITION city_idx4 VALUES LESS THAN (400),
 PARTITION city_idx5 VALUES LESS THAN (500));

Partitioning with a global index can reduce I/O usage significantly, provided the index is divided with well-chosen partition ranges.

Reverse Key Index

This technique reverses the bytes of the indexed key so that consecutive values are spread across different index blocks, eliminating index hot spots when an application inserts data. Its limitation is that it cannot be used for index range scans.

3. Concatenated Index

According to Niemiec (2007, p39), a single index can span several columns; this is called a concatenated or composite index. Oracle 9i introduced the "skip-scan index access" execution path as one of the optimizer's options when a concatenated index exists. Even so, the order of columns in the index needs attention: in general, the first column in the index definition should be the column most often used as a selective predicate in the WHERE clause. For example, suppose there is a concatenated index on the Empno, Ename, and Deptno columns (Empno is the first part, Ename the second, and Deptno the third). Figure 2.41 shows a query where the concatenated index is traditionally not read, because its leading column is empno while the WHERE clause filters on ename:

Figure 2.41 Query where the column in the where clause is not a leading column

SELECT job, empno
FROM emp
WHERE ename = 'RICH';

In this case, ename is not the first column declared in the concatenated index, so by default the optimizer will not use the index. However, Oracle 9i's "skip-scan index" execution path allows the optimizer to use a concatenated index even when the leading column of the index is not in the WHERE clause. The optimizer still chooses whichever path it estimates to be cheapest: index skip-scan access, index fast full scan, or full table scan.

In the following example, executing such a query with an index skip scan turns out to be faster than an index fast full scan:

Figure 2.42 below shows an example of creating a concatenated index:

CREATE INDEX skip1 ON emp5(job,empno);

Figure 2.43 shows a query where the concatenated index is not used because the column in the where clause is not a leading column:

SELECT count(*)
FROM emp5
WHERE empno = 7900;

Figure 2.44 shows the execution results generated by the query in Figure 2.43 with SQL*Plus:

Elapsed: 00:00:03.13 (Result is a single row...not displayed)

Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=5)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN) OF 'SKIP1' (NON-UNIQUE)

Statistics
6826 consistent gets
6819 physical reads

According to Niemiec (2007, p48), when the column order of the concatenated index does not match the query (the leading column is not in the WHERE clause), the query above is executed with an index fast full scan, which took 3.13 seconds. However, we can direct the optimizer to use a skip scan even though the leading column is not in the WHERE clause, by using the index_ss ("skip-scan") hint, as in the query in Figure 2.45:

SELECT /*+ index_ss(emp5 skip1) */ count(*)
FROM emp5
WHERE empno = 7900;

Figure 2.46 shows the execution results generated by the above query with SQL*Plus:

Elapsed: 00:00:00.56

Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=5)
1 0 SORT (AGGREGATE)
2 1 INDEX (SKIP SCAN) OF 'SKIP1' (NON-UNIQUE)

Statistics
21 consistent gets
17 physical reads

Run with the "skip-scan index" execution path, the same query took only 0.56 seconds. The order of the columns in a concatenated index therefore needs to be chosen carefully; it strongly affects how the optimizer executes the queries that use it.

4. Index-hint

According to Chan (2008, p16-8), an index hint in a query is useful when the user wants to "force" the optimizer to use a particular index: the optimizer then accesses the data through the index named in the hint. Figure 2.47 shows a schematic of how an index hint works:

Figure 2.47 Index hint scheme works

The syntax for creating an index hint is:

/*+ index(table_name index_name) */

Figure 2.49 shows how to use the index hint:

/*+ index(emp5 skip1) */

When an index hint is used, the optimizer executes the query through the index named in the hint, provided that index exists and matches the columns referenced by the query. If the named index does not exist or cannot be used, the hint is ignored and the optimizer chooses an access path as usual.

5. Inappropriate Use of Indexes

According to Niemiec (2007, p40), as an illustration, suppose a "product" table has a "company_no" column. The company has only one branch, so company_no is 1 in every row of the "product" table. Even if there is an index on that column, the optimizer will not use it: on such a non-selective column, the indexed access path is slower than execution with "Table Access Full".

Figure 2.50 shows the use of Full Table Access (Full Table Scan) even though there is an index:

SELECT product_id, qty
FROM product
WHERE company_no = 1;

Figure 2.51 shows the execution results generated by the query in Figure 2.50 with SQL*Plus:

Elapsed time: 405 seconds (all records are retrieved via a full table scan)

OPERATION OPTIONS OBJECT NAME
------------------ -------------- -----------
SELECT STATEMENT
TABLE ACCESS FULL PRODUCT

49,825 consistent gets (memory reads)
41,562 physical reads (disk reads)

Figure 2.52 shows the use of hints to force the use of indexes by the optimizer

SELECT /*+ index(product company_idx1) */ product_id, qty
FROM product
WHERE company_no = 1;

Figure 2.53 shows the execution results generated by the query in figure 2.52 with SQL*Plus:

Elapsed time: 725 seconds (all records retrieved using the index on company_no)

OPERATION          OPTIONS        OBJECT NAME
------------------ -------------- ------------
SELECT STATEMENT
TABLE ACCESS       BY ROWID       PRODUCT
INDEX              RANGE SCAN     COMPANY_IDX1

4,626,725 consistent gets (memory reads)
80,513 physical reads (disk reads)

From the comparison above, it can be seen that when an index is not selective enough for a query, the optimizer prefers "Table Access Full": using the index would make it work harder, because in addition to reading the index entries it must still fetch almost every row of the table by ROWID.

According to Niemiec (2007, p41-42) there are several things that prevent the index from being read by the optimizer, namely:

Use of operators '<>' and '!='

Indexes can only be used to find data that is present in a table. Whenever a not-equal operator appears in the WHERE clause, the index on the referenced column will not be used. For example, in Figure 2.54 there is a CUSTOMERS table with an index on the CUST_RATING column, and the query retrieves rows where cust_rating is not 'aa'. The execution path is a full table scan, even though the index on CUST_RATING exists.

SELECT cust_id, cust_name
FROM customers
WHERE cust_rating <> 'aa';
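One possible workaround, not shown in the source, is to rewrite the not-equal predicate as two range predicates, which the optimizer can satisfy with index range scans:

```sql
-- Sketch: '<> 'aa'' becomes two ranges that exclude exactly the
-- value 'aa'. Each range predicate is eligible for an index
-- range scan on cust_rating, unlike the not-equal form.
SELECT cust_id, cust_name
FROM   customers
WHERE  cust_rating < 'aa'
   OR  cust_rating > 'aa';
```

Whether this actually helps still depends on selectivity: if most rows have a rating other than 'aa', a full table scan remains the cheaper plan.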

Use of is null or is not null

When IS NULL or IS NOT NULL is used in the WHERE clause, the optimizer will not use the index. NULL is undefined, so no value in the database compares equal to NULL; moreover, an Oracle B-tree index contains no entries for rows whose indexed columns are all NULL, so an IS NULL predicate cannot be answered from the index. Figure 2.55 shows a query that causes a full table scan, even though the sal column has an index, because of IS NULL in the WHERE clause:

SELECT empno, ename, deptno
FROM emp
WHERE sal is null;
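A common workaround, not taken from the source, is to add a constant as a second column of the index. Because at least one indexed value is then never NULL, every row gets an index entry, and an IS NULL predicate on sal can use the index:

```sql
-- Sketch: a composite index whose second "column" is the
-- constant 0. Rows with sal IS NULL now have index entries
-- (keyed on the constant), so "WHERE sal IS NULL" can be
-- answered with an index range scan instead of a full scan.
CREATE INDEX emp_sal_idx ON emp (sal, 0);
```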

Happy studying!

