Translate

SQL*Loader Performance

Optimize Oracle SQL*Loader Performance
Source: http://www.dba-oracle.com
To know how SQL * Loader works- Click here.
1. Use Direct Path Loads - The conventional path loader essentially loads the data By using standard insert statements. The direct path loader (direct=true) loads directly Into the Oracle data files and creates blocks in Oracle database block format. The fact that SQL is not being issued makes the entire process much less taxing on the database. There are certain cases, however, in which direct path loads cannot be used (clustered Tables). To prepare the database for direct path loads, the script $ORACLE_HOME/rdbms/admin/catldr.sql.sql must be executed.
2. Disable Indexes and Constraints. For conventional data loads only, the disabling of indexes and constraints can greatly enhance the performance of SQL*Loader.
3. Use Parallel Loads. Available with direct path data loads only, this option allows multiple SQL*Loader jobs to execute concurrently.
$ sqlldr control=first.ctl parallel=true direct=true
$ sqlldr control=second.ctl parallel=true direct=true
4. Use Fixed WIDth Data. Fixed wIDth data format saves Oracle some processing when parsing the data. The savings can be tremendous, depending on the type of data and number of rows. SO, If you have the option of loading from either fixed-wIDth or delimited data files, choose the fixed-wIDth data. You'll save the rather significant overhead of scanning each record for delimiters.
5. Disable Archiving During Load. While this may not be feasible in certain environments, disabling database archiving can increase performance consIDerably.
6. Use unrecoverable. The unrecoverable option (unrecoverable load data) disables the writing of the data to the redo logs. This option is available for direct path loads only.
7. Use TRUNCATE instead of REPLACE. If you want to delete existing data from a Table before loading it, I recommend using the TRUNCATE keyword instead of the REPLACE keyword in your INTO TABLE clause.
The REPLACE keyword causes SQL*Loader to issue a DELETE statement against the target Table before beginning the load. A SQL DELETE operation carries with it a lot of baggage. Transactional integrity must be maintained, so deleted rows are written to rollback segments so they are available in the event the transaction is rolled back. In addition, a database redo log entry will be written for each database block affected By the DELETE. Given a large Table, the amount of extra I/O can be substantial.
You also risk running out of rollback segment space. The TRUNCATE keyword, on the other hand, causes SQL*Loader to issue a SQL TRUNCATE TABLE statement. Such a statement does not generate any rollback segment entries. In addition, it only results in one redo log entry--for the statement itself.

Bottom line: If you can use TRUNCATE instead of REPLACE and your target Table is not of a trivial size, your performance gain will be significant.

Data Loading Methods
SQL*Loader provides two methods for loading data:
       Conventional Path Load
       Direct Path Load
A conventional path load executes SQL INSERT statements to populate Tables in an Oracle database. A direct path load eliminates much of the Oracle database overhead By formatting Oracle data blocks and writing the data blocks directly to the database files. A direct load does not compete with other users for database resources, so it can usually load data at near disk speed. Considerations inherent to direct path loads, such as restrictions, security, and backup implications, are discussed in this chapter.
The Tables to be loaded must already exist in the database. SQL*Loader never creates Tables. It loads existing Tables that either already contain data or are empty.
The following privileges are required for a load:
       You must have INSERT privileges on the Table to be loaded.
       You must have DELETE privileges on the Table to be loaded, when using the REPLACE or TRUNCATE option to empty old data from the Table before loading the new data in its place.

Conventional Path Load
Conventional path load (the default) uses the SQL INSERT statement and a bind array buffer to load data Into database Tables. This method is used By all Oracle tools and applications.
When SQL*Loader performs a conventional path load, it competes equally with all other processes for buffer resources. This can slow the load significantly. Extra overhead is added as SQL commands are generated, passed to Oracle, and executed.
The Oracle database server looks for partially filled blocks and attempts to fill them on each insert. Although appropriate during normal use, this can slow bulk loads dramatically.

Direct Path Load
Instead of filling a bind array buffer and passing it to the Oracle database server with a SQL INSERT statement, a direct path load uses the direct path API to pass the data to be loaded to the load engine in the server. The load engine builds a column array structure from the data passed to it.
The direct path load engine uses the column array structure to format Oracle data blocks and build index keys. The newly formatted database blocks are written directly to the database (multiple blocks per I/O request using asynchronous writes if the host platform supports asynchronous I/O).
Internally, multiple buffers are used for the formatted blocks. While one buffer is being filled, one or more buffers are being written if asynchronous I/O is available on the host platform. Overlapping computation with I/O increases load performance.

Advantages of a Direct Path Load

A direct path load is faster than the conventional path for the following reasons:
       Partial blocks are not used, so no reads are needed to find them, and fewer writes are performed.
       SQL*Loader need not execute any SQL INSERT statements; therefore, the processing load on the Oracle database is reduced.
       A direct path load calls on Oracle to lock Tables and indexes at the start of the load and releases them when the load is finished. A conventional path load calls Oracle once for each array of rows to process a SQL INSERT statement.
       A direct path load uses multiblock asynchronous I/O for writes to the database files.
       During a direct path load, processes perform their own write I/O, instead of using Oracle's buffer cache. This minimizes contention with other Oracle users.
       The sorted indexes option available during direct path loads allows you to presort data using high-performance sort routines that are native to your system or installation.
       When a Table to be loaded is empty, the presorting option eliminates the sort and merge phases of index-building. The index is filled in as data arrives.
       Protection against instance failure does not require redo log file entries during direct path loads. Therefore, no time is required to log the load when:
       Oracle is operating in NOARCHIVELOG mode
       The UNRECOVERABLE parameter is set to Y
       The object being loaded has the NOLOG attribute set

When to Use a Direct Path Load
If none of the previous restrictions apply, you should use a direct path load when:
       You have a large amount of data to load quickly. A direct path load can quickly load and index large amounts of data. It can also load data Into either an empty or nonempty Table.
       You want to load data in parallel for maximum performance.


Conventional Path
Direct Path
SQL*Loader uses the SQL INSERT statement and bind array buffer to load data.
It passes on the data to the Load Engine of the database, which creates a Column Array structure.
Makes use of the Database buffer cache and may increase contention for the resources among other users.
Avoids buffer cache and writes directly to the Disk.
Can make use of Asynchronous I/O if available/supported on the OS.
Slower since the SQL INSERT statements have to be generated, passed to Oracle, and executed.
Faster since, the Load Engine converts the column array structure directly to Oracle Data Blocks and adds it to the Table’s existing segment.
While loading the data, searches for blocks with enough free space, Into which the rows can be inserted.
Does not search the existing blocks.
New blocks are formatted and added to the existing segment.
Does not lock the Table being loaded Into.
Locks Table in Exclusive mode. Hence, should not be used if concurrent access to the Table is required during the load.
The 10046 trace output shows:
‘Lock Table <Table_Name> exclusive mode nowait;’
Can be used to load Into Clustered Tables
Cannot be used to load Into a cluster.
Check constraints are enabled during the load. Records not satisfying the constraint are rejected and written Into the BAD file.
The constraints are disabled during the load.
It explicitly executes an
‘alter Table <Table_Name> disable constraint <constraint_Name>’
statement before loading Into the Table.
Can be used to load Into Varrays
Cannot be used to load Into Varrays
Can be used to load Into BFILE columns.
Cannot be used to load Into BFILE columns.
Can be used to load Into a Single partition of a Table having Global indexes
Direct path cannot be used to load Into a particular partition of the Table if the Table has a global index defined on it.
Cannot be used for loading data in Parallel. But, you can use multiple load session concurrently inserting Into the same Table.
Parallel loading of data is possible.
Automatically inserts default values for the columns, if any.
The default value specified for the column is not inserted. If it is a ‘null’ column, it inserts a null.
Indexes
Unique Index on the Table is in a valID state after the load. The uniqueness of the data for the index column is maintained. Records violating the uniqueness are rejected and written Into the BAD file.
The Uniqueness of the data is not valIDated. The unique index is in an ‘UNUSABLE’ state at the end of the load.
If the Table has any indexes, corresponding keys are added Into the index for each new row inserted Into the Table.
After each block is formatted, the new index keys are put in a sort (temporary) segment.
The old index and the new keys are merged at load finish time to create the new index.
The index does not require a re-build at the end of the load. Also, no extra storage is required.
But, since the index is updates for each new row, it increases the processing time.
The index needs to be re-built at the end of the load. The old index, new index and sort segment all require storage space until the indexes are merged.
Loading Into Objects
If the type has a User-defined constructor matching the arguments of the attribute-value constructor, conventional path calls the User-defined constructor.
Direct path calls the Argument-value constructor.
If the type has a User-defined constructor not matching the arguments of the attribute-value constructor, you can invoke the user-defined using an SQL Expression.
It is not possible to invoke the user-defined constructor in direct path loading.
Can be used to load Into Parent and child Table at the same time.
Cannot be used to load Into Parent and child Table at the same time



1 comment:

  1. With Genex, you gain access to our top Oracle certified masters who help maximize the value of your Oracle databases.
    https://genexdbs.com/

    ReplyDelete