Optimize
Oracle SQL*Loader Performance
1.
Use Direct Path Loads - The conventional path loader essentially
loads the data By using standard insert statements. The direct path loader (direct=true) loads directly Into the
Oracle data files and creates blocks in Oracle database block format. The fact
that SQL is not being issued makes the entire process much less taxing on the
database. There are certain cases, however, in which direct path loads cannot
be used (clustered Tables). To prepare the database for direct path loads, the
script $ORACLE_HOME/rdbms/admin/catldr.sql.sql
must be executed.
2.
Disable Indexes and Constraints. For conventional
data loads only, the disabling of indexes and constraints can greatly enhance
the performance of SQL*Loader.
3.
Use Parallel Loads. Available
with direct path data loads only, this option allows multiple SQL*Loader jobs
to execute concurrently.
$ sqlldr control=first.ctl
parallel=true direct=true
$ sqlldr control=second.ctl
parallel=true direct=true
4. Use
Fixed WIDth Data. Fixed
wIDth data format saves Oracle some processing when
parsing the data. The savings can be tremendous, depending on the type of data
and number of rows. SO, If you have the option of loading from either
fixed-wIDth or delimited data files, choose the fixed-wIDth data. You'll save
the rather significant overhead of scanning each record for delimiters.
5. Disable Archiving During Load. While this may not be feasible in certain environments,
disabling database archiving can increase performance consIDerably.
6. Use
unrecoverable. The
unrecoverable option (unrecoverable load data) disables the writing of the data
to the redo logs. This option is available for direct path loads only.
7.
Use TRUNCATE instead of REPLACE. If you want to
delete existing data from a Table before loading it, I recommend using the TRUNCATE keyword instead
of the REPLACE keyword in your INTO TABLE clause.
The REPLACE keyword causes SQL*Loader to issue a DELETE statement
against the target Table before beginning the load. A SQL DELETE operation
carries with it a lot of baggage. Transactional integrity must be maintained,
so deleted rows are written to rollback segments so they are available in the
event the transaction is rolled back. In addition, a database redo log entry
will be written for each database block affected By the DELETE. Given a large Table,
the amount of extra I/O can be substantial.
You also risk running out of rollback segment
space. The TRUNCATE keyword, on the other hand, causes SQL*Loader to issue a SQL TRUNCATE TABLE statement. Such
a statement does not generate any rollback segment entries. In addition, it
only results in one redo log entry--for the statement itself.
Bottom line: If you can use TRUNCATE instead of REPLACE and your target Table
is not of a trivial size, your performance gain will be significant.
Data Loading Methods
SQL*Loader provides two methods for loading data:
A conventional path load executes SQL INSERT statements to
populate Tables in an Oracle database. A direct path load eliminates much of
the Oracle database overhead By formatting Oracle data blocks and writing the
data blocks directly to the database files. A direct load does not compete with
other users for database resources, so it can usually load data at near disk
speed. Considerations inherent to direct path loads, such as restrictions,
security, and backup implications, are discussed in this chapter.
The Tables to be loaded must already exist in the
database. SQL*Loader never creates Tables. It loads existing Tables that either
already contain data or are empty.
The following privileges are required for a load:
•
You must have INSERT privileges on the Table to be loaded.
•
You must have DELETE privileges on the Table to be loaded,
when using the REPLACE or TRUNCATE option to empty old data from the Table before loading the new data
in its place.
Conventional Path Load
Conventional path load (the default) uses the SQL
INSERT statement and a
bind array buffer to load data Into database Tables. This method is used By all
Oracle tools and applications.
When SQL*Loader performs a conventional path
load, it competes equally with all other processes for buffer resources. This
can slow the load significantly. Extra overhead is added as SQL commands are
generated, passed to Oracle, and executed.
The Oracle database server looks for partially
filled blocks and attempts to fill them on each insert. Although appropriate
during normal use, this can slow bulk loads dramatically.
Direct Path Load
Instead of filling a bind array buffer and
passing it to the Oracle database server with a SQL INSERT statement, a
direct path load uses the direct path API to pass the data to be loaded to the
load engine in the server. The load engine builds a column array structure from
the data passed to it.
The direct path load engine uses the column array
structure to format Oracle data blocks and build index keys. The newly
formatted database blocks are written directly to the database (multiple blocks
per I/O request using asynchronous writes if the host platform supports
asynchronous I/O).
Internally, multiple buffers are used for the
formatted blocks. While one buffer is being filled, one or more buffers are
being written if asynchronous I/O is available on the host platform.
Overlapping computation with I/O increases load performance.
Advantages of a Direct Path Load
A direct path load is faster than the
conventional path for the following reasons:
•
Partial blocks are not used, so no reads are
needed to find them, and fewer writes are performed.
•
SQL*Loader need not execute any SQL INSERT statements;
therefore, the processing load on the Oracle database is reduced.
•
A direct path load calls on Oracle to lock Tables
and indexes at the start of the load and releases them when the load is
finished. A conventional path load calls Oracle once for each array of rows to
process a SQL INSERT statement.
•
A direct path load uses multiblock asynchronous
I/O for writes to the database files.
•
During a direct path load, processes perform
their own write I/O, instead of using Oracle's buffer cache. This minimizes
contention with other Oracle users.
•
The sorted indexes option available during direct
path loads allows you to presort data using high-performance sort routines that
are native to your system or installation.
•
When a Table to be loaded is empty, the
presorting option eliminates the sort and merge phases of index-building. The
index is filled in as data arrives.
•
Protection against instance failure does not
require redo log file entries during direct path loads. Therefore, no time is
required to log the load when:
•
Oracle is operating in NOARCHIVELOG mode
•
The UNRECOVERABLE parameter is set
to Y
•
The object being loaded has the NOLOG attribute set
When to Use a Direct Path Load
If none of the previous restrictions apply, you
should use a direct path load when:
•
You have a large amount of data to load quickly.
A direct path load can quickly load and index large amounts of data. It can
also load data Into either an empty or nonempty Table.
•
You want to load data in parallel for maximum
performance.
Conventional Path
|
Direct Path
|
SQL*Loader uses the SQL INSERT statement and bind array buffer
to load data.
|
It passes on the data to the Load Engine of the database, which
creates a Column Array structure.
|
Makes use of the Database buffer cache and may increase
contention for the resources among other users.
|
Avoids buffer cache and writes directly to the Disk.
Can make use of Asynchronous I/O if available/supported on the OS. |
Slower since the SQL INSERT statements have to be generated,
passed to Oracle, and executed.
|
Faster since, the Load Engine converts the column array
structure directly to Oracle Data Blocks and adds it to the Table’s existing
segment.
|
While loading the data, searches for blocks with enough free
space, Into which the rows can be inserted.
|
Does not search the existing blocks.
New blocks are formatted and added to the existing segment. |
Does not lock the Table being loaded Into.
|
Locks Table in Exclusive mode. Hence, should not be used if
concurrent access to the Table is required during the load.
The 10046 trace output shows: ‘Lock Table <Table_Name> exclusive mode nowait;’ |
Can be used to load Into Clustered Tables
|
Cannot be used to load Into a cluster.
|
Check constraints are enabled during the load. Records not
satisfying the constraint are rejected and written Into the BAD file.
|
The constraints are disabled during the load.
It explicitly executes an ‘alter Table <Table_Name> disable constraint <constraint_Name>’ statement before loading Into the Table. |
Can be used to load Into Varrays
|
Cannot be used to load Into Varrays
|
Can be used to load Into BFILE columns.
|
Cannot be used to load Into BFILE columns.
|
Can be used to load Into a Single partition of a Table having
Global indexes
|
Direct path cannot be used to load Into a particular partition
of the Table if the Table has a global index defined on it.
|
Cannot be used for loading data in Parallel. But, you can use
multiple load session concurrently inserting Into the same Table.
|
Parallel loading of data is possible.
|
Automatically inserts default values for the columns, if any.
|
The default value specified for the column is not inserted. If
it is a ‘null’ column, it inserts a null.
|
Indexes
|
|
Unique Index on the Table is in a valID state after the load.
The uniqueness of the data for the index column is maintained. Records
violating the uniqueness are rejected and written Into the BAD file.
|
The Uniqueness of the data is not valIDated. The unique index is
in an ‘UNUSABLE’ state at the end of the load.
|
If the Table has any indexes, corresponding keys are added Into
the index for each new row inserted Into the Table.
|
After each block is formatted, the new index keys are put in a
sort (temporary) segment.
The old index and the new keys are merged at load finish time to create the new index. |
The index does not require a re-build at the end of the load.
Also, no extra storage is required.
But, since the index is updates for each new row, it increases the processing time. |
The index needs to be re-built at the end of the load. The old
index, new index and sort segment all require storage space until the indexes
are merged.
|
Loading Into Objects
|
|
If the type has a User-defined constructor matching the arguments
of the attribute-value constructor, conventional path calls the User-defined
constructor.
|
Direct path calls the Argument-value constructor.
|
If the type has a User-defined constructor not matching the
arguments of the attribute-value constructor, you can invoke the user-defined
using an SQL Expression.
|
It is not possible to invoke the user-defined constructor in
direct path loading.
|
Can be used to load Into Parent and child Table at the same
time.
|
Cannot be used to load Into Parent and child Table at the same
time
|
With Genex, you gain access to our top Oracle certified masters who help maximize the value of your Oracle databases.
ReplyDeletehttps://genexdbs.com/