Oracle® Database Utilities 10g Release 1 (10.1) Part Number B10825-01
The Data Pump utilities are designed especially for very large databases. If your site has very large quantities of data versus metadata, you should experience a dramatic increase in performance compared to the original Export and Import utilities. This chapter briefly discusses why the performance is better and also suggests specific steps you can take to enhance performance of export and import operations.
Performance of metadata extraction and database object creation in Data Pump Export and Import remains essentially equivalent to that of the original Export and Import utilities.
The improved performance of the Data Pump Export and Import utilities is attributable to several factors, including the following:
Multiple worker processes can perform intertable and interpartition parallelism to load and unload tables in multiple, parallel, direct-path streams.
For very large tables and partitions, single worker processes can choose intrapartition parallelism through multiple parallel queries and parallel DML I/O server processes when the external tables method is used to access data.
Data Pump uses parallelism to build indexes and load package bodies.
Dump files are read and written directly by the server and, therefore, do not require any data movement to the client.
The dump file storage format is the internal stream format of the direct path API. This format is very similar to the format stored in Oracle database datafiles inside of tablespaces. Therefore, no client-side conversion to INSERT statement bind variables is performed.
The supported data access methods, direct path and external tables, are faster than conventional SQL. The direct path API provides the fastest single-stream performance. The external tables feature makes efficient use of the parallel queries and parallel DML capabilities of the Oracle database.
Metadata and data extraction can be overlapped during export.
Data Pump technology fully uses all available resources to maximize throughput and minimize elapsed job time. For this to happen, a system must be well-balanced across CPU, memory, and I/O. In addition, standard performance tuning principles apply. For example, for maximum performance you should ensure that the files that are members of a dump file set reside on separate disks, because the dump files will be written and read in parallel. Also, the disks should not be the same ones on which the source or target tablespaces reside.
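The placement advice above can be illustrated with a short Python sketch that builds a DUMPFILE parameter value spreading the dump file set across several directory objects. It assumes that each directory object maps to a different physical disk, and that none of those disks holds the source or target tablespaces. The directory object names (dpump_dir1, dpump_dir2) and the file name prefix are hypothetical. The directory_object:file_name form and the %U substitution variable are standard DUMPFILE syntax.

```python
def dumpfile_spec(directory_objects, prefix="exp"):
    """Return a DUMPFILE value with one %U template per directory object.

    Each directory object is assumed to point at a separate disk, so that
    files written in parallel do not compete for the same device.
    """
    if not directory_objects:
        raise ValueError("at least one directory object is required")
    templates = []
    for index, directory in enumerate(directory_objects, start=1):
        # A distinct prefix per directory keeps generated file names unique.
        templates.append(f"{directory}:{prefix}{index}_%U.dmp")
    return "DUMPFILE=" + ",".join(templates)


if __name__ == "__main__":
    # Hypothetical directory objects, each on its own disk.
    print(dumpfile_spec(["dpump_dir1", "dpump_dir2"]))
```

The resulting string can be placed on the command line or in a parameter file. Whether the directory objects really sit on separate devices has to be verified at the operating system level.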
Any performance tuning activity involves making trade-offs between performance and resource consumption.
The Data Pump Export and Import utilities enable you to dynamically increase and decrease resource consumption for each job. This is done using the PARALLEL parameter to specify a degree of parallelism for the job. (The PARALLEL parameter is the only tuning parameter that is specific to Data Pump.) For maximum throughput, do not set PARALLEL to much more than 2x the CPU count.
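As a rough illustration of the sizing guideline, the following Python sketch derives a ceiling for PARALLEL from the CPU count and limits a requested value to it. Treating exactly twice the CPU count as a hard ceiling is a conservative reading of "not much more than 2x" and is an assumption of this sketch, as are the function names. The CPU count should be that of the database server, which is not necessarily the machine running the script.

```python
import os


def max_parallel(cpu_count=None, factor=2):
    """Suggested ceiling for PARALLEL: about twice the CPU count."""
    if cpu_count is None:
        # Falls back to the local machine; pass the server's count if known.
        cpu_count = os.cpu_count() or 1
    return max(1, factor * cpu_count)


def clamp_parallel(requested, cpu_count=None):
    """Limit a requested degree of parallelism to the suggested ceiling."""
    return max(1, min(requested, max_parallel(cpu_count)))


if __name__ == "__main__":
    # On a hypothetical 4-CPU server, a request for 16 is reduced to 8.
    print(f"PARALLEL={clamp_parallel(16, cpu_count=4)}")
```

A value below the ceiling is passed through unchanged, so the helper only guards against oversubscription. It does not check that memory and I/O bandwidth are adequate.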
As you increase the degree of parallelism, CPU usage, memory consumption, and I/O bandwidth usage also increase. You must ensure that adequate amounts of these resources are available. If necessary, you can distribute files across different disk devices or channels to get the needed I/O bandwidth.
The PARALLEL parameter is valid only in the Enterprise Edition of Oracle Database 10g.
The settings for certain initialization parameters can affect the performance of Data Pump Export and Import. In particular, you can try using the following settings to improve performance, although the effect may not be the same on all platforms.
DISK_ASYNCH_IO=TRUE
DB_BLOCK_CHECKING=FALSE
DB_BLOCK_CHECKSUM=FALSE
Additionally, the following initialization parameters must have values set high enough to allow for maximum parallelism:
PROCESSES
SESSIONS
PARALLEL_MAX_SERVERS
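One way to review these settings is sketched below in Python. The function compares a dictionary of current parameter values (for example, values read from V$PARAMETER) against the suggested settings above, and checks the three capacity parameters against minimums supplied by the caller. The source does not state how high those values must be, so the minimums in the usage example are purely hypothetical, as are the function and variable names.

```python
# Settings suggested in the text; their effect may vary by platform.
SUGGESTED = {
    "DISK_ASYNCH_IO": "TRUE",
    "DB_BLOCK_CHECKING": "FALSE",
    "DB_BLOCK_CHECKSUM": "FALSE",
}

# Parameters that must be high enough to allow maximum parallelism.
CAPACITY_PARAMS = ("PROCESSES", "SESSIONS", "PARALLEL_MAX_SERVERS")


def review_settings(current, minimums):
    """Return a list of findings for settings that differ from the advice."""
    findings = []
    for name, wanted in SUGGESTED.items():
        actual = str(current.get(name, "")).upper()
        if actual != wanted:
            findings.append(f"{name} is {actual or 'unset'}; suggested {wanted}")
    for name in CAPACITY_PARAMS:
        actual = int(current.get(name, 0))
        needed = int(minimums[name])  # caller-chosen threshold (assumption)
        if actual < needed:
            findings.append(f"{name} is {actual}; need at least {needed}")
    return findings


if __name__ == "__main__":
    # Hypothetical current values and minimums, for illustration only.
    current = {
        "DISK_ASYNCH_IO": "TRUE",
        "DB_BLOCK_CHECKING": "false",
        "DB_BLOCK_CHECKSUM": "TRUE",
        "PROCESSES": 150,
        "SESSIONS": 170,
        "PARALLEL_MAX_SERVERS": 4,
    }
    minimums = {"PROCESSES": 100, "SESSIONS": 100, "PARALLEL_MAX_SERVERS": 8}
    for finding in review_settings(current, minimums):
        print(finding)
```

The sketch only reports differences. Whether to change a setting remains a trade-off, since disabling block checking and checksums reduces protection against corruption in exchange for speed.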