Oracle® interMedia User's Guide 10g Release 2 (10.2) Part Number B14302-01 |
|
|
View PDF |
This chapter provides tuning tips for the Oracle DBA who wants to achieve more efficient storage and management of multimedia data in the database when using Oracle interMedia ("interMedia").
The goals of your interMedia application determine the resource needs and how those resources should be allocated. Because application development and design decisions have the greatest effect on performance, standard tuning methods must be applied to the system planning, design, and development phases of the project to achieve optimal results for your interMedia application in a production environment.
Multimedia data consists of a variety of media types including images, audio clips, video clips, line drawings, and so forth. All these media types are typically stored in LOBs, in either internal BLOBs (stored in an internal database tablespace) or in BFILEs (external LOBs in operating system files outside of the database tablespaces). This chapter discusses only the management of audio, image, and video data stored in BLOBs.
Internal LOBs consist of: CLOBs, NCLOBs, and BLOBs and can be of unlimited size (8 terabytes (TB) to 128 TB depending on the database block size of 2 kilobytes (KB) to 32 KB), which are supported in these programming environments: Java using Java Database Connectivity (JDBC), PL/SQL using the DBMS_LOB Package, and C using Oracle Call Interface (OCI).
However, in these programming environments: COBOL using Pro*COBOL precompiler, C/C++ using Pro*C/C++ precompiler, Visual Basic using Oracle Objects for OLE (OO4O), and SQL, you can create and use LOB instances only up to 4 gigabytes (GB) in size. interMedia supports BLOBs up to 4 GB in size for Oracle Database 10g Release 1 (10.1).
BFILEs can be as large as the operating system will allow up to a maximum of 8 TB. interMedia supports BFILEs up to a maximum of 4 GB in size.
The following general topics will help you to better manage your interMedia LOB data:
Setting database initialization parameters (see Section 8.1)
Issues to consider in creating tables with interMedia objects containing LOBs (see Section 8.2)
Improving multimedia data INSERT performance in interMedia objects containing LOBs (see Section 8.3)
Getting the best performance results (see Section 8.6)
Improving interMedia LOB data retrieval and update performance (see Section 8.7)
For more information about LOB partitioning, LOB tuning, and LOB buffering, see Oracle Database Application Developer's Guide - Large Objects, Oracle Call Interface Programmer's Guide, Oracle Database Concepts, and Oracle Database Performance Tuning Guide.
For information about restrictions to consider when using LOBs, see Oracle Database Application Developer's Guide - Large Objects.
For guidelines on using the DIRECTORY feature in Oracle, see Oracle Database Application Developer's Guide - Large Objects. This feature enables a simple, flexible, nonintrusive, and secure mechanism for the DBA to manage access to large files in the file system.
The information that follows is an excerpt from Oracle Database Performance Tuning Guide and Oracle Database Reference, and is presented as an overview of the topic. Refer to Oracle Database Performance Tuning Guide and Oracle Database Reference for more information.
Database tuning of the Oracle instance consists of tuning the system global area (SGA). The SGA is used to store data in memory for fast access. The SGA consumes a portion of your system's physical memory. The SGA must be sufficiently large to keep your data in memory but neither too small nor so large that performance begins to degrade. Degrading performance occurs when the operating system begins to page unused information to disk to make room for new information needed in memory, or begins to temporarily swap active processes to disk so other processes needing memory can use it. Excessive paging and swapping can bring a system to a standstill. The goal in sizing the SGA is to size it for the data that must be kept in main memory to keep performance optimal. With this in mind, you must size the SGA required for your interMedia application. This may mean increasing the physical memory of your system and monitoring your operating system behavior to ensure paging and swapping remain minimal.
The size of the SGA is determined by the values of the following database initialization parameters: DB_BLOCK_SIZE, DB_CACHE_SIZE, SHARED_POOL_SIZE, and LOG_BUFFER.
Beginning with Oracle9i, the SGA infrastructure is dynamic. This means that the following primary parameters used to size the SGA can be changed while the instance is running:
Buffer cache (DB_CACHE_SIZE) -- the size in bytes of the cache of standard blocks
Shared pool (SHARED _POOL_SIZE) -- the size in bytes of the area devoted to shared SQL and PL/SQL statements
Large pool (LARGE_POOL_SIZE) (default is 0 bytes) -- the size in bytes of the large pool used in shared server systems for session memory, parallel execution for message buffers, and by backup and restore processes for disk I/O buffers
The LOG_BUFFER parameter is used when buffering redo entries to a redo log. It is a static parameter and represents a very small portion of the SGA and can be changed only by stopping and restarting the database to read the changed value for this parameter from the initialization parameter file (init.ora
).
Note that even though you cannot change the MAX_SGA_SIZE parameter value dynamically, you do have the option of changing any of its three dependent primary parameters (DB_CACHE_SIZE, SHARED_POOL_SIZE, and LARGE_POOL_SIZE) to make memory tuning adjustments on the fly. To help you specify an optimal cache value, you can use the dynamic DB_CACHE_ADVICE parameter with statistics gathering enabled to predict behavior with different cache sizes through the V$DB_CACHE_ADVICE performance view. Use the ALTER SYSTEM...SET clause... statement to enable this parameter. See Oracle Database Performance Tuning Guide for more information about using this parameter.
Beginning with Oracle9i, there is a concept of creating tablespaces with multiple block sizes and specifying cache sizes corresponding with each block size. The SYSTEM tablespace uses a standard block size and additional tablespaces can use up to five non-standard block sizes.
The standard block size is specified by the DB_BLOCK_SIZE parameter. Its cache size is specified by the DB_CACHE_SIZE parameter. Non-standard block sizes are specified by the BLOCKSIZE clause of the CREATE TABLESPACE statement. The cache size for each corresponding non-standard block size is specified using the notation: DB_nK_CACHE_SIZE parameter, where the value n is 2, 4, 8, 16, or 32 KB.
The standard block size, known as the default block size, is usually set to the same size in bytes as the operating system block size, or a multiple of this size. The DB_CACHE_SIZE parameter, known as the DEFAULT cache size, specifies the size of the cache of standard block size (default is 48 megabytes (MB)). The system tablespace uses the standard block size and the DEFAULT cache size.
Either the standard block size or any of the non-standard block sizes and their associated cache sizes can be used for any of your other tablespaces. If you intend to use multiple block sizes in your database storage design, you must specify at least the DB_CACHE_SIZE and one DB_nK_CACHE_SIZE parameter value. You must specify all sub-caches for all the other non-standard block sizes that you intend to use. This block size/cache sizing scheme lets you use up to five different non-standard block sizes for your tablespaces and lets you specify respective cache sizes for each corresponding block size. For example, you can size your system tablespace to the normal 8 KB standard block size with a default DB_CACHE_SIZE of 48 MB or whatever size you want to specify. Then you can use any of the block sizes of 2 KB, 4 KB, 8 KB, 16 KB, or the maximum 32 KB for storing your interMedia LOB data in appropriate block-sized tablespaces and respective caches to achieve optimal LOB storage and retrieval performance.
Because the DB_BLOCK_SIZE parameter value can be changed only by re-creating the database, the value for this parameter must be chosen carefully and remain unchanged for the life of the database. See the section "DB_BLOCK_SIZE" for more information about this parameter.
The following sections describe these and some related initialization parameters and their importance to interMedia performance.
The DB_BLOCK_SIZE parameter is the size in bytes of database blocks (2048-32768). Oracle manages the storage space in the data files of a database in units called data blocks. The data block is the smallest unit of I/O operation used by a database; this value should be a multiple of the operating system's block size within the maximum (port-specific) limit to avoid unnecessary I/O operations. This parameter value is set for each database from the DB_BLOCK_SIZE parameter value in the initialization parameter file when you create the database. This value cannot be changed unless you create the database again.
The size of a database block determines how many rows of data Oracle can store in a single database page. The size of an average row is one piece of data that a DBA can use to determine the correct database block size. interMedia objects with instantiated LOB locators range in size from 175 bytes for ORDImage to 260 bytes for ORDDoc, ORDAudio, and ORDVideo. This figure does not include the size of the media data. (The difference in row sizes between instantiated image and audio and video data is that audio and video data contain a Comments attribute that is about 85 bytes in size to hold the LOB locator.)
If LOB data is less than 4000 bytes, then it can be stored inline or on the same database page as the rest of the row data. LOB data can be stored inline only when the block size is large enough to accommodate it.
LOB data that is stored out of line, on database pages that are separate from the row data, is accessed (read and written) by Oracle in CHUNK size pieces where CHUNK is specified in the LOB storage clause (see Section 8.2.2 for more information about the CHUNK option). CHUNK must be an integer multiple of DB_BLOCK_SIZE and defaults to DB_BLOCK_SIZE if not specified. Generally, it is more efficient for Oracle to access LOB data in large chunks, up to 32 KB. However, when LOB data is updated, it may be versioned (for read consistency) and logged both to the rollback segments and the redo log in CHUNK size pieces. If updates to LOB data are frequent then it may be more efficient space wise to manipulate smaller chunks of LOB data, especially when the granularity of the update is much less than 32 KB.
The preceding discussion is meant to highlight the differences between the initialization parameter DB_BLOCK_SIZE and the LOB storage parameter CHUNK. Each parameter controls different aspects of the database design, and though related, they should not be automatically equated.
Tuning Memory Allocation
Allocating memory to database structures and proper sizing of these structures can greatly improve database performance when working with LOB data. See Oracle Database Performance Tuning Guide for a comprehensive, in-depth presentation of this subject, including understanding memory allocation issues as well as detecting and solving memory allocation problems. The following sections describe a few of the important initialization parameters specifically useful for optimizing LOB performance relative to tuning memory allocation.
The DB_CACHE_SIZE parameter specifies the size of the DEFAULT buffer pool for buffers in bytes. This value is the database buffer value that is displayed when you issue a SQL SHOW SGA statement. Because you cannot change the value of the DB_BLOCK_SIZE parameter without re-creating the database, change the value of the DB_CACHE_SIZE parameter to control the size of the database buffer cache using the ALTER SYSTEM...SET clause... statement. The DB_CACHE_SIZE parameter is dynamic.
BUFFER_POOL_KEEP and BUFFER_POOL_RECYCLE - Tuning Multiple Buffer Pools Using the Standard Block Size
To greatly reduce I/O operations while reading and processing LOB data, tune the database instance by partitioning your buffer cache into multiple buffer pools for the tables containing the LOB columns.
Note: Multiple buffer pools are available only for the standard block size. Non-standard block size caches have a single DEFAULT pool. Therefore, the information presented in this section applies to only the scenario in which you are using only the standard block size. |
By default, all tables are assigned to the DEFAULT pool. Tune this main cache buffer using the DB_CACHE_SIZE initialization parameter and assign the appropriate tables to the keep pool using the DB_KEEP_CACHE_SIZE initialization parameter and to the recycle pool using the DB_RECYCLE_CACHE_SIZE initialization parameter.
The keep pool contains buffers that always stay in memory and is intended for frequently accessed tables that contain important data. The recycle pool contains buffers that can always be recycled and is intended for infrequently accessed tables that contain much less important data. The size of the main buffer cache (DEFAULT) is calculated from the value specified for the DB_CACHE_SIZE parameter minus the values specified for the DB_KEEP_CACHE_SIZE and DB_RECYCLE_CACHE_SIZE parameters. Tables are assigned to respective buffer pools (KEEP, RECYCLE, DEFAULT) using the STORAGE (buffer_pool) clause of the CREATE or ALTER TABLE statement. Determine what tables you want allocated to which of these memory buffers and the ideal size of each buffer when you implement your memory allocation design. These parameter values can be changed only in the initialization parameter file and take effect only after stopping and restarting the database.
When working with very large images, set the DB_CACHE_SIZE parameter to a large number for your Oracle instance. For example, to cache a 40 MB image, set this parameter to a value of 48 MB. Some general guidelines to consider when working with LOB data are:
You should have enough buffers to hold the object, regardless of table LOB logging and cache settings. See Section 8.2 for more information.
When using log files you should make the log files larger, otherwise, more time is spent waiting for log switches. See Section 8.2 for more information.
If the same BLOB is to be accessed frequently, set the table LOB CACHE parameter to TRUE. See Section 8.2 for more information.
Use a large page size (DB_BLOCK_SIZE) if the database is going to contain primarily large objects.
See Oracle Database Performance Tuning Guide for more information about tuning multiple buffer pools.
SHARED_POOL_SIZE
The SHARED_POOL_SIZE parameter specifies the size in bytes of the shared pool that contains the library cache of shared SQL requests, shared cursors, stored procedures, the dictionary cache, and control structures, Parallel Execution message buffers, and other cache structures specific to a particular instance configuration. This parameter value is dynamic. This parameter represents most of the variable size value that is displayed when you issue a SQL SHOW SGA statement. Specifying a large value improves performance in multiuser systems. A large value for example, accommodates the loading and execution of interMedia PL/SQL scripts and stored procedures; otherwise, execution plans are more likely to be swapped out. A large value can also accommodate many clients connecting to the server with each client connection using some shared pool space. However, when the shared pool is full, the database is unable to accept additional client connections.
SHARED_POOL_RESERVED_SIZE
The SHARED_POOL_RESERVED_SIZE parameter specifies the shared pool space that is reserved for large contiguous requests for shared pool memory. This static parameter should be set high enough to avoid performance degradation in the shared pool from situations where pool fragmentation forces Oracle to search for free chunks of unused pool to satisfy the current request.
Ideally, this parameter should be large enough to satisfy any request scanning for memory on the reserved list without flushing objects from the shared pool.
The default value is 5% of the shared pool size, while the maximum value is 50% of the shared pool size. For interMedia applications, a value at or close to the maximum can provide performance benefits.
LOG_BUFFER
The LOG_BUFFER parameter specifies the amount of memory, in bytes, used for buffering redo entries to the redo log file. Redo entries are written to the on disk log file when a transaction commits or when the LOG_BUFFER is full and space must be made available for new redo entries. Large values for LOG_BUFFER can reduce the number of redo log file I/O operations by allowing more data to be flushed for each write operation. Large values can also eliminate the waits that occur when redo entries are flushed to make space in the log buffer pool. interMedia applications that have buffering enabled for the LOB data can generate large amounts of redo data when media is inserted or updated. These applications would benefit from a larger LOG_BUFFER size. This is a static parameter.
The following information provides some strategies to consider when you create tables with interMedia column objects containing BLOBs. You can explicitly indicate the tablespace and storage characteristics for each BLOB. These topics are discussed in more detail and with examples in Oracle Database Application Developer's Guide - Large Objects. The information that follows is excerpted from Chapter 2 and is briefly presented to give you an overview of the topic. Refer to Oracle Database Application Developer's Guide - Large Objects for more information.
An interMedia column object containing a LOB value set to NULL
has no locator. By contrast, an empty LOB stored in a table is a LOB of zero length that has a locator. So, if you select from an empty LOB column or attribute, you get back a locator, which you can use to fill the LOB with data using the OCI or DBMS_LOB routines or ORDxxx.import method.
Setting interMedia Column Objects Containing a BLOB to NULL
You may want to set the BLOB value to NULL
upon inserting the row whenever you do not have the BLOB data at the time of the INSERT operation. In this case, you can issue a SELECT statement at some later time to obtain a count of the number of rows in which the value of the BLOB is NULL
, and determine how many rows must be populated with BLOB data for that particular column object.
However, the drawback to this approach is that you must then issue a SQL UPDATE statement to reset the NULL
BLOB column to EMPTY_BLOB( ). The point is that you cannot call the OCI or the PL/SQL DBMS_LOB functions on a BLOB that is NULL
. These functions work only with a locator, and if the BLOB column is NULL
, there is no locator in the row.
Setting an interMedia Column Object Containing a BLOB to EMPTY
If you do not want to set an interMedia column object containing a BLOB to NULL
, another option is to set the BLOB value to EMPTY
by using the EMPTY_BLOB( ) function in the INSERT statement. Even better, set the BLOB value to EMPTY
by using the EMPTY_BLOB( ) function in the INSERT statement, and use the RETURNING clause (thereby eliminating a round-trip that is necessary for the subsequent SELECT statement). Then, immediately call OCI, the import method, or the PL/SQL DBMS_LOB functions to fill the LOB with data. See Oracle Database Application Developer's Guide - Large Objects for an example.
When you create tables and define interMedia column objects containing BLOBs, you can explicitly indicate the tablespace and storage characteristics for each BLOB. The following guidelines can help you fine-tune BLOB storage.
The best performance for interMedia column objects containing BLOBs can often be achieved by specifying storage for BLOBs in a tablespace that is different from the one used for the table that contains the interMedia object with a BLOB. See the ENABLE | DISABLE STORAGE IN ROW clause near the end of this section for further considerations on storing BLOB data inline or out of line. If many different LOBs are to be accessed frequently, it may also be useful to specify a separate tablespace for each BLOB or attribute in order to reduce device contention. Preallocate the tablespace to the required allocation size to avoid allocation when inserting BLOB data. See Oracle Database SQL Reference for examples, specifically the CREATE TABLE statement and the LOB column example. See Example 8-1.
Example 8-1 assumes that you have already issued a CONNECT statement as a suitably privileged user. This example creates a separate tablespace, called MONTANA, that is used to store the interMedia column object containing BLOB data for the image column. Ideally, this tablespace would be located on its own high-speed storage device to reduce contention. Other image attributes and the imageID column are stored in the default tablespace. The initial allocation allows 100 MB of storage space. The images to be inserted are about 20 KB in size. To improve insert performance, NOCACHE and NOLOGGING options are specified along with a CHUNK size of 24 KB.
Example 8-1 Create a Separate Tablespace to Store an interMedia Column Object Containing LOB Data
SVRMGR> CREATE TABLESPACE MONTANA DATAFILE 'montana.tbs' SIZE 400M; Statement processed. SVRMGR> CREATE TABLE images (imageID INTEGER ,image ORDSYS.ORDImage) LOB (image.source.localData) STORE AS ( TABLESPACE MONTANA STORAGE ( INITIAL 100M NEXT 100M ) CHUNK 24K NOCACHE NOLOGGING );
LOB Index and LOB_index_clause
The LOB index is an internal structure that is strongly associated with the LOB storage.
Note: The LOB_index_clause in the CREATE TABLE statement is deprecated beginning with release 8.1.5. Oracle generates an index for each LOB column and beginning with release 8.1.5, LOB indexes are system named and system managed. For information about how Oracle manages LOB indexes in tables migrated from earlier releases, see Oracle Database Upgrade Guide. |
When an interMedia column object containing a BLOB is modified, a new version of the BLOB page is made in order to support consistent reading of prior versions of the BLOB value.
PCTVERSION is the percent of all used LOB data space that can be occupied by old versions of LOB data pages. As soon as old versions of LOB data pages start to occupy more than the PCTVERSION amount of used LOB space, Oracle tries to reclaim the old versions and reuses them. In other words, PCTVERSION is the percentage of used LOB data blocks that is available for versions of old LOB data.
One way of approximating PCTVERSION is to set PCTVERSION = (% of LOBs updated at any given point in time) times (% of each LOB updated whenever a LOB is updated) times (% of LOBs being read at any given point in time). Allow for a percentage of LOB storage space to be used as old versions of LOB pages so users can get consistent read results of data that has been updated.
Setting PCTVERSION to twice the default allows more free pages to be used for old versions of data pages. Because large queries may require consistent reading of LOBs, it is useful to keep more old versions of LOB pages around. LOB storage may increase if you increase the PCTVERSION value because Oracle will not be reusing free pages aggressively.
The more infrequent and smaller the LOB updates are, the less space that needs to be reserved for old versions of LOB data. If existing LOBs are known to be read-only, you could safely set PCTVERSION to 0% because there would never be any pages needed for old versions of data.
Use the CACHE option on interMedia column objects containing BLOBs if the same BLOB data is to be accessed frequently. The CACHE option puts the data into the database buffer and makes it accessible for subsequent read operations. If you specify CACHE, then LOGGING is used; you cannot have CACHE and NOLOGGING.
Use the NOCACHE option (the default) if BLOB data is to be read only once or infrequently, or if you have too much BLOB data to cache, or if you are reading lots of images but none more frequently than others.
Use the CACHE READS option if the BLOB data is to be brought into the buffer cache only during frequent read operations and not during write operations.
See Example 8-1.
An example of when NOLOGGING is useful is with bulk loading or inserting of data. See Example 8-1. For instance, when loading data into the interMedia column objects containing BLOBs, if you do not care about redo logging and can just start the load over if it fails, set the BLOB data segment storage characteristics to NOCACHE NOLOGGING. This setting gives good performance for the initial loading of data. Once you have successfully completed loading the data, you can use the ALTER TABLE statement to modify the BLOB storage characteristics for the BLOB data segment to the desired storage characteristics for normal BLOB operations, such as CACHE or NOCACHE LOGGING.
Set the CHUNK option to the number of bytes of interMedia column objects containing BLOB data that are to be accessed at one time. That is, the number of bytes that are to be read or written using the object.readFromSource( ) or object.writeToSource( ) interMedia audio and video object methods or call, OCILobRead( ), OCILobWrite( ), DBMS_LOB.READ( ), or DBMS_LOB.WRITE( ) during one access of the BLOB value. Note that the default value for the CHUNK option is 1 Oracle block and does not vary across systems. If only 1 block of BLOB data is accessed at a time, set the CHUNK option to the size of 1 block. For example, if the database block size is 2 KB, then set the CHUNK option to 2 KB.
Set the CHUNK option to the next largest integer multiple of database block size that is slightly larger than the audio, image, or video data size being inserted. Specifying a slightly larger CHUNK option allows for some variation in the actual sizes of the multimedia data and ensures that the benefit is realized. For large-sized media data, a general rule is to set the CHUNK option as large as possible; the maximum is 32 KB. For example, if the database block size is 2 KB or 4 KB or 8 KB and the image data is mostly 21 KB in size, set the CHUNK option to 24 KB. See Example 8-1.
If you explicitly specify the storage characteristics for the interMedia column object containing a BLOB, make sure that the INITIAL and NEXT parameters for the BLOB data segment storage are set to a size that is larger than the CHUNK size. For example, if the database block size is 2 KB and you specify a CHUNK value of 8 KB, make sure that the INITIAL and NEXT parameters are at least 8 KB, preferably higher (for example, at least 16 KB).
For LOB storage, Oracle automatically builds and maintains a LOB index that allows quick access to any chunk and thus any portion of a LOB. The LOB index gets the same storage extent parameter values as its LOBs. Consequently, to optimize LOB storage space, you should calculate the size of your LOB index size as well as the total storage space needed to store the media data including its overhead.
Assume that N files composed of M total bytes of media data are to be stored and that the value C represents the size of the LOB CHUNK storage parameter. To calculate the total number of bytes Y needed to store the media data:
Y = M + (N*C)
The expression (N*C) accounts for the worst case in which the last CHUNK of each LOB contains a single byte. Therefore, an extra CHUNK is allowed for each file that is stored. On average, the last CHUNK will be half full.
To calculate the total number of bytes X to store the LOB index:
X = CEIL(M/C) * 32
The value 32 indicates that the LOB index requires roughly 32 bytes for each CHUNK that is stored.
The total storage space needed for the media data plus its LOB index is then X + Y.
The following two examples describe these calculations in detail.
Example 1: Assume you have 500 video clips comprising a total size of 250 MB with an average size is 512 KB. Assume a LOB CHUNK size of 32768 bytes. The total space needed for the media data is 250 MB + (5000*32768) or 266 MB. The overhead is 16 MB or about 6.5% storage overhead. The total space needed to store the LOB index is CEIL(250 MB/32768) * 32 or 244 KB. The total space needed to store the media data plus its LOB index is then about 266.6 MB.
SQL> SELECT 250000000+(500*32768)+CEIL(250000000/32768)*32 FROM dual; 250000000+(500*32768)+CEIL(250000000/32768)*32 ---------------------------------------------- 266628160
The following table definition could be used to store this amount of data:
CREATE TABLE video_items ( video_id NUMBER, video_clip ORDSYS.ORDVideo ) -- storage parameters for table in general TABLESPACE video1 STORAGE (INITIAL 1M NEXT 10M) -- special storage parameters for the video content LOB(video_clip.source.localdata) STORE AS (TABLESPACE video2 STORAGE (INITIAL 260K NEXT 270M) DISABLE STORAGE IN ROW NOCACHE NOLOGGING CHUNK 32768);
Example 2: Assume you have 5000 images comprising a total size of 274 MB with an average size of 56 KB. Because the average size of the images are smaller than the video clips in the preceding example, it is more space efficient to choose a smaller CHUNK size, for example 8192 bytes to store the data in the LOB. The total space needed for the media data is 274 MB + (5000*8192) or 314 MB. The overhead is about 40 MB or about 15% storage overhead. The total space needed to store the LOB index is CEIL(274 MB/8192) * 32 or 1.05 MB. The total space needed to store the media data plus its LOB index is then about 316 MB.
SQL> SELECT 274000000+(5000*8192)+CEIL(274000000/8192)*32 FROM dual; 274000000+(5000*8192)+CEIL(274000000/8192)*32 --------------------------------------------- 316030336
The following table definition could be used to store this amount of data:
CREATE TABLE image_items ( image_id NUMBER, image ORDSYS.ORDImage ) -- storage parameters for table in general TABLESPACE image1 STORAGE (INITIAL 1M NEXT 10M) -- special storage parameters for the image content LOB(image.source.localdata) STORE AS (TABLESPACE image2 STORAGE (INITIAL 1200K NEXT 320M) DISABLE STORAGE IN ROW NOCACHE NOLOGGING CHUNK 8192);
When working with very large BLOBs of about 1 gigabyte in size, choose a proportionately large INITIAL and NEXT extent parameter size, for example an INITIAL value slightly larger than your calculated LOB index size and a NEXT value of 100 MB, to reduce the frequency of extent creation, or commit the transaction more often to reuse the space in the rollback segment; otherwise, if the number of extents is large, the rollback segment can become saturated.
Set the PCTINCREASE parameter value to 0 to make the growth of new extent sizes more manageable. When working with very large BLOBs and the BLOB is being filled up piece by piece in a tablespace, numerous new extents are created in the process. If the extent sizes keep increasing by the default value of 50% each time one is created, extents will become unmanageably big and eventually will waste space in the tablespace.
Set the MAXEXTENTS parameter value to suit the projected size of the BLOB or set it to UNLIMITED for safety. That is, when MAXEXTENTS is set to UNLIMITED, extents will be allocated automatically as needed and this minimizes fragmentation.
ENABLE | DISABLE STORAGE IN ROW Clause
You use the ENABLE | DISABLE STORAGE IN ROW clause to indicate whether the interMedia column objects containing a BLOB should be stored inline (that is, in the row) or out of line. You may not alter this specification once you have made it: if you ENABLE STORAGE IN ROW, you cannot alter it to DISABLE STORAGE IN ROW or the reverse. The default is ENABLE STORAGE IN ROW.
The maximum amount of LOB data that will be stored in the row is the maximum VARCHAR size (4000). Note that this includes the control information as well as the LOB value. If the user indicates that the LOB should be stored in the row, once the LOB value and control information are larger than 4000 bytes, the LOB value is automatically moved out of the row.
This suggests the following guideline: If the interMedia column object containing a BLOB is small (that is, less than 4000 bytes), then storing the BLOB data out of line will decrease performance. However, storing the BLOB in the row increases the size of the row. This has a detrimental impact on performance if you are doing a lot of base table processing, such as full table scans, multiple row accesses (range scans), or doing many UPDATE or SELECT statements to columns other than the interMedia column objects containing BLOBs. If you do not expect the BLOB data to be less than 4000 bytes, that is, if all BLOBs are big, then the default is the best choice because:
The LOB data is automatically moved out of line once it gets bigger than 4000 bytes.
Performance can be better if the BLOB data is small (less than 4000 bytes including control information) and is stored inline because the LOB locator and the BLOB data can be retrieved in the same buffer, thus reducing I/O operations.
The following physical attribute is important for optimum storage of BLOB data in the data block and consequently achieving optimum retrieval performance.
The PCTFREE parameter specifies the percentage of space in each data block of the table or partition reserved for future updates to each row of the table. Setting this parameter to an appropriate value is useful for efficient inline storage of multimedia data. The default value is 10%.
Set this parameter to a high enough value to avoid row chaining or row migration. Because the INSERT statement for BLOBs requires an EMPTY_BLOB column object initialization followed by an UPDATE statement to load the BLOB data into the data block, you must set the PCTFREE parameter value to a proper value especially if the BLOB data will be stored inline. For example, row chaining can result after a row INSERT operation when insufficient space is reserved in the existing data block to store the entire row, including the inline BLOB data in the subsequent UPDATE operation. As a result, the row would be broken into multiple pieces and each piece stored in a separate data block. Consequently, more I/O operations would be needed to retrieve the entire row, including the BLOB data, resulting in poorer performance. Row migration can also result if there is insufficient space in the data block to store the entire row during the initial INSERT operation, and thus the row is stored in another data block.
To make best use of the PCTFREE parameter, determine the average size of the BLOB data being stored inline in each row, and then determine the entire row size, including the inline BLOB data. Set the PCTFREE parameter value to allow for sufficient free space to store an entire row of data in the data block. For example, if you have a large number of thumbnail images that are about 3 KB in size, and each row is about 3.8 KB in size, and the database block size is 8 KB, set the value of PCTFREE to a value that ensures that two complete rows can be stored in each data block in the initial INSERT operation. This approach initially uses 1.6 KB of space (0.8 KB/row *2 rows) leaving 6.4 KB of free space. Because two rows initially use 20% of the data block and 95% after an UPDATE operation and adding a third row would initially use 30% of the data block causing a chain to occur when the third row is updated, set the PCTRFEE parameter value to 75. This setting permits a maximum of two rows to be stored in each data block and leaves sufficient space to update each row with its 3 KB thumbnail image leaving about 0.4 KB free space minus overhead for each data block.
There are a number of bulk loading methods available for loading FILE data into interMedia objects containing BLOBs. These include:
interMedia import( ) method in a PL/SQL stored procedure
SQL*Loader (conventional path load and direct path load)
OCILobLoadFromFile( ) relational function
DBMS_LOB.LOADFROMFILE( ) procedure in the DBMS_LOB package
DBMS_LOB.LOADBLOBFROMFILE( ) procedure in the DBMS_LOB package
Java loadDataFromFile( ) or loadDataFromInputStream( ) methods of Oracle interMedia Java Classes to load media data from a client file
Using interMedia import( ) Method in a PL/SQL Stored Procedure
Example 8-2 shows the contents of the load1.bat
file, which invokes SQL*Plus and runs the t1.sql
procedure (Example 8-3). The db_block_size for this schema is 8 kilobytes.
Example 8-2 Show the load1.bat File
sqlplus scott/tiger@intertcp @t1
Example 8-3 shows the contents of the t1.sql
file. This procedure:
Creates two tablespaces.
Creates the image_items
table and defines the physical properties of the table, specifically the physical attributes and LOB storage attributes.
Partitions the table storage into each tablespace by range using the image_id
value.
Creates the load_image
stored procedure that:
Declares a variable nxtseq
defined as the ROWID data type.
Inserts a row into the image_items
table and uses the INSERT RETURNING ROWID statement to return the ROWID value for fastest access to the row for loading the image BLOB data into the object columns of each row using the import( ) method.
Sets the image attribute properties automatically (by means of the import operation) for each loaded image (note that thumbnail images are stored inline, and regular images are stored out of line).
Commits the update operation.
Example 8-3 Show the t1.sql Procedure
spool t1.log set echo on connect sys/change_on_install as sysdba create tablespace Image_h default storage (initial 30m next 400m pctincrease 0) datafile 'h:\IMPB\Image_h.DBF' size 2501M reuse; create tablespace Image_i default storage (initial 30m next 400m pctincrease 0) datafile 'i:\IMPB\Image_i.DBF' size 2501M reuse; connect scott/tiger drop table image_items; create table image_items( image_id number,-- constraint pl_rm primary key, image_title varchar2(128), image_artist varchar2(128), image_publisher varchar2(128), image_description varchar2(1000), image_price number(6,2), image_file_path varchar2(128), image_thumb_path varchar2(128), image_thumb ordsys.ordimage, image_clip ordsys.ordimage ) -- -- physical properties of table -- -- physical attributes clause pctfree 35 storage (initial 30M next 400M pctincrease 0) -- LOB storage clause (applies to LOB column) LOB (image_clip.source.localdata) store as (disable storage in row nocache nologging chunk 32768) -- -- table properties (applies to whole table) -- Partition by range (image_id) ( Partition Part1 values less than (110001) Tablespace image_h, Partition Part2 values less than (maxvalue) Tablespace image_i ); connect scott/tiger; create or replace procedure load_image ( image_id number, image_title varchar2, image_artist varchar2, image_publisher varchar2, image_description varchar2, image_price number, image_file_path varchar2, image_thumb_path varchar2, thumb_dir varchar2, content_dir varchar2, file_name1 varchar2, file_name2 varchar2) as ctx raw(4000) := NULL; obj1 ORDSYS.ORDIMAGE; obj2 ORDSYS.ORDIMAGE; nxtseq rowid; Begin Insert into image_items( image_id, image_title, image_artist, image_publisher, image_description, image_price, image_file_path, image_thumb_path , image_thumb, image_clip) values ( image_id, image_title, image_artist, image_publisher, image_description, image_price, image_file_path, image_thumb_path , ORDSYS.ORDIMAGE.init('FILE',upper(thumb_dir),file_name1), ORDSYS.ORDIMAGE.init('FILE',upper(content_dir),file_name2)) returning rowid into nxtseq; -- load up the thumbnail image select t.image_thumb, t.image_clip into obj1, obj2 from image_items t where t.rowid = nxtseq for update; obj1.import(ctx); -- import sets properties obj2.import(ctx); Update image_items I set I.image_thumb = obj1, I.image_clip = obj2 where i.rowid = nxtseq; Commit; End; / spool off set echo off
Example 8-4 shows the contents of the load1.sql
file. The image load directories are created and specified for each tablespace and user scott
is granted read privilege on each load directory. The stored procedure named load_image
is then executed, which loads values for each column row. By partitioning the data into different tablespaces, each partition can be loaded in a parallel data load operation.
Example 8-4 Show the load1.sql File That Executes the load_image Stored Procedure
connect sys/change_on_install as sysdba drop directory IMAGE_H; drop directory IMAGE_I; create directory IMAGE_H as 'h:\image_files'; create directory IMAGE_I as 'i:\image_files'; grant read on directory IMAGE_H to scott; grant read on directory IMAGE_I to scott; EXEC Load_image(100001,'T_100001',1916,'Publisher','Visit our WEB page' ,8.71,'image_I\T_100001.jpg','image_I\T_100001_thumb1.jpg','image_I','image_I','T_100001_thumb1.jpg','T_100001.jpg'); EXEC Load_image(100002,'T_100002',2050,'Publisher','Visit our WEB page' ,9.61,'image_I\T_100002.jpg','image_I\T_100002_thumb10.jpg','image_I','image_I','T_100002_thumb10.jpg','T_100002.jpg'); exit
Using SQL*Loader
SQL*Loader provides two methods for loading data:
Conventional Path Load
A conventional path load (the default) uses the SQL INSERT statement and a bind array buffer to load data into database tables. When SQL*Loader performs a conventional path load, it competes equally with all other processes for buffer resources. This can slow the load significantly. Extra overhead is added as SQL statements are generated, passed to Oracle, and executed. Oracle looks for partially filled blocks and attempts to fill them on each insert operation. Although appropriate during normal use, this can slow bulk loads dramatically. Use conventional path load if you encounter certain restrictions on direct path loads.
Direct Path Load
A direct path load eliminates much of the database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files. A direct load does not compete with other users for database resources, so it can usually load data at near disk speed. In addition, if the asynchronous I/O operations feature is available on your host platform, multiple buffers are used for the formatted data blocks to further increase load performance.
See Oracle Database Utilities for a complete list of restrictions for using either the conventional path load or direct path load method for loading data using SQL*Loader. See Oracle Database Application Developer's Guide - Fundamentals for more information about LOBs.
Using SQL*Loader to Load Multimedia Data into Oracle Database Using interMedia Column Objects
Example 8-5 shows the use of the control file to load one ORDVideo object for each file into a table named JUKE
that has three columns, with the last one being a column object. Each LOB file is the source of a single LOB and follows the column object name with the LOBFILE data type specifications. Two LOB files are loaded in this example.
Example 8-5 Show the Control File for Loading Video Data
LOAD DATA
INFILE *
INTO TABLE JUKE
REPLACE
FIELDS TERMINATED BY ','
( id integer external,
file_name char(1000),
mediacontent column object
(
source column object
(
1) localData_fname FILLER CHAR(128),
2) localData LOBFILE (mediacontent.source.localData_fname) terminated by EOF
) ) ) BEGINDATA 1,slynne,slynne.rm 2,Commodores,Commodores - Brick House.rm
Notes:
The FILLER field is mapped to the 128-byte long data field which is read using the SQL*Loader CHAR data type.
SQL*Loader gets the LOB file name from the localData_fname FILLER field. It then loads the data from the LOB file (using the BLOB data type) from its beginning to the EOF character, whichever is reached first. Note that if no existing LOB file is specified, the localData
field is initialized to empty.
Using the OCILobLoadFromFile( ) Relational Function
Oracle Call Interface (OCI) is an application programming interface (API) that enables you to manipulate data and schemas in a database using a host programming language, such as C.
The OCI relational function, OCILobLoadFromFile( ), loads or copies all or a portion of a file into an interMedia column object containing a specified BLOB. The data is copied from the source file to the destination interMedia column objects containing a BLOB. When binary data is loaded into an interMedia column object containing a BLOB, no character set conversions are performed. Therefore, the file data must already be in the same character set as the BLOB in the database. No error checking is performed to verify this.
See Oracle Call Interface Programmer's Guide for more information.
Using the DBMS_LOB.LOADFROMFILE( ) Procedure in the DBMS_LOB Package
The DBMS_LOB package provides subprograms to operate on BLOBs, CLOBs, NCLOBs, BFILEs, and temporary LOBs. You can use the DBMS_LOB package for access and manipulation of specific parts of an interMedia column object containing a BLOB, as well as complete BLOBs. DBMS_LOB can read as well as modify BLOBs, CLOBs, and NCLOBs, and provides read-only operations for BFILEs. The majority of the LOB operations are provided by this package.
The DBMS_LOB.LOADFROMFILE( ) procedure copies all, or part of, a source-external LOB (BFILE) to a destination internal LOB.
You can specify the offsets for both the source LOB (BFILE) and destination interMedia column object containing the BLOB and the number of bytes to copy from the source BFILE. The amount and src_offset, because they refer to the BFILE, are in terms of bytes, and the destination offset is either in bytes or characters for BLOBs and CLOBs respectively.
The input BFILE must have been opened prior to using this procedure. No character-set conversions are performed implicitly when binary BFILE data is loaded into a CLOB. The BFILE data must already be in the same character set as the CLOB in the database. No error checking is performed to verify this. See Oracle Database PL/SQL Packages and Types Reference for more information.
Using the DBMS_LOB.LOADBLOBFROMFILE( ) Procedure in the DBMS_LOB Package
The DBMS_LOB.LOADBLOBFROMFILE( ) procedure loads a persistent or temporary BLOB instance with data from a BFILE. This procedure achieves the same result as using DBMS_LOB.LOADFROMFILE, but returns the new offset in bytes in the destination BLOB right after the end of the write operation, which is also where the next write operation should begin and the offset in bytes in the source BFILE right after the end of the read operation, which is also where the next read operation should begin.
To use this procedure, you can specify the offsets for both the source and destination LOBs, and the number of bytes to copy from the source BFILE. The value you pass for the amount
parameter to the DBMS_LOB.LOADBLOBFROMFILE function must be either an amount less than or equal to the actual size (in bytes) of the BFILE you are loading or the maximum allowable LOB size: DBMS_LOB.LOBMAXSIZE. Passing this latter value causes the function to load the entire BFILE, which is a useful technique for loading the entire BFILE without introspecting the size of the BFILE.
To use this procedure, the target BLOB instance and the source BFILE must both exist and the BFILE must be opened and later closed after calling this procedure.
See Oracle Database PL/SQL Packages and Types Reference for more information.
Using Java loadDataFrom...( ) Methods to Load Media Data from a Client File
From the Java client, you can use the Java loadDataFromByteArray( ), loadDataFromFile( ), or loadDataFromInputStream( ) methods of Oracle interMedia Java Classes to load media data from a given file into a server-side media object designated by the corresponding media locator parameters. You must specify the name of the file from which to load the data and the method returns true if loading is successful, false otherwise. See Oracle interMedia Java Classes API Reference for more information.
Oracle Data Pump ("Data Pump") enables very high-speed movement of data and metadata from one database to another using the Data Pump Export and Data Pump Import utilities. Data Pump enables you to specify whether or not a job should move a subset of the data and metadata. This is done using data filters and metadata filters, which are implemented through Export and Import parameters using the Metadata API and the Data Pump API. The Metadata API uses the procedures provided in the DBMS_METADATA PL/SQL package and the Data Pump API uses the procedures provided in the DBMS_DATAPUMP PL/SQL package. See Oracle Database Concepts, Oracle Database Utilities, and Oracle Database PL/SQL Packages and Types Reference for more information.
Example 8-6 shows the contents of the readvideo1.sql
file. This procedure reads data from an ORDVideo object with the video stored in a BLOB in the database using the readFromSource( ) method in a PL/SQL script until no more data is found. The procedure then returns a NO_DATA_FOUND exception when the read operation is complete and displays an "End of data" message.
Note: This example can be modified to work with the ORDAudio, ORDDoc, and ORDImage objects too. |
Example 8-6 Read Data from an ORDVideo Column Object Using the interMedia readFromSource( ) Method in a PL/SQL Stored Procedure
create or replace procedure readVideo1(i integer) as obj ORDSYS.ORDVideo; buffer RAW (32767); numbytes BINARY_INTEGER := 32767; startpos integer := 1; read_cnt integer := 1; ctx RAW(4000) := NULL; BEGIN Select mediacontent into obj from juke where id = 100001; LOOP obj.readFromSource(ctx,startpos,numbytes,buffer); startpos := startpos + numBytes; read_cnt := read_cnt + 1; END LOOP; EXCEPTION WHEN NO_DATA_FOUND THEN DBMS_OUTPUT.PUT_LINE('End of data '); DBMS_OUTPUT.PUT_LINE('doing read '|| read_cnt); DBMS_OUTPUT.PUT_LINE('start position :'|| startpos); END; / show errors
The following guidelines can be used to help you achieve the best performance when working with interMedia objects:
Because interMedia objects are big, attain the best performance by reading and writing large CHUNKS of an interMedia object value at a time. This helps in several respects:
If you are accessing the interMedia object from the client side and the client is on a different node than the server, large read/write operations reduce network overhead.
If you are using the NOCACHE option, each small read/write operation incurs an I/O impact. Reading and writing large quantities of data reduces the I/O impact.
Writing to the interMedia object creates a new version of the interMedia object CHUNK. Therefore, writing small amounts at a time will incur the cost of a new version for each small write operation. If logging is on, the CHUNK is also stored in the redo log.
If you need to read or write small pieces of interMedia object data on the client, use LOB buffering (see OCILobEnableBuffering( ), OCILobDisableBuffering( ), OCILobFlushBuffer( ), OCILobWrite( ), OCILobRead( ) in Oracle Call Interface Programmer's Guide for more information.). Turn on LOB buffering before reading or writing small pieces of interMedia object data. For more information about LOB buffering, its advantages, guidelines for use, and usage, see Oracle Database Application Developer's Guide - Large Objects.
Use interMedia methods (readFromSource( ) and writeToSource( )) for audio and video data or OCILobWrite( ) and OCILobRead( ) with a callback for image data so media data is streamed to and from the BLOB. Ensure that the length of the entire write operation is set in the numBytes
parameter using interMedia methods or in the amount
parameter using OCI calls on input. Whenever possible, read and write in multiples of the LOB CHUNK size.
Use a checkout/checkin model for LOBs. LOBs are optimized for the following:
Updating interMedia object data: SQL UPDATE operations, which replaces the entire BLOB value.
Copying the entire LOB data to the client, modifying the LOB data on the client side, and copying the entire LOB data back to the database. This can be done using OCILobRead( ) and OCILobWrite( ) with streaming.
Commit changes frequently.
Follow temporary LOB performance guidelines.
See Oracle Database Application Developer's Guide - Large Objects for information and guidelines about using temporary LOBs.
Use interMedia column objects containing BLOBs in table partitions.
See the information about LOBs in partitioned tables in Oracle Database Application Developer's Guide - Large Objects and see Oracle Database SQL Reference for examples, specifically the CREATE TABLE statement and the Partitioned Table with LOB Columns example.
See Oracle Database Application Developer's Guide - Large Objects for more information.
Once the LOB data is stored in the database, a modified strategy must be used to improve the performance of retrieving and updating the LOB data compared to the insertion strategy described in Section 8.3. The following guidelines should be considered:
Use the CACHE option on LOBs if the same LOB data is to be accessed frequently by other users.
Increase the number of buffers if you are going to use the CACHE option.
Have enough buffers to hold the object. Using a small number of buffers for large objects is not good. Set the DB_CACHE_SIZE parameter to a value that you know will hold the object.
Ensure that your redo log files are much larger than they usually are; otherwise, you may be waiting for log switches, especially if you are making many updates to your LOB data.
Ensure that you use a larger page size (DB_BLOCK_SIZE), especially if the majority of the data in the database is LOB data.