Oracle® OLAP Developer's Guide to the OLAP API 10g Release 1 (10.1) Part Number B10335-02
This chapter introduces the Oracle OLAP API to application developers who plan to use it in their Java applications.
This chapter includes the following topics:
The OLAP API is a Java application programming interface (API) through which an application can access data for online analytical processing (OLAP). The Java classes that implement the API are part of the Oracle OLAP component.
The purpose of the OLAP API is to facilitate the development of OLAP applications, which allow users to dynamically select, aggregate, calculate, and perform other analytical tasks on data through a graphical user interface. Typically, the user interface of an OLAP application displays data in multidimensional formats, such as graphs and crosstabs.
In general, OLAP applications are developed within the context of business intelligence and data warehousing systems, and the features of the OLAP API are optimized for this type of application. With the OLAP API, a Java application can access, manipulate, and display data in multidimensional terms. The OLAP API also makes it possible to define a query in a step-by-step process that allows for undoing individual query steps without reproducing the entire query. Such multistep queries are easy to modify and refine dynamically.
Data warehousing and OLAP applications are based on a multidimensional view of data, and they work with queries that represent selections of data. The following definitions introduce concepts that reflect the multidimensional view and are basic to data warehousing, OLAP, and the OLAP API:
Dimension. A structure that categorizes data. Commonly used dimensions are customers, products, and times. Typically, a dimension is associated with one or more hierarchies. Several distinct dimensions, combined with measures, enable end users to answer business questions. For example, a times dimension that categorizes data by month helps to answer the question, "Did we sell more widgets in January or June?"
Measure. Data, usually numeric and additive, that can be examined and analyzed. Typically, a given measure is categorized by one or more dimensions, and it is described as "dimensioned by" them.
Hierarchy. A logical structure that uses ordered levels or values as a means of organizing dimension elements in parent-child relationships. Typically, end users can expand or collapse the hierarchy by drilling down or up on its levels.
Level. A position in a level-based hierarchy. For example, a times dimension might have a hierarchy that represents data at the day, month, quarter, and year levels.
Attribute. A descriptive characteristic of the elements of a dimension that an end user can specify to select data. For example, end users might choose products using a color attribute.
Query. A specification for a particular set of data, and for aggregations, calculations, or other operations to perform using the data. Any such operations on the data are an intrinsic part of the query. The data and the operations on it define the result set of the query.
Two additional data warehouse and OLAP concepts, cube and edge, are not intrinsic to the OLAP API, but are often incorporated into the design of applications that use the OLAP API.
Cube. A logical organization of multidimensional data. Typically, the edges of a cube contain dimension values, and the body of a cube contains measure values. For example, data on the quantity of product units sold can be organized into a cube whose edges contain values from the time, product, customer, and channel dimensions and whose body contains values from the units sold measure.
Edge. One side of a cube. Each edge contains values from one or more dimensions. Although there is no limit to the number of edges on a cube, data is often organized for display purposes along three edges, which are referred to as the row edge, column edge, and page edge.
For more information about all of these concepts, see the Oracle Data Warehousing Guide.
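As a rough illustration of how a measure, a dimension, a hierarchy, and its levels relate, the following sketch rolls a month-level units measure up to the quarter level of a time hierarchy. It uses plain Java collections rather than OLAP API classes. The RollupSketch class, the TIME_PARENT map, and the sample values are assumptions invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names are OLAP API types.
final class RollupSketch {
    // Parent-child relationships of a two-level time hierarchy: month -> quarter.
    static final Map<String, String> TIME_PARENT = new LinkedHashMap<>();
    static {
        TIME_PARENT.put("2002-01", "2002-Q1");
        TIME_PARENT.put("2002-02", "2002-Q1");
        TIME_PARENT.put("2002-04", "2002-Q2");
    }

    // Aggregates a measure that is dimensioned by month to the quarter level
    // by summing the values of the children of each quarter.
    static Map<String, Double> rollUp(Map<String, Double> unitsByMonth) {
        Map<String, Double> unitsByQuarter = new TreeMap<>();
        for (Map.Entry<String, Double> entry : unitsByMonth.entrySet()) {
            String quarter = TIME_PARENT.get(entry.getKey());
            if (quarter == null) {
                continue; // month is not part of the hierarchy in this sketch
            }
            unitsByQuarter.merge(quarter, entry.getValue(), Double::sum);
        }
        return unitsByQuarter;
    }

    public static void main(String[] args) {
        Map<String, Double> units = new LinkedHashMap<>();
        units.put("2002-01", 10.0);
        units.put("2002-02", 5.0);
        units.put("2002-04", 7.0);
        System.out.println(rollUp(units)); // prints {2002-Q1=15.0, 2002-Q2=7.0}
    }
}
```

Drilling down corresponds to reading the month-level map, and drilling up corresponds to reading the result of rollUp.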
The OLAP API, as part of Oracle OLAP, makes it possible for Java applications (including applets) to access data that resides in an Oracle data warehouse. A data warehouse is a relational database that is designed for query and analysis, rather than transaction processing. Warehouse data often conforms to a star schema, which represents a multidimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys. Typically, a data warehouse is created from a transaction processing database by an extraction transformation transport (ETT) tool, such as Oracle Warehouse Builder.
In order for the OLAP API to access the data in a given data warehouse, a database administrator must first ensure that the data warehouse is configured according to an organization that is supported by Oracle OLAP. The star schema is one such organization, but not the only one. Once the data is organized in the warehouse, the database administrator must map the data to OLAP metadata objects and add them to the OLAP Catalog. Finally, with the metadata in place, an application can access both the data and the metadata through the OLAP API.
See the Oracle OLAP Application Developer's Guide for information about supported data warehouse configurations and about creating OLAP Catalog metadata.
The collection of warehouse data that a database administrator has mapped to OLAP Catalog elements is the data store to which the OLAP API gives access. Each user who accesses data through the OLAP API might have security restrictions that limit the scope of the data that he or she can access within the data store.
With the classes in the oracle.olapi.metadata.mtm package, an application developer who is familiar with SQL and with the mapping of the relational tables and views to the OLAP Catalog metadata can create custom metadata objects. For more information, see Chapter 5, "Working with Metadata Mapping Objects".
Through the OLAP API, an application can do the following:
Establish a connection to a data store.
Explore the metadata to discover what data is available for viewing or analysis.
Create queries that specify and manipulate the data according to the needs of application users (for example, selecting, aggregating, and calculating data).
Retrieve query results that are structured for display in multidimensional format.
Modify existing queries, rather than totally redefine them, as application users refine their analyses.
The OLAP API is a Java API, so it has all of the advantages of the Java environment. It is platform independent, and it provides the benefits of an object-oriented API, such as abstraction, encapsulation, polymorphism, and inheritance. These strengths are built into the OLAP API, and because the client application is written in Java, its code can also take advantage of them.
In order to work with the OLAP API, application developers should have familiarity with Java, object-oriented programming, relational databases, data warehousing, and multidimensional OLAP concepts.
This documentation has examples of OLAP API code that use a relational schema, named the Global schema, and an analytic workspace generated from that relational schema. For the complete code of the examples in this documentation, see the Overview of the Oracle OLAP Java API Reference.
The OLAP Catalog for the Global schema has the following measures:
UNITS, which has the quantities of product units sold.
UNIT_COST, which has the cost of a unit.
UNIT_PRICE, which has the price of a unit.
The data in the measures is identified by detailed (leaf-level) data or aggregate (node-level) data from dimensions. The UNITS measure is dimensioned by the following dimensions:
PRODUCT, which has a hierarchy of product values named PRODUCT_ROLLUP. The leaf level of the hierarchy has product item identification numbers and the higher levels have product family, class, and total products identifiers.
CUSTOMER, which has two hierarchies of customer values, named SHIPMENTS_ROLLUP and MARKET_ROLLUP. The lowest level of each hierarchy has customer identification numbers. The higher levels of SHIPMENTS_ROLLUP have warehouse, region, and total customers identifiers, and the higher levels of MARKET_ROLLUP have account, market segment, and total market identifiers.
TIME, which has a hierarchy of calendar year time period identifiers.
CHANNEL, which has a hierarchy of sales channel identifiers.
The UNIT_COST and UNIT_PRICE measures are dimensioned by the following two dimensions:
PRODUCT
TIME
For an example of a program that discovers the OLAP Catalog metadata for the Global schema, see Chapter 4, "Discovering the Available Metadata".
The OLAP Catalog metadata describes the data that is available to the OLAP API through a connection to the database. The metadata records three things:
The existence of sets of data. For example, a measure of unit price figures, dimensions of product and time values, and attributes that contain information about the elements of the dimensions all exist as named entities in the data store.
The structure of the sets of data. For example, the unit price measure is dimensioned by products and times, an attribute is dimensioned by the dimension for which it records information, and the elements of the dimensions are organized into hierarchical levels.
The characteristics of the data. For example, the unit price measure contains numeric values that are specified by the dimension element values, the dimensions have String values that identify the product or time values and the hierarchical levels, and the dimensions have attributes that provide additional information, such as a descriptive name for each dimension element that can be used in reports.
In contrast, the fact that the price of product 13, which is the Envoy Standard portable PC, was 2426.07 dollars in July 2002 is data, not metadata.
These examples distinguish between the metadata and the data for the measure of unit prices. The OLAP API makes a similar distinction between the metadata and the data for dimensions. For example, the fact that a product dimension exists and that it has text values as elements is metadata. In contrast, the fact that the value of one of its elements is 13 is data.
The OLAP API multidimensional metadata (MDM) model describes data in multidimensional terms, which are familiar to OLAP and data warehousing audiences. For example, it includes objects for measures, dimensions, hierarchies, and attributes.
The following are some of the Java classes that are supplied by the OLAP API in its implementation of the MDM model:
MdmSchema
MdmMetadataProvider
MdmMeasure
MdmDimension
MdmHierarchy
MdmLevel
MdmAttribute
An MdmSchema is a container for MdmMeasure, MdmDimension, and other MdmSchema objects. An MdmSchema corresponds to a measure folder in the OLAP management feature of Oracle Enterprise Manager. Note that an MdmSchema does not necessarily correspond to a relational schema.
An MdmMetadataProvider gives an application access to metadata objects that were created by a database administrator using the OLAP management feature of Oracle Enterprise Manager. To obtain access to the metadata, an application uses the getRootSchema method of an MdmMetadataProvider. This method returns the top-level MdmSchema, which contains all of the MdmDimension objects that are accessible through this particular MdmMetadataProvider. The MdmDimension objects might be organized in a hierarchical tree, with subschemas nested under the top-level schema. Using the getMeasureDimension, getSubSchemas, and getDimensions methods of the top-level MdmSchema, and the getSubSchemas, getMeasures, and getDimensions methods of all of the nested MdmSchema objects, an application navigates through the metadata and discovers what data is available. In addition, the application can use methods to obtain the related MdmMeasure, MdmHierarchy, MdmLevel, and MdmAttribute objects.
Chapter 2, "Understanding OLAP API Metadata", provides detailed information about the OLAP API metadata.
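The navigation through nested schemas can be pictured as a depth-first walk over a tree of measure folders. The following sketch shows that walk with a stand-in class instead of the real metadata classes, so it runs without an Oracle database. SchemaNode, MetadataWalkSketch, and the folder and measure names are hypothetical; an actual application would obtain the equivalent objects through the getRootSchema, getSubSchemas, getMeasures, and getDimensions methods named in this section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a measure folder; this is not the MdmSchema class.
final class SchemaNode {
    final String name;
    final List<String> measures = new ArrayList<>();
    final List<String> dimensions = new ArrayList<>();
    final List<SchemaNode> subSchemas = new ArrayList<>();

    SchemaNode(String name) {
        this.name = name;
    }
}

final class MetadataWalkSketch {
    // Visits the schema and every nested subschema, collecting one
    // "folder/.../measure" path for each measure that it finds.
    static void collectMeasures(SchemaNode schema, String parentPath, List<String> found) {
        String path = parentPath.isEmpty() ? schema.name : parentPath + "/" + schema.name;
        for (String measure : schema.measures) {
            found.add(path + "/" + measure);
        }
        for (SchemaNode sub : schema.subSchemas) {
            collectMeasures(sub, path, found);
        }
    }

    public static void main(String[] args) {
        SchemaNode root = new SchemaNode("ROOT");
        SchemaNode global = new SchemaNode("GLOBAL");
        global.measures.add("UNITS");
        global.measures.add("UNIT_COST");
        global.measures.add("UNIT_PRICE");
        root.subSchemas.add(global);

        List<String> found = new ArrayList<>();
        collectMeasures(root, "", found);
        System.out.println(found);
    }
}
```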
An MdmMeasure or MdmDimension represents data in the data store. For example, an MdmMeasure object named units might represent a set of numeric elements whose values are dollar amounts for units sold, and an MdmDimension called prodDim might represent a set of text elements whose values are product identifiers. However, an application cannot create a query on the data using an MdmMeasure or MdmDimension. As metadata, MdmMeasure and MdmDimension objects provide descriptive information about data, but they do not provide the ability to construct a query that specifies the data. To select, calculate, and otherwise manipulate data for analysis, an application must create a query.
To create a query on the data for an MdmMeasure or MdmDimension, an application calls the getSource method of the MdmMeasure or MdmDimension. This method creates a Source object that specifies a query. The query defines a result set, and, in this case, the result set is the data for the MdmMeasure or MdmDimension.
In addition to representing the data for metadata objects, Source objects can represent the data for any query that an application creates. For example, a Source might specify a query for a selection of MdmDimension values (such as January, February, and March of the year 2002) or a calculation of the values of one MdmMeasure minus those of another (such as unitPrice minus unitCost). An application can use the powerful methods of the Source class and its subclasses to combine data in any way that the user requires. Each new query is a new Source.
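The idea that every selection or calculation yields a new query object can be sketched without the OLAP API. In the following example, QuerySketch is a hypothetical stand-in for a query over values keyed by time period; its select and minus methods are invented for illustration and are not methods of the Source class. Each operation leaves the original object unchanged and returns a new one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a query object; this is not the OLAP API Source class.
final class QuerySketch {
    private final Map<String, Double> values; // time period -> measure value

    QuerySketch(Map<String, Double> values) {
        this.values = new LinkedHashMap<>(values);
    }

    // Returns a new query restricted to the given time periods.
    QuerySketch select(String... periods) {
        Set<String> wanted = new HashSet<>(Arrays.asList(periods));
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return new QuerySketch(kept);
    }

    // Returns a new query whose values are this query's values minus the
    // other query's values, matched by time period.
    QuerySketch minus(QuerySketch other) {
        Map<String, Double> difference = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : values.entrySet()) {
            Double rightHandSide = other.values.get(entry.getKey());
            if (rightHandSide != null) {
                difference.put(entry.getKey(), entry.getValue() - rightHandSide);
            }
        }
        return new QuerySketch(difference);
    }

    Map<String, Double> values() {
        return new LinkedHashMap<>(values);
    }

    public static void main(String[] args) {
        Map<String, Double> unitPrice = new LinkedHashMap<>();
        unitPrice.put("2002-01", 100.5);
        unitPrice.put("2002-02", 101.0);
        unitPrice.put("2002-06", 99.0);
        Map<String, Double> unitCost = new LinkedHashMap<>();
        unitCost.put("2002-01", 60.25);
        unitCost.put("2002-02", 61.0);
        unitCost.put("2002-06", 58.0);

        QuerySketch margin = new QuerySketch(unitPrice).minus(new QuerySketch(unitCost));
        System.out.println(margin.select("2002-01", "2002-02").values());
    }
}
```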
To retrieve the data specified by a Source, an application creates a Cursor for that Source. The application then uses this Cursor to request and retrieve the data from the data store. When an application makes a request for data, it can specify the typical amount of data that it requires at a given time (for example, enough to fill a 40-cell table on the screen). Oracle OLAP then handles the issues related to efficient retrieval. The application does not need to manage the timing, sizing, and caching of the data blocks that it retrieves through the OLAP API.
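To make the notion of a typical request size concrete, the following sketch reads a result set in blocks of 40 cells. BlockCursorSketch is a hypothetical class that only imitates block-wise retrieval from an in-memory list; it is not the Cursor class, and it does not reflect how Oracle OLAP actually sizes or caches data blocks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of block-wise retrieval; not the OLAP API Cursor.
final class BlockCursorSketch {
    private final Iterator<Integer> store;
    private final int fetchSize;
    private int fetchCount = 0;

    BlockCursorSketch(List<Integer> storedValues, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        this.store = storedValues.iterator();
        this.fetchSize = fetchSize;
    }

    boolean hasMore() {
        return store.hasNext();
    }

    // Returns up to fetchSize values; the last block may be smaller.
    List<Integer> nextBlock() {
        List<Integer> block = new ArrayList<>(fetchSize);
        while (block.size() < fetchSize && store.hasNext()) {
            block.add(store.next());
        }
        fetchCount++;
        return block;
    }

    int fetchCount() {
        return fetchCount;
    }

    public static void main(String[] args) {
        List<Integer> cells = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            cells.add(i);
        }
        // Enough to fill a 40-cell table on each request.
        BlockCursorSketch cursor = new BlockCursorSketch(cells, 40);
        while (cursor.hasMore()) {
            System.out.println("fetched " + cursor.nextBlock().size() + " cells");
        }
    }
}
```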
Because the primary focus of most OLAP applications is making queries against the data store, a significant proportion of their data manipulation code works with the following classes, each of which has methods for selecting, calculating, and otherwise manipulating data.
Source
BooleanSource
DateSource
NumberSource
StringSource
One of the useful characteristics of Source objects is that they make no distinction between attributes, dimensions, and measures. The Source objects for all of them behave in the same way.
The elements of an OLAP Catalog dimension are usually organized into one or more hierarchies. Some hierarchies have parent-child relationships based on levels and some have those relationships based on values. In the OLAP API a dimension always has at least one hierarchy dimension object and that hierarchy object has at least one level object. Even a nonhierarchical dimension is represented by a hierarchy dimension object with one level object.
The OLAP API uses a three-part format to specify the hierarchy, the level, and the value of a dimension element, and thus identify a unique value in the hierarchy. The first part of a unique value is the name of the hierarchy object, the second part is the name of the level object, and the third part is the value of the element in the level. The parts of the unique value are separated by a value separation string, which by default is double colons (::). The following is an example of a unique value in the YEAR level of the CALENDAR hierarchy of the TIME dimension:
CALENDAR::YEAR::2
The third part of a unique value is the local value. The local value in the preceding example identifies the year 1999.
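The three-part format is easy to take apart with ordinary string handling. The following sketch splits a unique value on the default separator and reassembles one from its parts. UniqueValueSketch and its methods are hypothetical helpers written for this example; they are not part of the OLAP API, which provides its own classes for obtaining local and unique values.

```java
import java.util.regex.Pattern;

// Hypothetical helper for the hierarchy::level::value format; not an OLAP API class.
final class UniqueValueSketch {
    static final String SEPARATOR = "::"; // default value separation string

    // Splits a unique value into hierarchy, level, and local value.
    // The limit of 3 keeps any later separator inside the local value.
    static String[] split(String uniqueValue) {
        String[] parts = uniqueValue.split(Pattern.quote(SEPARATOR), 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected hierarchy::level::value but got: " + uniqueValue);
        }
        return parts;
    }

    // Builds a unique value from its three parts.
    static String compose(String hierarchy, String level, String localValue) {
        return hierarchy + SEPARATOR + level + SEPARATOR + localValue;
    }

    public static void main(String[] args) {
        String[] parts = split("CALENDAR::YEAR::2");
        System.out.println("hierarchy=" + parts[0]
            + " level=" + parts[1]
            + " local=" + parts[2]);
    }
}
```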
The OLAP API has classes and methods that you can use to get the local values of dimension elements. The MdmPrimaryDimension class has a method for getting an MdmAttribute that records the local values for the elements of the hierarchies that are components of the MdmPrimaryDimension, and the MdmDimensionMemberInfo class has methods for getting the local or unique values for a hierarchy or a level.
In addition to ensuring that data and metadata have been prepared appropriately, an application developer must ensure that application users can make a connection to the data store through the OLAP API and that users have database privileges that give them access to the data. For information about setting up for such connections, see the Oracle OLAP Application Developer's Guide.
The OLAP API client software is a set of Java packages containing classes that implement the programming interface to Oracle OLAP. An application creates objects of these classes and calls their methods to discover metadata, specify queries, and retrieve data.
When a Java application calls methods of objects of OLAP API Java classes, it uses the OLAP API client software to communicate with Oracle OLAP, which resides within an Oracle database instance. The communication between the OLAP API client software and Oracle OLAP is provided through Java Database Connectivity (JDBC), which is a standard Java interface for connecting to relational databases. For more information about JDBC, see the Oracle Database JDBC Developer's Guide and Reference.
To use the OLAP API classes as you develop your application, import them into your Java code. When you deliver your application to users, include the OLAP API classes with the application. You must also ensure that users can access JDBC.
In order to develop an OLAP API application, you must have the Java Development Kit (JDK), such as one in Oracle JDeveloper or one from Sun Microsystems. Users must have a Java Runtime Environment (JRE) whose version number is compatible with the JDK you used for development.
For information about Java version requirements and about setting up the OLAP API client software, see Appendix A, "Setting Up the Development Environment". For detailed information about the OLAP API classes and methods, see the Oracle OLAP Java API Reference and subsequent chapters of this guide.
An application that uses the OLAP API typically performs the following tasks:
Connects to the data store
Discovers the available metadata
Specifies queries that select and manipulate data
Retrieves query results
The rest of this topic briefly describes these tasks, and the rest of this guide provides detailed information.
An application connects to the data store by identifying some information about the target Oracle database and specifying this information in a JDBC connection method.
For more information about connecting, see Chapter 3, "Connecting to a Data Store".
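As a minimal sketch of the information that a JDBC connection needs, the following example assembles an Oracle thin-driver URL and a set of credentials. The host name, port, SID, user name, and password are placeholder assumptions, and ConnectionInfoSketch is a hypothetical helper class. The example only builds the values; opening the connection requires the Oracle JDBC driver on the classpath and a reachable database.

```java
import java.util.Properties;

// Hypothetical helper that assembles JDBC connection information.
final class ConnectionInfoSketch {
    // Builds a thin-driver URL of the form jdbc:oracle:thin:@host:port:sid.
    static String thinUrl(String host, int port, String sid) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + sid;
    }

    // Packages the user name and password as JDBC connection properties.
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) {
        // All of these values are placeholders.
        String url = thinUrl("dbhost.example.com", 1521, "orcl");
        Properties props = credentials("global_user", "change_me");
        System.out.println(url + " as " + props.getProperty("user"));
        // With the Oracle JDBC driver available, the connection would be opened with:
        // java.sql.Connection conn = java.sql.DriverManager.getConnection(url, props);
    }
}
```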
Having established a connection, the application creates an MdmMetadataProvider. This object gives access to all of the metadata objects in the data store.
To discover the available metadata, an application uses the getRootSchema method of the MdmMetadataProvider to obtain the MdmSchema object that represents the top-level measure folder for all of the metadata objects to which the MdmMetadataProvider provides access. The application then gets the dimensions, including the measure dimension, and the subfolders that are under the root.
Once the application has all of the dimensions, it can interrogate them to get their attributes, hierarchies, levels, and other characteristics, and the measures. Having determined the metadata objects that it has to work with, the application can present relevant lists of objects to the user for data selection and manipulation.
For a description of the metadata objects, see Chapter 2, "Understanding OLAP API Metadata". For information about how an application can discover the available metadata, see Chapter 4, "Discovering the Available Metadata".
The heart of any OLAP application lies in the construction of queries against the data store. The application user interface provides ways for the user to select data and to specify what should be done with it. Then, the data manipulation code translates these instructions into queries against the data store. The queries can be as simple as a selection of dimension elements, or they can be complex, including several aggregations and calculations on measure values specified by selections of dimension elements.
The OLAP API object that specifies a query is a Source. Therefore, a significant portion of any OLAP API application is devoted to dealing with Source objects.
From an MdmSchema, you get MdmSource objects, such as an MdmMeasure or an MdmPrimaryDimension. You then get a Source object from the MdmSource. With the methods of a Source object, you can produce other Source objects that specify a selection of the elements of the Source, or that specify calculations or other operations to perform on the values of a Source.
If you are implementing a simple user interface, you might use only the methods of the Source classes to select and manipulate the data that users specify in the interface. However, if you want to offer your users multistep selection procedures and the ability to modify queries or undo individual steps in their selections, you should design and implement Template classes. Within the code for each Template, you use the methods of the Source classes, but the Template classes themselves allow you to modify and refine even the most complex query. In addition, you can minimize your work by writing general-purpose Template classes and reusing them in various parts of your application.
For information about working with Source objects, see Chapter 6, "Understanding Source Objects". For information about working with Template objects, see Chapter 11, "Creating Dynamic Queries".
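The benefit of a multistep query with undo can be sketched in plain Java. In the following example, a selection is kept as an ordered list of steps, and the current result is always recomputed from the base elements, so removing the last step never requires rebuilding the earlier ones. StepwiseQuerySketch is a hypothetical class that illustrates the idea only; it does not use or reproduce the Template classes of the OLAP API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of an undoable, stepwise selection.
final class StepwiseQuerySketch {
    private final List<String> baseElements;
    private final Deque<Predicate<String>> steps = new ArrayDeque<>();

    StepwiseQuerySketch(List<String> baseElements) {
        this.baseElements = new ArrayList<>(baseElements);
    }

    // Appends a selection step to the end of the query definition.
    void addStep(Predicate<String> step) {
        steps.addLast(step);
    }

    // Removes the most recent step; returns false if there was none.
    boolean undoLastStep() {
        return steps.pollLast() != null;
    }

    // Applies every remaining step, in order, to the base elements.
    List<String> currentSelection() {
        List<String> selected = new ArrayList<>();
        for (String element : baseElements) {
            boolean keep = true;
            for (Predicate<String> step : steps) {
                if (!step.test(element)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                selected.add(element);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        StepwiseQuerySketch query = new StepwiseQuerySketch(
            Arrays.asList("2001-12", "2002-01", "2002-02", "2002-03"));
        query.addStep(period -> period.startsWith("2002"));
        query.addStep(period -> !period.endsWith("03"));
        System.out.println(query.currentSelection());
        query.undoLastStep();
        System.out.println(query.currentSelection());
    }
}
```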
When users of an OLAP application are selecting, calculating, combining, and generally manipulating data, they also want to see the results of their work. This means that the application must retrieve the result sets of queries from the data store and display the data in multidimensional form. To retrieve a result set for a query through the OLAP API, the application creates a Cursor for the Source that specifies the query.
An application can also get the SQL that Oracle OLAP generates for a query. To do so, the application creates a SQLCursorManager for the Source instead of creating a Cursor. The generateSQL method of the SQLCursorManager returns the SQL specified by the Source. The application can then retrieve the data by methods outside of the OLAP API. The ExpressSQLCursorManager class implements the SQLCursorManager interface.
Because the OLAP API was designed to deal with a multidimensional view of data, a Source can have a multidimensional result set. For example, a Source can represent an MdmMeasure that is structured by four MdmPrimaryDimension objects. Each MdmPrimaryDimension is represented by a Source. An application can create a query by joining the Source objects for the dimensions to the Source for the measure. The query has the measure data as its values and it has the Source objects for the dimensions as its outputs.
A Cursor for the query Source has the same structure as the Source; that is, the values of the Cursor are the measure data and the Cursor has four outputs. The values of the outputs are those of the Source objects for the dimensions.
To retrieve all of the items of data through a Cursor, the application can loop through the multidimensional Cursor structure. This design is well adapted to the requirements of standard user interface objects for painting the computer screen. It is especially well adapted to the display of data in multidimensional format.
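Looping through a multidimensional result can be pictured as nested iteration, with one loop for each output. The following sketch renders a two-output result as a tab-separated crosstab, with products on the row edge and time periods on the column edge. CrosstabSketch, its render method, and the sample values are hypothetical, and the arrays stand in for data that an application would read from a Cursor and its outputs.

```java
// Hypothetical illustration of looping over a result with two outputs.
final class CrosstabSketch {
    // Renders measure values as tab-separated text: one header line for the
    // column edge, then one line for each element of the row edge.
    static String render(String[] rowEdge, String[] columnEdge, int[][] values) {
        StringBuilder text = new StringBuilder();
        for (String column : columnEdge) {
            text.append('\t').append(column);
        }
        text.append('\n');
        for (int r = 0; r < rowEdge.length; r++) {       // outer loop: row output
            text.append(rowEdge[r]);
            for (int c = 0; c < columnEdge.length; c++) { // inner loop: column output
                text.append('\t').append(values[r][c]);
            }
            text.append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String[] products = {"P13", "P14"};
        String[] months = {"2002-01", "2002-02"};
        int[][] units = {{1, 2}, {3, 4}};
        System.out.print(render(products, months, units));
    }
}
```

A result with more outputs would add one more level of nesting for each additional output, for example an outermost loop over the elements on the page edge.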
For more information about using Source objects to specify a query, see Chapter 6, "Understanding Source Objects". For more information about using Cursor objects to retrieve data, see Chapter 9, "Understanding Cursor Classes and Concepts". For more information about the SQLCursorManager class, see the Oracle OLAP Java API Reference.