Oracle® Spatial User's Guide and Reference 10g Release 1 (10.1) Part Number B10826-01
Geocoding is the process of associating spatial locations (longitude and latitude coordinates) with postal addresses. This chapter includes the following major sections:
This section describes concepts that you must understand before you use the Spatial geocoding capabilities.
Addresses to be geocoded can be represented either as formatted addresses or unformatted addresses.
A formatted address is described by a set of attributes for various parts of the address, which can include some or all of those shown in Table 5-1.
Table 5-1 Attributes for Formal Address Representation
Address Attribute | Description |
---|---|
Name | Place name (optional). |
Intersecting street | Intersecting street name (optional). |
Street | Street address, including the house or building number, street name, street type (Street, Road, Blvd, and so on), and possibly other information. In the current release, the first four characters of the street name must match a street name in the geocoding data for there to be a potential street name match. |
Settlement | The lowest-level administrative area to which the address belongs. In most cases it is the city. In some European countries, the settlement can be an area within a large city, in which case the large city is the municipality. |
Municipality | The administrative area above settlement. Municipality is not used for United States addresses. In European countries where cities contain settlements, the municipality is the city. |
Region | The administrative area above municipality (if applicable), or above settlement if municipality does not apply. In the United States, the region is the state; in some other countries, the region is the province. |
Postal code | Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code. |
Postal add-on code | String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format. |
Country | The country name or ISO country code. |
Formatted addresses are specified using the SDO_GEO_ADDR data type, which is described in Section 5.2.1.
An unformatted address is described using lines with information in the postal address format for the relevant country. The address lines must contain information essential for geocoding, and they might also contain information that is not needed for geocoding (something that is common in unprocessed postal addresses). An unformatted address is stored as an array of strings. For example, an address might consist of the following strings: '22 Monument Square' and 'Concord, MA 01742'.
Unformatted addresses are specified using the SDO_KEYWORDARRAY data type, which is described in Section 5.2.3.
The match mode for a geocoding operation determines how closely the attributes of an input address must match the data being used for the geocoding. Input addresses can include different ways of representing the same thing (such as Street and the abbreviation St), and they can include minor errors (such as the wrong postal code, even though the street address and city are correct and the street address is unique within the city).
You can require an exact match between the input address and the data used for geocoding, or you can relax the requirements for some attributes so that geocoding can be performed despite certain discrepancies or errors in the input addresses. Table 5-2 lists the match modes and their meanings. Use a value from this table with the match_mode attribute of the SDO_GEO_ADDR data type (described in Section 5.2.1) and for the match_mode parameter of a geocoding function or procedure.
Table 5-2 Match Modes for Geocoding Operations
Match Mode | Description |
---|---|
EXACT | All attributes of the input address must match the data used for geocoding. However, if the house or building number, base name (street name), street type, street prefix, and street suffix do not all match the geocoding data, the location returned is in the first of the following that matches: postal code, city or town (settlement) within the state, or state. For example, if the street name is incorrect but a valid postal code is specified, a location in the postal code is returned. |
RELAX_STREET_TYPE | The street type can be different from the data used for geocoding. For example, if Main St is in the data used for geocoding, Main Street would also match that, as would Main Blvd if there was no Main Blvd and no other street type named Main in the relevant area. |
RELAX_POI_NAME | The name of the point of interest does not have to match the data used for geocoding. For example, if Jones State Park is in the data used for geocoding, Jones State Pk and Jones Park would also match as long as there were no ambiguities or other matches in the data. |
RELAX_HOUSE_NUMBER | The house or building number and street type can be different from the data used for geocoding. For example, if 123 Main St is in the data used for geocoding, 123 Main Lane and 124 Main St would also match as long as there were no ambiguities or other matches in the data. |
RELAX_BASE_NAME | The base name of the street, the house or building number, and the street type can be different from the data used for geocoding. For example, if Pleasant Valley is the base name of a street in the data used for geocoding, Pleasant Vale would also match as long as there were no ambiguities or other matches in the data. |
RELAX_POSTAL_CODE | The postal code (if provided), base name, house or building number, and street type can be different from the data used for geocoding. |
RELAX_BUILTUP_AREA | The address can be outside the city specified as long as it is within the same county. Also includes the characteristics of RELAX_POSTAL_CODE. |
RELAX_ALL | Equivalent to RELAX_BUILTUP_AREA. |
DEFAULT | Equivalent to RELAX_BASE_NAME. |
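
The relationships in Table 5-2 can be sketched as a small lookup. This is only an illustration of how the modes nest and how the two aliases resolve; the attribute labels and the function names `resolve_mode` and `relaxed_attributes` are assumptions made for this sketch and are not part of Oracle Spatial.

```python
# Attributes that each match mode allows to differ from the geocoding data,
# transcribed from Table 5-2. The labels are informal names for this sketch.
RELAXED = {
    "EXACT": frozenset(),
    "RELAX_STREET_TYPE": frozenset({"street type"}),
    "RELAX_POI_NAME": frozenset({"point-of-interest name"}),
    "RELAX_HOUSE_NUMBER": frozenset({"house number", "street type"}),
    "RELAX_BASE_NAME": frozenset({"base name", "house number", "street type"}),
    "RELAX_POSTAL_CODE": frozenset(
        {"postal code", "base name", "house number", "street type"}
    ),
}
# RELAX_BUILTUP_AREA adds the built-up area to everything RELAX_POSTAL_CODE relaxes.
RELAXED["RELAX_BUILTUP_AREA"] = RELAXED["RELAX_POSTAL_CODE"] | {"built-up area"}

# Modes that the table describes as equivalent to another mode.
ALIASES = {"RELAX_ALL": "RELAX_BUILTUP_AREA", "DEFAULT": "RELAX_BASE_NAME"}


def resolve_mode(mode):
    """Return the canonical match mode name, following the aliases."""
    name = mode.strip().upper()
    name = ALIASES.get(name, name)
    if name not in RELAXED:
        raise ValueError(f"unknown match mode: {mode!r}")
    return name


def relaxed_attributes(mode):
    """Return the set of attributes that the given mode allows to differ."""
    return RELAXED[resolve_mode(mode)]


print(resolve_mode("DEFAULT"))
print(sorted(relaxed_attributes("RELAX_ALL")))
```

A lookup like this can help when choosing the least permissive mode that still tolerates the kinds of errors expected in the input data.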
The match code is a number indicating which input address attributes matched the data used for geocoding. The match code is stored in the MATCH_CODE attribute of the output SDO_GEO_ADDR object (described in Section 5.2.1).
Table 5-3 lists the possible match code values.
Table 5-3 Match Codes for Geocoding Operations
Match Code | Description |
---|---|
1 | Exact match: the city name, postal code, street base name, street type (and suffix or prefix or both, if applicable), and house or building number match the data used for geocoding. |
2 | The city name, postal code, street base name, and house or building number match the data used for geocoding, but the street type, suffix, or prefix does not match. |
3 | The city name, postal code, and street base name match the data used for geocoding, but the house or building number does not match. |
4 | The city name and postal code match the data used for geocoding, but the street address does not match. |
10 | The city name matches the data used for geocoding, but the postal code does not match. |
11 | The postal code matches the data used for geocoding, but the city name does not match. |
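
As an illustration of how an application might act on the match code, the following sketch records which attributes Table 5-3 says were matched for each code. The attribute labels and the helper names `matched_attributes` and `is_street_level` are hypothetical and exist only in this example.

```python
# Attributes reported as matched for each match code, transcribed from Table 5-3.
MATCH_CODES = {
    1: {"city", "postal code", "base name", "street type", "house number"},
    2: {"city", "postal code", "base name", "house number"},
    3: {"city", "postal code", "base name"},
    4: {"city", "postal code"},
    10: {"city"},
    11: {"postal code"},
}


def matched_attributes(match_code):
    """Return the set of attributes matched for a given match code."""
    try:
        return MATCH_CODES[match_code]
    except KeyError:
        raise ValueError(f"match code {match_code!r} is not listed in Table 5-3")


def is_street_level(match_code):
    """True if at least the street base name matched (codes 1, 2, and 3)."""
    return "base name" in matched_attributes(match_code)


for code in sorted(MATCH_CODES):
    print(code, is_street_level(code), sorted(matched_attributes(code)))
```

An application could, for example, accept street-level results automatically and route results with codes 4, 10, or 11 to manual review.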
For an output geocoded address, the ErrorMessage attribute of the SDO_GEO_ADDR object (described in Section 5.2.1) contains a string that indicates which address attributes have been matched against the data used for geocoding. Before the geocoding operation begins, the string is set to the value ???????????281C??, and the value is then modified to reflect which attributes have been matched.
Table 5-4 lists the character positions in the string and the address attribute corresponding to each position. It also lists the character value that the position is set to if the attribute is matched.
Table 5-4 Geocoded Address Error Message Interpretation
Position | Attribute | Value If Matched |
---|---|---|
1-4 | (Reserved for future use.) | ???? |
5 | House or building number | # |
6 | Street prefix | E |
7 | Street base name | N |
8 | Street suffix | U |
9 | Street type | T |
10 | Secondary unit | S |
11 | Built-up area or city | B |
14 | Region | 1 |
15 | Country | C |
16 | Postal code | P |
17 | Postal add-on code | A |
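
The position table can be applied mechanically. The following sketch decodes an ErrorMessage string according to Table 5-4; the function name `decode_error_message` is hypothetical. Note that the initial value already carries 1 and C in positions 14 and 15, so by a literal reading of the table the region and country always appear as matched; the sketch reports what the table says and does not try to resolve that.

```python
# (position, attribute, value if matched), transcribed from Table 5-4.
# Positions are 1-based; positions 1-4 are reserved and 12-13 are not listed.
POSITIONS = [
    (5, "House or building number", "#"),
    (6, "Street prefix", "E"),
    (7, "Street base name", "N"),
    (8, "Street suffix", "U"),
    (9, "Street type", "T"),
    (10, "Secondary unit", "S"),
    (11, "Built-up area or city", "B"),
    (14, "Region", "1"),
    (15, "Country", "C"),
    (16, "Postal code", "P"),
    (17, "Postal add-on code", "A"),
]

INITIAL_VALUE = "???????????281C??"


def decode_error_message(message):
    """Map each attribute in Table 5-4 to True if its position holds the matched value."""
    if len(message) != len(INITIAL_VALUE):
        raise ValueError(f"expected {len(INITIAL_VALUE)} characters, got {len(message)}")
    return {attr: message[pos - 1] == flag for pos, attr, flag in POSITIONS}


# Value taken from the geocoded address in Example 5-1.
for attribute, matched in decode_error_message("nul?#ENUT?B281CP?").items():
    print(f"{attribute}: {'matched' if matched else 'not matched'}")
```

For the string from Example 5-1, every listed attribute except the secondary unit and the postal add-on code is reported as matched.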
This section describes the data types specific to geocoding functions and procedures.
The SDO_GEO_ADDR object type is used to describe an address. When a geocoded address is output by an SDO_GCDR function or procedure, it is stored as an object of type SDO_GEO_ADDR.
Table 5-5 lists the attributes of the SDO_GEO_ADDR type. Not all attributes will be relevant in any given case. The attributes used for a returned geocoded address depend on the geographical context of the input address, especially the country.
Table 5-5 SDO_GEO_ADDR Type Attributes
Attribute | Data Type | Description |
---|---|---|
Id | NUMBER | (Not used.) |
AddressLines | SDO_KEYWORDARRAY | Address lines. (The SDO_KEYWORDARRAY type is described in Section 5.2.3.) |
PlaceName | VARCHAR2(200) | (Not used.) |
StreetName | VARCHAR2(200) | Street name, including street type. Example: MAIN ST |
IntersectStreet | VARCHAR2(200) | Intersecting street. |
SecUnit | VARCHAR2(200) | Secondary unit, such as an apartment number or building number. |
Settlement | VARCHAR2(200) | Lowest-level administrative area to which the address belongs. (See Table 5-1.) |
Municipality | VARCHAR2(200) | Administrative area above settlement. (See Table 5-1.) |
Region | VARCHAR2(200) | Administrative area above municipality (if applicable), or above settlement if municipality does not apply. (See Table 5-1.) |
Country | VARCHAR2(100) | Country name or ISO country code. |
PostalCode | VARCHAR2(20) | Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code. |
PostalAddOnCode | VARCHAR2(20) | String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format. |
FullPostalCode | VARCHAR2(20) | Full postal code, including the postal code and postal add-on code. |
POBox | VARCHAR2(100) | Post Office box number. |
HouseNumber | VARCHAR2(100) | House or building number. Example: 123 in 123 MAIN ST |
BaseName | VARCHAR2(200) | Base name of the street. Example: MAIN in 123 MAIN ST |
StreetType | VARCHAR2(20) | Type of the street. Example: ST in 123 MAIN ST |
StreetTypeBefore | VARCHAR2(1) | (Not used.) |
StreetTypeAttached | VARCHAR2(1) | (Not used.) |
StreetPrefix | VARCHAR2(20) | Prefix for the street. Example: S in 123 S MAIN ST |
StreetSuffix | VARCHAR2(20) | Suffix for the street. Example: NE in 123 MAIN ST NE |
Side | VARCHAR2(1) | Side of the street (L for left or R for right) that the house is on when you are traveling from lower to higher numbered addresses. |
Percent | NUMBER | Number from 0 to 1 (multiply by 100 to get a percentage value) indicating how far along the street you are when traveling from lower to higher numbered addresses. |
EdgeID | NUMBER | Edge ID of the road segment. |
ErrorMessage | VARCHAR2(20) | Error message (see Section 5.1.4). |
MatchCode | NUMBER | Match code (see Section 5.1.3). |
MatchMode | VARCHAR2(30) | Match mode (see Section 5.1.2). |
Longitude | NUMBER | Longitude coordinate value. |
Latitude | NUMBER | Latitude coordinate value. |
You can return the entire SDO_GEO_ADDR object, or you can specify an attribute using standard "dot" notation. Example 5-1 contains statements that geocode the address of the San Francisco City Hall; the first statement returns the entire SDO_GEO_ADDR object, and the remaining statements return some specific attributes.
Example 5-1 Geocoding, Returning Address Object and Specific Attributes
    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME') FROM DUAL;

    SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
    --------------------------------------------------------------------------------
    SDO_GEO_ADDR(0, SDO_KEYWORDARRAY(), NULL, 'CARLTON B GOODLETT PL', NULL, NULL,
    'SAN FRANCISCO', NULL, 'CA', 'US', '94102', NULL, '94102', NULL, '1',
    'CARLTON B GOODLETT', 'PL', 'F', 'F', NULL, NULL, 'L', .01, 23614360,
    'nul?#ENUT?B281CP?', 1, 'DEFAULT', -122.41815, 37.7784183)

    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME').StreetType FROM DUAL;

    SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
    --------------------------------------------------------------------------------
    PL

    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME').Side FROM DUAL;

    S
    -
    L

    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME').Percent FROM DUAL;

    SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
    --------------------------------------------------------------------------------
    .01

    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME').EdgeID FROM DUAL;

    SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
    --------------------------------------------------------------------------------
    23614360

    SELECT SDO_GCDR.GEOCODE('SCOTT',
      SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA 94102'),
      'US', 'RELAX_BASE_NAME').MatchCode FROM DUAL;

    SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
    --------------------------------------------------------------------------------
    1
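
The SDO_GEO_ADDR object in the first result of Example 5-1 is displayed positionally, which makes it hard to read. The following sketch pairs the attribute names from Table 5-5, in table order, with the values shown in that output. It is a reading aid only: the function name `label_attributes` is hypothetical, NULL is represented as `None`, and the empty SDO_KEYWORDARRAY() is represented as an empty tuple.

```python
# Attribute names in the order listed in Table 5-5.
ATTRIBUTE_NAMES = (
    "Id", "AddressLines", "PlaceName", "StreetName", "IntersectStreet",
    "SecUnit", "Settlement", "Municipality", "Region", "Country",
    "PostalCode", "PostalAddOnCode", "FullPostalCode", "POBox",
    "HouseNumber", "BaseName", "StreetType", "StreetTypeBefore",
    "StreetTypeAttached", "StreetPrefix", "StreetSuffix", "Side",
    "Percent", "EdgeID", "ErrorMessage", "MatchCode", "MatchMode",
    "Longitude", "Latitude",
)

# Values transcribed from the SDO_GEO_ADDR output in Example 5-1.
EXAMPLE_VALUES = (
    0, (), None, "CARLTON B GOODLETT PL", None,
    None, "SAN FRANCISCO", None, "CA", "US",
    "94102", None, "94102", None,
    "1", "CARLTON B GOODLETT", "PL", "F",
    "F", None, None, "L",
    0.01, 23614360, "nul?#ENUT?B281CP?", 1, "DEFAULT",
    -122.41815, 37.7784183,
)


def label_attributes(values):
    """Pair positional SDO_GEO_ADDR values with the attribute names of Table 5-5."""
    if len(values) != len(ATTRIBUTE_NAMES):
        raise ValueError(f"expected {len(ATTRIBUTE_NAMES)} values, got {len(values)}")
    return dict(zip(ATTRIBUTE_NAMES, values))


address = label_attributes(EXAMPLE_VALUES)
for name in ("StreetType", "Side", "Percent", "EdgeID", "MatchCode"):
    print(f"{name} = {address[name]}")
```

The five attributes printed at the end are the same ones that the later statements in Example 5-1 select with dot notation.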
The SDO_ADDR_ARRAY type is a VARRAY of SDO_GEO_ADDR objects (described in Section 5.2.1) used to store geocoded address results. Multiple address objects can be returned when multiple addresses are matched as a result of a geocoding operation.
The SDO_ADDR_ARRAY type is defined as follows:
CREATE TYPE sdo_addr_array AS VARRAY(1000) OF sdo_geo_addr;
The SDO_KEYWORDARRAY type is a VARRAY of VARCHAR2 strings used to store address lines for unformatted addresses. (Formatted and unformatted addresses are described in Section 5.1.1.)
The SDO_KEYWORDARRAY type is defined as follows:
CREATE TYPE sdo_keywordarray AS VARRAY(10000) OF VARCHAR2(9000);
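The limits in this type definition can be checked on the client before an address is submitted. The following is a minimal sketch, assuming that the VARCHAR2(9000) length is counted in bytes and that the database character set is UTF-8; neither assumption is stated in this chapter, and the function name `check_address_lines` is hypothetical.

```python
MAX_LINES = 10000       # from VARRAY(10000)
MAX_LINE_LENGTH = 9000  # from VARCHAR2(9000); assumed to be a byte length


def check_address_lines(lines, encoding="utf-8"):
    """Return a list of problems; an empty list means the lines fit the type."""
    problems = []
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines exceeds the limit of {MAX_LINES}")
    for number, line in enumerate(lines, start=1):
        size = len(line.encode(encoding))
        if size > MAX_LINE_LENGTH:
            problems.append(
                f"line {number} is {size} bytes, over the limit of {MAX_LINE_LENGTH}"
            )
    return problems


print(check_address_lines(["22 Monument Square", "Concord, MA 01742"]))
```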
To use the Oracle Spatial geocoding capabilities, you must use data provided by a geocoding vendor, and the data must be in the format supported by the Oracle Spatial geocoding feature. For information about getting and loading this data, go to the Spatial page of the Oracle Technology Network (OTN):
http://otn.oracle.com/products/spatial/
Find the link for geocoding, and follow the instructions.
To geocode an address using the geocoding data, use the SDO_GCDR PL/SQL package subprograms, which are documented in Chapter 20:
The SDO_GCDR.GEOCODE function geocodes an unformatted address to return an SDO_GEO_ADDR object.
The SDO_GCDR.GEOCODE_AS_GEOMETRY function geocodes an unformatted address to return an SDO_GEOMETRY object.
The SDO_GCDR.GEOCODE_ALL function geocodes all addresses associated with an unformatted address and returns the result as an SDO_ADDR_ARRAY object (an array of address objects).
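For ad hoc testing, the text of an SDO_GCDR.GEOCODE query in the style of Example 5-1 can be composed from its parts. The following sketch only builds the statement text and does not connect to a database; the helper names are hypothetical, and the argument order (user name, address lines, country, match mode) is taken from Example 5-1. In application code, bind variables are generally preferable to building literals.

```python
def sql_literal(text):
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def keyword_array(lines):
    """Render address lines as an SDO_KEYWORDARRAY constructor call."""
    return "SDO_KEYWORDARRAY(" + ", ".join(sql_literal(line) for line in lines) + ")"


def build_geocode_query(username, lines, country, match_mode="DEFAULT", attribute=None):
    """Build the text of a SELECT that calls SDO_GCDR.GEOCODE, as in Example 5-1.

    If attribute is given (for example "StreetType"), it is appended with dot
    notation so that only that attribute of the result is returned.
    """
    call = "SDO_GCDR.GEOCODE({}, {}, {}, {})".format(
        sql_literal(username),
        keyword_array(lines),
        sql_literal(country),
        sql_literal(match_mode),
    )
    if attribute is not None:
        if not attribute.isidentifier():
            raise ValueError(f"not a valid attribute name: {attribute!r}")
        call += "." + attribute
    return "SELECT " + call + " FROM DUAL"


print(build_geocode_query(
    "SCOTT",
    ["1 Carlton B Goodlett Pl", "San Francisco, CA 94102"],
    "US",
    "RELAX_BASE_NAME",
    attribute="MatchCode",
))
```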