Sun Microsystems

4.6.1.3 Scripting

The default output of the zpool list command is designed for human readers and is awkward to parse in a shell script. To support programmatic use, the -H option suppresses the column headings and separates fields with tabs rather than padding them with spaces. For example, to get a simple list of all pool names on the system:

# zpool list -Ho name
tank
dozer

Or a script-ready version of the earlier example:

# zpool list -H -o name,size
tank   80.0G
dozer  1.2T
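
As a minimal sketch of how a script might consume this tab-separated output, the loop below reads each line into a name and a size variable. The function name print_pools is hypothetical, and the input is canned text standing in for real zpool list -H -o name,size output.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>size" lines on stdin and print one
# sentence per pool. In real use: zpool list -H -o name,size | print_pools
print_pools() {
    tab=$(printf '\t')
    while IFS="$tab" read -r name size; do
        printf 'pool %s has size %s\n' "$name" "$size"
    done
}

# Canned sample standing in for real command output.
printf 'tank\t80.0G\ndozer\t1.2T\n' | print_pools
```

Splitting on the tab character only, rather than on any whitespace, keeps the parsing tied to the separator that -H is documented to produce.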

4.6.2 I/O Statistics

To get I/O statistics for a pool or individual virtual devices, use the zpool iostat command. Similar to the iostat(1M) command, it can display a static snapshot of all I/O activity so far, or print updated statistics at a specified interval. The following statistics are reported:

USED CAPACITY

The amount of data currently stored in the pool or device. This differs from the amount of space available to actual filesystems by a small amount due to internal implementation details.

For more information on the difference between pool space and dataset space, see 3.2 Space Accounting.

AVAILABLE CAPACITY

The amount of space available in the pool or device. As with the used statistic, this differs from the amount of space available to datasets by a small margin.

READ OPERATIONS

The number of read I/O operations sent to the pool or device, including metadata requests.

WRITE OPERATIONS

The number of write I/O operations sent to the pool or device.

READ BANDWIDTH

The bandwidth of all read operations (including metadata), expressed as units per second.

WRITE BANDWIDTH

The bandwidth of all write operations, expressed as units per second.

4.6.2.1 Pool-Wide Statistics

With no options, the zpool iostat command displays the accumulated statistics since boot for all pools on the system:

# zpool iostat
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         100G  20.0G   1.2M   102K   1.2M  3.45K
dozer       12.3G  67.7G   132K  15.2K  32.1K  1.20K

Because these statistics are cumulative since boot, bandwidth may appear low if the pool is relatively idle. A more accurate view of current bandwidth usage can be seen by specifying an interval:

# zpool iostat tank 2
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         100G  20.0G   1.2M   102K   1.2M  3.45K
tank         100G  20.0G    134      0  1.34K      0
tank         100G  20.0G     94    342  1.06K   4.1M

The above command displays usage statistics only for the pool tank every two seconds until the user types Ctrl-C. Alternatively, you can specify an additional count parameter, which causes the command to terminate after the specified number of iterations. For example, zpool iostat 2 3 prints a summary every two seconds for three iterations, for a total of six seconds. If there is a single pool, the statistics are displayed on consecutive lines as shown above. If there is more than one pool, an additional newline separates each iteration to provide visual separation.
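
When scripting against interval output for a single pool, it can help to discard the header and the first row, which in the sample above repeats the since-boot totals. The following is a rough sketch under that assumption. The function name interval_samples is hypothetical, and the input is the sample output from this section rather than live data.

```shell
#!/bin/sh
# Hypothetical helper: given single-pool `zpool iostat POOL INTERVAL COUNT`
# output on stdin, skip the three header lines and the first data row
# (assumed to be the since-boot totals), then print the pool name,
# read/write operations, and read/write bandwidth of each later sample.
interval_samples() {
    awk 'NR > 4 && NF == 7 { print $1, $4, $5, $6, $7 }'
}

# Canned sample copied from the example above.
interval_samples <<'EOF'
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         100G  20.0G   1.2M   102K   1.2M  3.45K
tank         100G  20.0G    134      0  1.34K      0
tank         100G  20.0G     94    342  1.06K   4.1M
EOF
```

This relies on the fixed three-line header and single-pool layout shown here. Output for multiple pools includes blank separator lines and would need different handling.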

4.6.2.2 Virtual Device Statistics

In addition to pool-wide I/O statistics, the zpool iostat command can also display statistics for individual virtual devices. This can be used to identify abnormally slow devices, or simply observe the distribution of I/O generated by ZFS. To see the complete virtual device layout as well as all I/O statistics, use the zpool iostat -v command:

# zpool iostat -v
               capacity     operations    bandwidth
tank         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
mirror      20.4G  59.6G      0     22      0  6.00K
  c0t0d0        -      -      1    295  11.2K   148K
  c1t1d0        -      -      1    299  11.2K   148K
----------  -----  -----  -----  -----  -----  -----
total       24.5K   149M      0     22      0  6.00K

There are a few important things to remember when viewing I/O statistics on a virtual device basis. The first thing you'll notice is that space usage is only available for top-level virtual devices. The way in which space is allocated among mirror and RAID-Z virtual devices is particular to the implementation and not easily expressed as a single number. The other important thing to note is that numbers may not add up exactly as you would expect them to. In particular, operations across RAID-Z and mirrored devices will not be exactly equal. This is particularly noticeable immediately after a pool is created, as a significant amount of I/O is done directly to the disks as part of pool creation that is not accounted for at the mirror level. Over time, these numbers should gradually equalize, although broken, unresponsive, or offlined devices can affect this symmetry as well.

The same set of options (interval and count) can be used when examining virtual device statistics as well.
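
To compare individual devices, a script might pull out just the leaf device rows. The sketch below assumes the layout shown in the sample above, where leaf devices are indented by exactly two spaces. The function name leaf_ops is hypothetical, and other configurations may indent differently.

```shell
#!/bin/sh
# Hypothetical helper: from `zpool iostat -v` output on stdin, print the
# name, read operations, and write operations of each row that is
# indented by exactly two spaces (the leaf devices in the sample above).
leaf_ops() {
    awk '/^  [^ ]/ && NF == 7 { print $1, $4, $5 }'
}

# Canned sample copied from the example above.
leaf_ops <<'EOF'
               capacity     operations    bandwidth
tank         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
mirror      20.4G  59.6G      0     22      0  6.00K
  c0t0d0        -      -      1    295  11.2K   148K
  c1t1d0        -      -      1    299  11.2K   148K
----------  -----  -----  -----  -----  -----  -----
total       24.5K   149M      0     22      0  6.00K
EOF
```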

4.6.3 Health Status

ZFS provides an integrated method of examining pool and device health. The health of a pool is determined from the state of all its devices. This section describes how to determine pool and device health. It does not document how to repair or recover from unhealthy pools. For more information on troubleshooting and data recovery, see Chapter 9, Troubleshooting and Data Recovery.

Each device can fall into one of the following states:

ONLINE

The device is in normal working order. While some transient errors may still be seen, the device is in otherwise working order.

DEGRADED

The virtual device has experienced failure, but is still able to function. This is most common when a mirror or RAID-Z device has lost one or more constituent devices. The fault tolerance of the pool may be compromised, as a subsequent fault in another device may be unrecoverable.

FAULTED

The virtual device is completely inaccessible. This typically indicates total failure of the device, such that ZFS is incapable of sending data to it or receiving data from it. If a top-level virtual device is in this state, then the pool is completely inaccessible.

OFFLINE

The virtual device has been explicitly offlined by the administrator.

The health of a pool is determined from the health of all its top-level virtual devices. If all virtual devices are ONLINE, then the pool is also ONLINE. If any one of them is DEGRADED, then the pool is also DEGRADED. If a top-level virtual device is FAULTED or OFFLINE, then the pool is also FAULTED. A pool in the faulted state is completely inaccessible: no data can be recovered until the necessary devices are attached or repaired. A pool in the degraded state continues to run, but you may not get the same level of data replication or data throughput that you would if the pool were online.
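
The rules above can be sketched as a small shell function. This is an illustration of the stated rules only, not how ZFS computes the state internally. The function name pool_state is hypothetical, and the sketch assumes that FAULTED takes precedence when a pool has both a DEGRADED and a FAULTED or OFFLINE top-level device.

```shell
#!/bin/sh
# Hypothetical helper: derive a pool state from the states of its
# top-level virtual devices, passed as arguments.
# Assumption: FAULTED or OFFLINE outranks DEGRADED when both are present.
pool_state() {
    result=ONLINE
    for state in "$@"; do
        case $state in
            FAULTED|OFFLINE) echo FAULTED; return 0 ;;
            DEGRADED) result=DEGRADED ;;
        esac
    done
    echo "$result"
}

pool_state ONLINE ONLINE      # prints ONLINE
pool_state ONLINE DEGRADED    # prints DEGRADED
pool_state DEGRADED OFFLINE   # prints FAULTED
```

Note that the arguments are the states of top-level devices only. A mirror with one offlined disk is itself DEGRADED, so the pool is DEGRADED rather than FAULTED.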

4.6.3.1 Basic Health Status

The simplest way to get a quick overview of pool health status is with the zpool status -x command:

# zpool status -x
all pools are healthy

Particular pools can be examined by specifying a pool name to the command. Any pool not in the ONLINE state should be investigated for potential problems, as described in the next section.
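
A monitoring script might test for the healthy message directly. The following is a minimal sketch, assuming the message text is exactly "all pools are healthy" as shown in this section. The wording may differ between releases, and the function name pools_healthy is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status -x` output on stdin and succeed
# only if it is exactly the healthy message shown in this section.
pools_healthy() {
    [ "$(cat)" = "all pools are healthy" ]
}

# Canned sample standing in for: zpool status -x | pools_healthy
if echo "all pools are healthy" | pools_healthy; then
    echo "OK"
else
    echo "attention needed"
fi
```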

4.6.3.2 Detailed Health Status

A more detailed health summary can be found by using the -v option:

# zpool status -v tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist 
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        tank                DEGRADED     0     0     0
          mirror            DEGRADED     0     0     0
            c0t0d0          FAULTED      0     0     0  cannot open
            c0t0d1          ONLINE       0     0     0

This displays a more complete description of why the pool is in its current state, including a human-readable description of the problem and a link to a knowledge article for more information. Each knowledge article provides up-to-date information on the best way to recover from your current situation. Using the detailed configuration information, you should be able to determine which device is damaged and how to repair the pool.

If a pool has a faulted or offlined device, the output of the zpool status -x command identifies the problem pool. For example:

# zpool status -x
  pool: tank
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        tank                DEGRADED     0     0     0
          mirror            DEGRADED     0     0     0
            c0t0d0          OFFLINE      0     0     0
            c1t0d0          ONLINE       0     0     0

The READ and WRITE columns give a count of I/O errors seen on the device, while the CKSUM column gives a count of uncorrectable checksum errors seen on the device. Non-zero counts in any of these columns likely indicate potential device failure, and some corrective action is needed. If you see non-zero errors for a top-level virtual device, it may indicate that portions of your data have become inaccessible.
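
As a rough sketch, the error columns can be scanned with awk to list only the rows that report errors. The function name devices_with_errors is hypothetical, and the input below is a made-up sample (with an invented checksum count) laid out like the config table in this section, not real output.

```shell
#!/bin/sh
# Hypothetical helper: from `zpool status` output on stdin, print the
# name and READ, WRITE, CKSUM counts of each config row whose counts
# are not all zero. Rows before the NAME/STATE header are ignored.
devices_with_errors() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        in_config && NF >= 5 && ($3 + $4 + $5) > 0 {
            print $1, $3, $4, $5
        }'
}

# Made-up sample input for illustration only.
devices_with_errors <<'EOF'
  pool: tank
 state: ONLINE
config:

        NAME                STATE     READ WRITE CKSUM
        tank                ONLINE       0     0     0
          mirror            ONLINE       0     0     0
            c0t0d0          ONLINE       0     0     3
            c1t0d0          ONLINE       0     0     0
EOF
```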

For more information on diagnosing and repairing faulted pools and data, see Chapter 9, Troubleshooting and Data Recovery.
