CHAPTER 7

Recycling

Recycling is the process of reclaiming space on archive volumes. The recycler works with the archiver to reclaim the space occupied by unused archive copies. As users modify files, the archive copies associated with the old versions can be purged from the system. The recycler identifies the volumes with the largest proportions of expired archive copies and directs the moving of unexpired copies to different volumes. If only expired copies exist on a given volume, a site-defined action is taken. For example, such a volume can be relabeled for immediate reuse or exported to offsite storage, thus keeping a separate historical record of file changes. Users are unaware of the recycling process as it relates to their data files.

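This chapter includes the following topics:

- Recycler Overview
- Recycling Directives
- Configuring the Recycler
- Troubleshooting the Recycler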


Recycler Overview

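The recycler keeps the amount of space consumed by expired archive copies to a minimum as defined by site-specified parameters. At any time, the space on a given archive volume consists of the following:

- Current data: archive images that are still referenced by a file in the file system.
- Expired data: archive images that are no longer referenced by any file in the file system.
- Free space: space that is not occupied by current or expired archive images.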

The capacity of a volume is the total amount of space for data on a volume. For example, a 10-gigabyte tape volume with 3 gigabytes written to it has a capacity of 10 gigabytes and 7 gigabytes of free space.

New or newly labeled archive media starts out with all its capacity as free space. As data is archived to the media, the amount of free space decreases and the amount of current data increases.

As archived files in the file system are changed or removed, their archive images expire and they move from the current data classification to the expired data classification. The physical space used by these images remains the same; there is simply no longer a file in the file system pointing to that space.

These expired images (and thus, expired data) would eventually consume all free space. Only when space is recycled can these images be removed and the space they occupy become free. The goal of the recycler is to transform space used by expired data into free space without losing any current data.
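The space accounting described above can be illustrated with a small Python model. This is only a sketch for reasoning about the three classifications; the class and field names are hypothetical and are not part of Sun StorEdge SAM-FS.

```python
from dataclasses import dataclass


@dataclass
class VolumeSpace:
    """Illustrative space accounting for one archive volume, in gigabytes."""
    capacity_gb: float  # total space for data on the volume
    current_gb: float   # archive images still referenced by a file
    expired_gb: float   # archive images no longer referenced by any file

    @property
    def free_gb(self) -> float:
        # Free space is whatever is not occupied by current or expired images.
        return self.capacity_gb - self.current_gb - self.expired_gb

    def expire(self, size_gb: float) -> None:
        """A file changed or was removed: its image moves from current to expired."""
        if size_gb > self.current_gb:
            raise ValueError("cannot expire more data than is current")
        self.current_gb -= size_gb
        self.expired_gb += size_gb


# The 10-gigabyte tape from the text, with 3 gigabytes written to it.
tape = VolumeSpace(capacity_gb=10.0, current_gb=3.0, expired_gb=0.0)
free_before = tape.free_gb

tape.expire(1.0)  # 1 gigabyte of archived files is modified or deleted
free_after = tape.free_gb

print(free_before, free_after, tape.current_gb, tape.expired_gb)
```

In this model, expiring data does not change the free space. Only recycling the volume turns expired space back into free space.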

For example, removable media cartridges, such as tapes, can only be appended to. They cannot be rewritten in place. The only way to reuse a cartridge is to move all of the current data off of the cartridge, relabel the cartridge, and start using it again from the beginning.

You initiate recycling by entering the sam-recycler(1M) command. This can be done manually or through a cron(1) job. TABLE 7-1 shows recycling methods.

TABLE 7-1 Recycling Methods and Media Types

Recycling Method        Media and Notes

By automated library    Removable media cartridges.
                        When you recycle by library, you put recycling
                        directives in the recycler.cmd file.

By archive set          Removable media cartridges and disk.
                        When you recycle by archive set, you do not use a
                        recycler.cmd file. You put all your recycling
                        directives in the archiver.cmd file.


Note that you can recycle either by library or by archive set. As TABLE 7-1 shows, if you are archiving to disk, you can recycle only by archive set.

The recycler and the archiver work together, as follows:

1. The recycler marks all the current (valid) archive images that are present on a volume with the rearchive attribute.

2. If you are archiving to removable media, the recycler marks the selected archive volume with the recycle attribute. This prevents the archiver from writing any more archive images to the volume.

3. The archiver moves all the marked images to another volume. This operation is called rearchiving. After the archiver moves the current archive images from the old volume to the new volume, the old volume contains only free space and expired space. If you are archiving to removable media cartridges, you can relabel and reuse the cartridge. If you are archiving to disk, the recycler removes the file that contains the expired archive images.
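The three steps above can be pictured with a toy Python model. It is a sketch only: the classes, attribute names, and functions are hypothetical stand-ins for the rearchive and recycle attributes, and they do not reflect how the recycler or archiver is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    name: str
    expired: bool = False
    rearchive: bool = False  # stand-in for the rearchive attribute


@dataclass
class Volume:
    vsn: str
    images: list = field(default_factory=list)
    recycle: bool = False    # stand-in for the recycle attribute


def recycler_mark(volume: Volume) -> None:
    """Steps 1 and 2: mark current images for rearchiving and flag the volume."""
    for image in volume.images:
        if not image.expired:
            image.rearchive = True
    volume.recycle = True


def archiver_rearchive(source: Volume, target: Volume) -> None:
    """Step 3: move every marked image to another volume."""
    if target.recycle:
        raise ValueError("cannot write to a volume that is being recycled")
    moved = [image for image in source.images if image.rearchive]
    source.images = [image for image in source.images if not image.rearchive]
    for image in moved:
        image.rearchive = False
    target.images.extend(moved)


old = Volume("OLD001", [Image("a"), Image("b", expired=True), Image("c")])
new = Volume("NEW001")

recycler_mark(old)
archiver_rearchive(old, new)

print([image.name for image in old.images])  # only expired images remain
print([image.name for image in new.images])  # current images now live here
```

After both functions run, the old volume holds only expired images, which is the point at which a cartridge can be relabeled or a disk archive file removed.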

The recycler is designed to run periodically. It performs as much work as it can each time it is invoked. The recycler has to finish marking copies for rearchiving before the archiver can rearchive the files.

Sometimes expired archive images, with the rearchive attribute set, remain on media. This can happen under the following conditions:

Between executions, the recycler keeps state information in the library catalogs and the inodes. During the recycling process, you can use the sls(1) command and its -D option to display information about a file. The output from the sls(1) command shows whether or not a file is scheduled for rearchiving.


Recycling Directives

The recycler.cmd file accepts the directives described in the following sections:

Specifying a Log File: the logfile Directive

The logfile directive specifies a recycler log file. This directive has the following format:

logfile = filename

For filename, specify the path to the log file.

The following is an example of a logfile= directive line:

logfile=/var/adm/recycler.log

Preventing Recycling: the no_recycle Directive

The no_recycle directive enables you to prevent recycling of volumes. To specify the VSNs, you use regular expressions and one or more specific media types. This directive has the following format:

no_recycle media_type VSN_regex [ VSN_regex ... ]

TABLE 7-2 Arguments for the no_recycle Directive

Argument      Meaning

media_type    Specify a media type from the mcf(4) man page.

VSN_regex     Specify one or more space-separated regular expressions to
              describe the volumes. For information on the format of a
              regex, see the regexp(5) man page or see File Name
              search_criteria Using Pattern Matching: -name regex.

By specifying a media_type, you can prevent the recycling of volumes stored on a particular type of media. One or more VSN_regex specifications enable you to use regular expressions to identify specific cartridges to be excluded from recycling.

For example, the following directive line excludes from recycling any tape volumes whose VSN identifiers begin with DLT:

no_recycle lt DLT.*
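To preview which volumes a pattern such as DLT.* would cover, you can try it against a list of VSNs. The following Python sketch uses the re module only as a stand-in for regexp(5), whose syntax can differ, and it assumes the pattern is matched from the start of the VSN, as the description above implies. The VSN names are made up for illustration.

```python
import re

# Pattern taken from the no_recycle example line.
pattern = re.compile(r"DLT.*")

# Hypothetical VSNs; substitute the names from your own library catalog.
vsns = ["DLT001", "DLT9A2", "000003", "XDLT01"]

# re.match anchors at the start of the string (an assumption about how
# the recycler applies the expression).
excluded = [vsn for vsn in vsns if pattern.match(vsn)]
still_eligible = [vsn for vsn in vsns if not pattern.match(vsn)]

print("excluded from recycling:", excluded)
print("still eligible:", still_eligible)
```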

Specifying Recycling for an Entire Automated Library: the Library Directive

The library directive enables you to specify various recycling parameters for the VSNs associated with a specific library. This directive has the following format:

library parameter [ parameter ... ]

For library, specify the library's name as specified in the Family Set field of the mcf(4) file.

For parameter, specify one or more space-separated parameter keywords from TABLE 7-3.

TABLE 7-3 Library Directive parameter Values

parameter                 Action

-dataquantity size        Limits the amount of data that the recycler can
                          schedule for rearchiving in its efforts to clear
                          volumes of useful data. Default is 1 gigabyte.

-hwm percent              Library high watermark. Default is 95.

-ignore                   Prevents volumes in this library from being
                          recycled. This directive is useful when testing
                          the recycler.cmd file.

-mail [ email_address ]   Sends email messages to the designated
                          email_address. By default, no email is sent. If
                          -mail is specified with no argument, email is
                          sent to root.

-mingain value            Minimum VSN gain. Default is 50.

-vsncount count           Limits the number of volumes to be recycled to
                          count. Default is 1.


For example, consider the following directive line:

gr47 -hwm 85 -ignore -mail root -mingain 40

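It specifies the following for library gr47:

- The library high watermark is 85 percent (-hwm 85).
- Volumes in the library are not recycled, because -ignore is specified.
- Recycler email messages are sent to root (-mail root).
- The minimum VSN gain is 40 percent (-mingain 40).
- The -dataquantity and -vsncount parameters are not specified, so their defaults from TABLE 7-3 apply.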
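As a way to check your reading of a library directive line, the following Python sketch splits a line into the parameters from TABLE 7-3 and fills in the documented defaults. It is an illustration only, not the parser that the recycler uses; the function name and the dictionary layout are hypothetical.

```python
# Defaults as listed in TABLE 7-3. "mail": None means that no email is sent.
DEFAULTS = {
    "dataquantity": "1 gigabyte",
    "hwm": 95,
    "ignore": False,
    "mail": None,
    "mingain": 50,
    "vsncount": 1,
}


def parse_library_directive(line):
    """Return (library_name, settings) for one library directive line."""
    tokens = line.split()
    name, params = tokens[0], tokens[1:]
    settings = dict(DEFAULTS)
    i = 0
    while i < len(params):
        key = params[i].lstrip("-")
        if key == "ignore":
            settings["ignore"] = True
            i += 1
        elif key == "mail":
            # The address is optional; with no argument, mail goes to root.
            if i + 1 < len(params) and not params[i + 1].startswith("-"):
                settings["mail"] = params[i + 1]
                i += 2
            else:
                settings["mail"] = "root"
                i += 1
        elif key in ("hwm", "mingain", "vsncount"):
            settings[key] = int(params[i + 1])
            i += 2
        elif key == "dataquantity":
            settings[key] = params[i + 1]  # kept as text; size syntax not modeled
            i += 2
        else:
            raise ValueError(f"unknown parameter: {params[i]}")
    return name, settings


library, settings = parse_library_directive(
    "gr47 -hwm 85 -ignore -mail root -mingain 40"
)
print(library, settings)
```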


Configuring the Recycler

Prior to configuring the recycler, note the following:

The recycler is not enabled by default. You must initiate recycling by entering the sam-recycler(1M) command. When the recycler is initiated, the default recycler settings specified in Specifying Recycling for an Entire Automated Library: the Library Directive take effect. For more information on the recycler, see the sam-recycler(1M) man page.

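The following sections describe the process for configuring the recycler. This process includes the following steps:

- Step 1: Creating a recycler.cmd File (Optional)
- Step 2: Editing the archiver.cmd File (Optional)
- Step 3: Running the Recycler
- Step 4: Creating a crontab File for the Recycler (Optional)
- Step 5: Removing -recycle_ignore and ignore Parameters
- Step 6: Creating a recycler.sh File (Optional)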

If you are archiving to cartridges in a library, this process includes creating a recycler.cmd file and, optionally, editing the archiver.cmd file. If you are archiving to disk, you can recycle only by archive set, so to enable recycling of these disk volumes, you edit the archiver.cmd file. The following procedure describes configuring the recycler for any archive media.


Step 1: Creating a recycler.cmd File (Optional)

Perform this step if you are recycling archive copies on cartridges in a library.

If you are recycling archive copies on disk volumes, skip this step, because recycling is controlled by directives in the archiver.cmd file. For information on configuring recycling in the archiver.cmd file, see Step 2: Editing the archiver.cmd File (Optional).

The recycler.cmd file contains general recycling directives and can also contain directives for each library in the Sun StorEdge SAM-FS environment. For information on the recycling directives, see Recycling Directives.

Even if you are recycling by archive set, you still should configure each library in the recycler.cmd file. This ensures that VSNs that do not fall into an archive set can be recycled if needed.

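A typical recycler.cmd file contains the following directive lines:

- A logfile= directive line that specifies a recycler log file.
- One or more library directive lines, one for each library that contains volumes to be recycled.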

Because you are still creating the recycler.cmd file and it has not yet been tested, include the -ignore parameter on each library directive line. You remove the -ignore parameter in a later step in this process.

To create a recycler.cmd file, perform the following steps:

1. Become superuser.

2. Use vi(1) or another editor to open file /etc/opt/SUNWsamfs/recycler.cmd.

3. Add one or more directives described in this chapter to control recycler activity.

4. Save and close the file.

Example recycler.cmd File

CODE EXAMPLE 7-1 shows an example of a recycler.cmd file.

CODE EXAMPLE 7-1 A recycler.cmd File Example
logfile = /usr/tmp/recycler.log
stk30  -hwm 51  -mingain 60  -ignore  -mail  root

The following sections describe the parameters specified in CODE EXAMPLE 7-1.

The -hwm 51 Parameter

By specifying a high watermark, you can set the percentage of media usage below which recycling cannot occur. This percentage is the ratio of the used space in the library to its total capacity. As an example, a library that holds ten 20-gigabyte tapes, three of them 100 percent full and the remaining seven each 30 percent full, has the following media utilization percentage:

((3 * 1.00 + 7 * 0.30) * 20G) / (10 * 20G) * 100% = 51%

Note that this calculation does not distinguish between current data and expired data. It only addresses the amount of media used.

In this example, if the high watermark is set higher than 51 percent, media utilization is below the watermark, and the recycler does not automatically select any of the automated library's VSNs for recycling.
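The utilization arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical, and the comparison assumes that recycling is considered only when utilization exceeds the high watermark; how the recycler treats exact equality is not modeled here.

```python
def library_utilization(volumes):
    """Return used space as a percentage of total capacity.

    volumes: list of (capacity_gb, used_fraction) pairs, one per cartridge.
    Current and expired data both count as used space.
    """
    total = sum(capacity for capacity, _ in volumes)
    used = sum(capacity * fraction for capacity, fraction in volumes)
    return 100.0 * used / total


def exceeds_watermark(utilization_percent, hwm_percent):
    """Assumption: recycling is considered only above the high watermark."""
    return utilization_percent > hwm_percent


# Ten 20-gigabyte tapes: three are full, seven are 30 percent full.
tapes = [(20, 1.00)] * 3 + [(20, 0.30)] * 7
utilization = library_utilization(tapes)

print(round(utilization, 1))
print(exceeds_watermark(utilization, 95))  # default -hwm value
print(exceeds_watermark(utilization, 40))  # a lower, site-chosen value
```

With the default watermark of 95 the library in this example is left alone, while a watermark of 40 would make its volumes candidates for recycling.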



Note - You can force a VSN to be recycled by using the following command to set the recycling flag:

# chmed +c lt.AAA123

When the +c flag is set, the archiver does not write any more archive images to the volume. The +c flag can be viewed through the samu(1M) utility. For more information, see the chmed(1M) and samu(1M) man pages.



The -mingain 60 Parameter

The minimum VSN gain percentage sets a lower limit on the amount of space to be gained by recycling a cartridge. For example, if a cartridge in an automated library is 95 percent current data and 5 percent expired data, the gain obtained by recycling the cartridge is only 5 percent. It might not be worth moving the other 95 percent to retrieve this space. Setting the minimum-gain to 6 percent or more inhibits the recycler from automatically selecting this example VSN.

Another example is a cartridge with 90 percent expired data, 5 percent current data, and 5 percent free space. This would have a gain of 90 percent if recycled.
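The two cartridges above can be compared against a minimum gain setting with a few lines of Python. This is a sketch under the assumption that the gain equals the percentage of expired data on the cartridge and that a cartridge qualifies when its gain is at least the -mingain value; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cartridge:
    vsn: str
    current_percent: int
    expired_percent: int
    free_percent: int

    @property
    def gain_percent(self) -> int:
        # Recycling frees exactly the space held by expired data.
        return self.expired_percent


def meets_mingain(cartridge: Cartridge, mingain_percent: int) -> bool:
    return cartridge.gain_percent >= mingain_percent


# Hypothetical VSNs for the two cartridges described in the text.
mostly_current = Cartridge("AAA001", current_percent=95, expired_percent=5, free_percent=0)
mostly_expired = Cartridge("AAA002", current_percent=5, expired_percent=90, free_percent=5)

for cartridge in (mostly_current, mostly_expired):
    print(cartridge.vsn, cartridge.gain_percent, meets_mingain(cartridge, 60))
```

With -mingain 60, only the second cartridge would be a candidate, because moving 5 percent of current data recovers 90 percent of the cartridge.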

The -ignore Parameter

The -ignore parameter keeps the recycler from recycling a particular library and should be used when you are configuring the recycler.

The -mail root Parameter

The -mail parameter specifies that the recycler send mail when recycling occurs on a given library. The mail message has the following subject line:

Robot robot-name recycle

CODE EXAMPLE 7-2 shows sample message bodies.

CODE EXAMPLE 7-2 Sample Recycling Messages
I will recycle VSN vsn.
Cannot find any candidate VSN in this media changer.
Previously selected VSN vsn is not yet finished recycling.
Previously selected VSN vsn is now finished recycling. It will now be post-recycled.


Step 2: Editing the archiver.cmd File (Optional)

Perform this step if you are recycling by archive set. If you are archiving to disk, recycling by archive set is the only method available, so you must complete this step in order to recycle.

If you are recycling by library, you can proceed to the next step.

To edit the archiver.cmd file, perform the procedure called To Create or Modify an archiver.cmd File and Propagate Your Changes.

The directives that you add to the archiver.cmd file to enable recycling by archive set must appear between the params and endparams directives. TABLE 7-4 shows the archive set recycling directives that you can use.

TABLE 7-4 Archive Set Recycling Directives

Directive                         Function

-recycle_dataquantity size        Limits the amount of data that the
                                  recycler can schedule for rearchiving in
                                  its efforts to clear volumes of useful
                                  data.

-recycle_hwm percent              Sets the high watermark percentage.

-recycle_ignore                   Prevents the archive set from being
                                  recycled.

-recycle_mailaddr mail_address    Sends recycler messages to mail_address.

-recycle_mingain percent          Limits recycling to those VSNs that would
                                  increase their free space by percent or
                                  more.

-recycle_vsncount count           Limits the number of volumes to be
                                  rearchived to count.


For more information about the preceding directives, see Archiving or see the archiver.cmd(4) man page.

CODE EXAMPLE 7-3 shows an archiver.cmd example for recycling disk archives.

CODE EXAMPLE 7-3 Disk Archiving Specifications in the archiver.cmd File
fs = samfs1
    1 2m
 
arset0  testdir0
    1 2m
    2 4m
 
arset1  testdir1
    1 2m
    2 4m
 
params
arset0.1 -disk_archive disk01 -recycle_hwm 80 \
    -recycle_mingain 20 -recycle_ignore
arset1.1 -disk_archive disk02 -recycle_hwm 80 \
    -recycle_mingain 20 -recycle_ignore
endparams


Step 3: Running the Recycler

1. Issue the sam-recycler(1M) command.

The recycler reads the recycler.cmd file.

2. Examine the standard output, log, SAM log, and /var/adm/messages for any error messages from the recycler.

Correct your files if errors appear.

CODE EXAMPLE 7-4 shows a sample recycler log file for recycling removable media cartridges.

CODE EXAMPLE 7-4 Recycler Log File Example for Removable Media Cartridges
========== Recycler begins at Wed Dec 12 14:05:21 2001 ===========
Initial 2 catalogs:
 
0  Family: m160                 Path: /var/opt/SUNWsamfs/catalog/m160
   Vendor: ADIC                 Product: Scalar 100
   SLOT                  ty    capacity         space vsn
      0                  at        25.0G        25.0G CLN005
      1                  at        48.5G         6.1G 000003
      2                  at        48.5G        32.1G 000004
      3                  at        48.5G        35.1G 000005
      4                  at        48.5G        44.6G 000044
      5                  at        48.5G        45.1G 000002
      6                  at        48.5G        45.9G 000033
      7                  at        48.5G        48.5G 000001
   Total Capacity:  364.8G bytes, Total Space Available: 282.3G bytes
   Volume utilization 22%, high 95% VSN_min 50%
   Recycling is ignored on this robot.
 
 
 
 
1  Family: hy                   Path: /var/opt/SUNWsamfs/catalog/historian
   Vendor: Sun SAM-FS           Product: Historian
   SLOT                  ty    capacity         space vsn
      (no VSNs in this media changer)
   Total Capacity:  0    bytes, Total Space Available: 0    bytes
   Volume utilization 0%, high 95% VSN_min 50%
   Recycling is ignored on this robot.
 
 
 
8 VSNs:
 
                    ---Archives---   -----Percent-----   m160
 ----Status-----    Count    Bytes   Use Obsolete Free   Library:Type:VSN
no-data VSN            0      0        0    87     13    m160:at:000003
no-data VSN            0      0        0    33     67    m160:at:000004
no-data VSN            0      0        0    27     73    m160:at:000005
no-data VSN            0      0        0     8     92    m160:at:000044
no-data VSN            0      0        0     7     93    m160:at:000002
no-data VSN            0      0        0     5     95    m160:at:000033
empty VSN              0      0        0     0    100    m160:at:CLN005
empty VSN              0      0        0     0    100    m160:at:000001
 
 
 
Recycler finished.
 
========== Recycler ends at Wed Dec 12 14:05:32 2001 ===========
 

CODE EXAMPLE 7-5 shows a sample recycler log file for recycling disk archive files.

CODE EXAMPLE 7-5 Recycler Log File Example for Disk Archive Files
---Archives---   -----Percent-----
 ----Status-----    Count    Bytes   Use Obsolete Free   Library:Type:VSN
new candidate          0      0        0    41     59  <none>:dk:disk01
 
 
677 files recycled from VSN disk01 (mars:/sam4/copy1)
0 directories recycled from VSN disk01 (mars:/sam4/copy1)


Step 4: Creating a crontab File for the Recycler (Optional)

If the system is performing as expected, you are ready to make a crontab entry for the superuser to run the recycler periodically. You might want to run the recycler no more than once every two hours, depending on your site's conditions.

Create a crontab entry.

For information about this, see the cron(1M) man page.

The following example entry in root's crontab file ensures that the cron daemon runs the recycler at five minutes after the hour for every odd-numbered hour:

5 1,3,5,7,9,11,13,15,17,19,21,23   * * * /opt/SUNWsamfs/sbin/sam-recycler
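To confirm what schedule an entry like this produces, you can expand its minute and hour fields. The Python sketch below handles only the simple comma-separated form used in the example, not the full crontab syntax, and the variable names are illustrative.

```python
entry = "5 1,3,5,7,9,11,13,15,17,19,21,23   * * * /opt/SUNWsamfs/sbin/sam-recycler"

# A crontab line has five time fields followed by the command.
minute, hour, day_of_month, month, day_of_week, command = entry.split(None, 5)

hours = [int(h) for h in hour.split(",")]
run_times = [f"{h:02d}:{int(minute):02d}" for h in hours]

# Hours between consecutive runs within one day.
gaps = {later - earlier for earlier, later in zip(hours, hours[1:])}

print(command)
print(run_times)
print(gaps)
```

The entry yields twelve runs per day, two hours apart, which stays within the suggestion to run the recycler no more than once every two hours.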


Step 5: Removing -recycle_ignore and ignore Parameters

1. Use vi(1) or another editor to remove the -recycle_ignore parameters from the archiver.cmd file.

2. Use vi(1) or another editor to remove the -ignore parameters from the recycler.cmd file.

You are now recycling.


Step 6: Creating a recycler.sh File (Optional)

Perform this step if you are recycling archive copies on removable media cartridges. If you are archiving only to disk, do not perform this step.

The recycler executes the recycler.sh script when all the current images from a VSN have been rearchived to another VSN. For an example, see the recycler.sh(1M) man page. Another example, found in /opt/SUNWsamfs/examples/recycler.sh, shows how to relabel a recycled VSN and send mail to the superuser.

The recycler calls the /opt/SUNWsamfs/sbin/recycler.sh script with the following arguments:

Media type: $1  VSN: $2  Slot: $3  Eq: $4

The /opt/SUNWsamfs/sbin/recycler.sh script is called when the recycler determines that a VSN has been drained of all known active archive copies. You should determine your site requirements for disposing of recycled cartridges. Some sites choose to relabel and reuse the cartridges; others choose to remove the cartridges from the automated library to use later for accessing historical files. For more information, see the sam-recycler(1M) and recycler.sh(1M) man pages.


Troubleshooting the Recycler

The most frequent problem encountered with the recycler occurs when the recycler generates a message similar to the following when it is invoked:

Waiting for VSN mo:OPT000 to drain, it still has 123 active archive copies.

One of the following conditions can cause the recycler to generate this message:

Condition 1 can exist for one of the following reasons:

To determine which condition is in effect, run the recycler with the -v option. As CODE EXAMPLE 7-6 shows, this option displays the path names of the files associated with the 123 archive copies in the recycler log file.

CODE EXAMPLE 7-6 Recycler Messages
Archive copy 2 of /sam/fast/testA resides on VSN LSDAT1
Archive copy 1 of /sam3/tmp/dir2/filex resides on VSN LSDAT1
Archive copy 1 of Cannot find pathname for file system /sam3 inum/gen 30/1 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilA00 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilF82 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilV03 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gink/tstfilA06 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gink/tstfilA33 resides on VSN LSDAT1
Waiting for VSN dt:LSDAT1 to drain, it still has 8 active archive copies.

In this example output, messages containing seven path names are displayed along with one message that includes Cannot find pathname... text. To correct the problem with LSDAT1 not draining, you need to determine why the seven files cannot be rearchived. After the seven files are rearchived, only one archive copy is not associated with a file. Note that this condition should occur only as the result of a system crash that partially corrupted the .inodes file.
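When the -v output is long, a small script can separate the copies that still belong to files from the copies whose path name cannot be found. The following Python sketch parses lines in the format shown in CODE EXAMPLE 7-6. The regular expression and function name are assumptions based on that sample, not a documented log format.

```python
import re

SAMPLE = """\
Archive copy 2 of /sam/fast/testA resides on VSN LSDAT1
Archive copy 1 of /sam3/tmp/dir2/filex resides on VSN LSDAT1
Archive copy 1 of Cannot find pathname for file system /sam3 inum/gen 30/1 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilA00 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilF82 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gunk/tstfilV03 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gink/tstfilA06 resides on VSN LSDAT1
Archive copy 1 of /sam7/hgm/gink/tstfilA33 resides on VSN LSDAT1
Waiting for VSN dt:LSDAT1 to drain, it still has 8 active archive copies.
"""

# Pattern inferred from the sample lines above.
COPY_LINE = re.compile(r"^Archive copy (\d+) of (.+) resides on VSN (\S+)$")


def classify_copies(log_text):
    """Split archive-copy lines into files with path names and orphans."""
    with_path, orphaned = [], []
    for line in log_text.splitlines():
        match = COPY_LINE.match(line.strip())
        if match is None:
            continue  # for example, the "Waiting for VSN" summary line
        _copy_number, target, _vsn = match.groups()
        if target.startswith("Cannot find pathname"):
            orphaned.append(target)
        else:
            with_path.append(target)
    return with_path, orphaned


with_path, orphaned = classify_copies(SAMPLE)
print(len(with_path), "copies belong to files that still need rearchiving")
print(len(orphaned), "copies have no path name")
```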

To solve the problem of finding the path name, run samfsck(1M) to reclaim orphan inodes. If you choose not to run samfsck(1M), or if you are unable to unmount the file system to run samfsck(1M), you can manually relabel the cartridge after verifying that the recycler -v output is clean of valid archive copies. However, because the recycler continues to encounter the invalid inode remaining in the .inodes file, the same problem might recur the next time the VSN is a recycle candidate.

Another recycler problem occurs when the recycler fails to select any VSNs for recycling. To determine why each VSN was rejected, you can run the recycler with the -d option. This displays information on how the recycler selects VSNs for recycling.