Online File Recovery (Snapshots)
Last updated November 17, 2009
Introduction
CCIS uses a Network Appliance file server to provide NFS and CIFS service to our machines. This server has a very handy feature known as snapshots which allows for online file recovery of recently changed or deleted files. This document will show you how this mechanism works and how to use it.
How Does It Work?
The Network Appliance File server (“the NetApp”) uses a custom filesystem and other techniques to perform its snapshot magic. For more technical information on their very clever implementation, we recommend you read their technical report.
Here’s how it works from a user’s point of view: once every X hours, the NetApp “makes a snapshot” of the current filesystem. This snapshot, which essentially looks like a read-only copy of the filesystem at that point, gets placed into a special directory at the top of your home directory and other filesystem mount points. There are also hidden snapshot directories in every “snapshot-ed” directory.
The snapshot mechanism is independent of our normal tape backup mechanism.
How Often Do Snapshots Get Taken And How Long Do They Last?
At CCIS, “hourly” snapshots on our /home and /proj filesystems happen at:
- 8:00 am
- 12:00 pm
- 4:00 pm
- 8:00 pm
In addition to this, a “daily” snapshot is taken at midnight every day, and a “weekly” snapshot is made at midnight every Sunday.
At any one time, we try to keep the following snapshots live in the filesystem:
- four hourly snapshots
- three daily snapshots
- one weekly snapshot
How Do I Use This Feature?
The best way to see how this feature works is to see it in action. Let’s look at a simple scenario where you’ve deleted a file in your “mail” directory and you want to get it back. There are three simple steps:
- Change to the .snapshot directory
In the directory where you lost your file, there is a magically hidden .snapshot directory. This directory does not appear to ls or any other command unless you explicitly request it. You should cd .snapshot, even though it is not visible:
    % cd .snapshot
Once in this directory, you’ll see eight other directories, each representing a snapshot:
    % ls -lu
    total 32
    drwx------ 2 dnb 4096 Jan 11 16:01 hourly.0
    drwx------ 2 dnb 4096 Jan 11 12:01 hourly.1
    drwx------ 2 dnb 4096 Jan 11 08:00 hourly.2
    drwx------ 2 dnb 4096 Jan 10 20:01 hourly.3
    drwx------ 2 dnb 4096 Jan 10 00:00 nightly.0
    drwx------ 2 dnb 4096 Jan  9 00:00 nightly.1
    drwx------ 2 dnb 4096 Jan  8 00:00 nightly.2
    drwx------ 2 dnb 4096 Jan 11 00:01 weekly.0
We’ve used ls -lu here to display when each snapshot was taken.
- Find the version of the file you need
The easiest way to find the file you need is to use something like ls -l */filename from within this directory. For instance, if a user (with user name dnb) wanted to locate possible copies of the file userdata:
    % ls -l */userdata
    -rw------- 1 dnb  81920 Jan 10 22:53 hourly.0/userdata
    -rw------- 1 dnb  81920 Jan 10 22:53 hourly.1/userdata
    -rw------- 1 dnb  81920 Jan 10 22:53 hourly.2/userdata
    -rw-r--r-- 1 dnb 155648 Jan 10 19:33 hourly.3/userdata
    -rw------- 1 dnb  81920 Jan 10 22:53 weekly.0/userdata
This shows that there are five copies of the file available, four of which are the same.
- Copy the file you need back to the active filesystem
One can use the standard cp command:
    % cp hourly.0/userdata ../userdata.new
You’ll notice that I’ve copied the file from the snapshot to a different name than the original. This is especially important when you are retrieving an old version of a file if the original version still exists in the active filesystem.
cp will not overwrite files with the same name as snapshot copies.
Also note that we’ve used cp, not mv, since data in .snapshot is always read-only, so it can’t be moved out of that directory.
That’s the whole file recovery process!
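The last two steps can also be wrapped in a pair of small shell functions. This is only a sketch, not something installed on our machines: the function names list_snapshot_copies and restore_snapshot_copy are made up for illustration, and the .new suffix simply follows the example above. The first function prints a checksum for each snapshot copy so identical versions are easy to spot; the second copies one version back under a new name and refuses to clobber an existing file.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any installed tool).

# list_snapshot_copies DIR FILE
# Print "checksum size path" for every copy of FILE under DIR/.snapshot.
# Copies with the same checksum and size have identical contents.
list_snapshot_copies() {
    dir=$1
    name=$2
    for copy in "$dir"/.snapshot/*/"$name"; do
        # Skip snapshots that have no copy (or an unmatched glob pattern).
        [ -f "$copy" ] || continue
        cksum "$copy"
    done
}

# restore_snapshot_copy DIR SNAPSHOT FILE
# Copy DIR/.snapshot/SNAPSHOT/FILE to DIR/FILE.new.
# Fails rather than overwriting an existing DIR/FILE.new.
restore_snapshot_copy() {
    dir=$1
    snap=$2
    name=$3
    src=$dir/.snapshot/$snap/$name
    dest=$dir/$name.new
    if [ ! -f "$src" ]; then
        echo "no such snapshot copy: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    cp "$src" "$dest"
}
```

For the scenario above, one might run list_snapshot_copies ~/mail userdata to compare versions, then restore_snapshot_copy ~/mail hourly.3 userdata to bring back the older copy as ~/mail/userdata.new.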
Other Useful Snapshot Information
Here are a few other useful pieces of information about snapshots:
- Files found in .snapshot do not count against your quota.
- Unless you specify otherwise, most programs which crawl the filesystem will report the files in the .snapshot directory. We have modified /arch/gnu/bin/find and /arch/gnu/bin/du to ignore these directories (use find-with-snapshots and du-with-snapshots to get the original behavior), but all other programs like this will require some attention. For instance, Professor Wand was kind enough to point out that one has to include .snapshot/* in your .glimpse_exclude to get glimpse to ignore this directory.
- The .snapshot directory in your home directory and at the top of the mounted filesystems offers another way to get to snapshot files. For instance, ~/.snapshot/hourly.0/dir/dir/file is the same as ~/dir/dir/.snapshot/hourly.0/file.
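If you are using a find that has not been modified to skip snapshot directories, one common approach is to prune them explicitly. The sketch below assumes a standard POSIX find; the function name list_files_without_snapshots is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative only).

# list_files_without_snapshots DIR
# Print every regular file under DIR, without descending into any
# directory named .snapshot.
list_files_without_snapshots() {
    # -prune stops find from walking into a matching directory;
    # the -o branch prints regular files found everywhere else.
    find "$1" -name .snapshot -prune -o -type f -print
}
```

For example, list_files_without_snapshots ~ lists the files in your home directory while leaving out everything under ~/.snapshot.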
For further information
If you still have questions about snapshots after reading this document, please send mail to systems@ccs.neu.edu. We hope you enjoy this useful facility.