AFS and Performance

AFS, the Andrew File System, was developed at Carnegie Mellon University in the early 1980s as part of a larger project to permeate the University with computer technology. Within a few years it was in wide use at many large universities throughout the world, the University of Michigan among them. The following is a very short description of AFS and of how using it can penalize your applications in some cases. The description is necessarily simplified, and many complexities have been glossed over.

AFS Internals

AFS diagram

AFS is a system for managing remote files. All of your files are on a central set of servers. The complete set of Unix file semantics is preserved: your applications reference files as you'd expect, and they just work with no changes to your code or your libraries. For this discussion the primary attribute of interest is that AFS is a FILE caching system. This means that when a file is referenced or modified, the entire file is first moved between the server and the client machine.

A key AFS assumption is that in the vast majority of cases, once a file is referenced, it will be referenced again relatively soon afterwards. This is its great strength, compared to NFS for instance. It is also the largest weakness of AFS. Consider the following Perl code (the system would behave the same, regardless of the programming language used):

    my $file = "myfile.txt";
    open(my $in, '<', $file)      # STEP A
        or die "Unable to open '$file': $!\n";
    my $line = <$in>;             # read only the first line
    close($in);                   # STEP B

The program simply reads the first line of a file. Exactly what really happens at STEP A depends on several factors:

When the file is closed, STEP B, there are things for AFS to do too:

From this sequence you can see that the first time a file is opened, the entire file is copied to the client machine. This clearly introduces some extra overhead and time. The second time the file is opened, however, response is very quick, as there is little extra bookkeeping necessary. Each time the file is modified, though, the entire file is copied back to the server before the close() will complete.
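One way to see this caching for yourself is to time two opens of the same file, one right after the other. The following is only a minimal sketch: the names time_first_line and compare_opens are made up for this illustration, and the numbers you get will depend on your network, your servers, and whether the file was already in the cache before you started.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Open a file, read its first line, close it, and return the elapsed seconds.
sub time_first_line {
    my ($file) = @_;
    my $start = time();
    open(my $in, '<', $file)
        or die "Unable to open '$file': $!\n";
    my $line = <$in>;
    close($in);
    return time() - $start;
}

# Hypothetical helper: time two successive opens of the same file.
sub compare_opens {
    my ($file) = @_;
    my $first  = time_first_line($file);   # may include copying the file to the client
    my $second = time_first_line($file);   # the file should now be in the local cache
    return ($first, $second);
}

# Run as: perl compare_opens.pl /path/to/some/file
if (@ARGV) {
    my ($first, $second) = compare_opens($ARGV[0]);
    printf "first open:  %.6f s\nsecond open: %.6f s\n", $first, $second;
}
```

On a file in AFS that has not been referenced recently, you would expect the first number to be noticeably larger than the second. On a local file the two should be about the same.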

Performance Issues

With all this file copying being done, it's something of a wonder the whole thing works and appears to perform reasonably. One of the reasons AFS is popular in large, complex environments is that it allows centralized files and backup without the whole system falling apart when loaded. The cost of copying a file over the net can vary widely, based on

These variables can make a difference in the performance of your application, but over time, they even out. Usually we assume that AFS files are as fast as non-AFS and for the vast majority of files, this is true. There are some cases where we pay a very high penalty for having files in AFS, so let's explore these issues.

/tmp

First of all, realize that we can use non-AFS file systems. Every Unix machine has a local disk which is not in AFS, and every Unix machine has a directory /tmp on the local disk. (Paths in AFS begin with /afs; /tmp does not, so it is on the local disk and not in AFS.) Some things to know about /tmp are:

Knowing this about /tmp, we can formulate some guidelines for when using /tmp is a good idea for applications:
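As one illustration of the kind of use this section has in mind, a program can build its output in a scratch file on the local disk and copy the finished file to its final home just once. This is a sketch only, not a rule: write_via_scratch and its parameters are names invented for this example, and it assumes the scratch data is something you can afford to lose if the machine goes down.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempfile);

# Hypothetical helper: write records to a scratch file on local disk,
# then copy the finished file to $dest (for example, a path in AFS) once.
sub write_via_scratch {
    my ($dest, $records, $scratch_dir) = @_;
    $scratch_dir = '/tmp' unless defined $scratch_dir;

    # tempfile() creates a uniquely named file in $scratch_dir.
    my ($out, $scratch) =
        tempfile('scratchXXXXXX', DIR => $scratch_dir, UNLINK => 1);

    for my $record (@$records) {
        print {$out} "$record\n"
            or die "write to '$scratch' failed: $!\n";
    }
    close($out) or die "close of '$scratch' failed: $!\n";

    # One copy of the whole file, instead of many updates in place.
    copy($scratch, $dest) or die "copy to '$dest' failed: $!\n";
    unlink $scratch;
    return $dest;
}
```

A call might look like write_via_scratch("results.txt", \@lines), where results.txt is in your AFS directory. All of the intermediate writes then go to the local disk, and the file travels to the server a single time.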

What is the Cost of AFS

Here are some examples of timings in creating various sized files in and out of AFS. This is not intended to be a definitive performance test, but just an indication of what you might observe.

Varying the count of small files

    File Count  File Size (bytes)  File System  /tmp Seconds  /tmp Average/File  AFS Seconds   AFS Average/File
    100         8192               /tmp         0             0.00               2             0.02
    1000        8192               /tmp         2             0.00               13            0.01
    10,000      8192               /tmp         21            0.00               154           0.02
    100,000     8192               /tmp         212           0.00               > 20 minutes  n/a

Varying the size of one file

    File Count  File Size (bytes)  File System  /tmp Seconds  /tmp Average/File  AFS Seconds   AFS Average/File
    5           102,400            /tmp         0             0.00               1             0.20
    5           512,000            /tmp         0             0.00               3             0.60
    5           1,024,000          /tmp         1             0.20               6             1.20
    5           2,048,000          /tmp         1             0.20               13            2.60
    5           3,072,000          /tmp         1             0.20               19            3.80
    5           4,096,000          /tmp         1             0.20               26            5.20
    5           5,120,000          /tmp         3             0.60               32            6.40
    5           10,240,000         /tmp         3             0.60               64            12.80
    5           20,480,000         /tmp         6             1.20               126           25.20
    5           40,960,000         /tmp         13            2.60               270           54.00

From the tables above, it's pretty clear that the largest impact comes when you make lots and lots of small files or very, very large files. Not too surprisingly, the extra overhead is directly proportional to the total number of bytes in the file(s). Each individual file has its own additional overhead, but until the number of files gets very large, you cannot easily detect it.
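If you want to take similar measurements on your own machine, something along the following lines would do. This is not the program that produced the tables above; it is a rough sketch under assumed names (time_file_creation, the bench_N.dat file names), and it leaves the files it creates in place for you to clean up.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Create $count files of $size bytes each in $dir; return elapsed seconds.
sub time_file_creation {
    my ($dir, $count, $size) = @_;
    my $data  = 'x' x $size;
    my $start = time();
    for my $i (1 .. $count) {
        my $path = "$dir/bench_$i.dat";    # hypothetical file naming
        open(my $out, '>', $path)
            or die "Unable to create '$path': $!\n";
        print {$out} $data
            or die "write to '$path' failed: $!\n";
        # In AFS, the file is copied to the server before close() completes.
        close($out) or die "close of '$path' failed: $!\n";
    }
    return time() - $start;
}

# Run as: perl bench.pl DIRECTORY COUNT SIZE
if (@ARGV == 3) {
    my ($dir, $count, $size) = @ARGV;
    my $elapsed = time_file_creation($dir, $count, $size);
    printf "%d files of %d bytes in %s: %.2f s (%.4f s/file)\n",
        $count, $size, $dir, $elapsed, $elapsed / $count;
}
```

Run it once against a directory under /tmp and once against a directory in AFS, with the same count and size, and compare the two totals.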

Conclusions

So what conclusions can we draw? When does the overhead for a file in AFS begin to get noticeable?

There are extremely important reasons for keeping your data in AFS, and this document is NOT trying to induce you to move your data out of AFS. There are, however, significant speed improvements to be had by selectively moving certain kinds of data out of AFS. To paraphrase Bill Duren, "It's all about the number of bytes". Consider the following when analyzing your application: