HowTo:tarmark

From Greg Porter's Wiki

Overview

Blah, blah, more here soon.

I'm old school, I guess. I did everything at the command line as root. To actually get root on a Nexenta box (not nmc or admin or some other (crippled) user), do:


nmc@filer6:/$ option expert_mode=1 -s 
nmc@filer6:/$ !bash                                                             
You are about to enter the Unix ("raw") shell and execute low-level Unix command(s). 
Warning: using low-level Unix commands is not recommended! Execute?  Yes

root@filer6:/volumes#

Getting Ready to Run tarmark

Prepare the Host

Basically you need a ZFS capable host. I commonly use either OpenSolaris or Nexenta. I prefer Nexenta. I have some detailed load notes on Nexenta.

So, for example, say you chose to use Nexenta.

Think about how many drives you have. You need at least two. One to boot from (syspool), and one for testing. Ideally you'd have more.

Get Nexenta media.

Burn Nexenta to CD.

Boot from Nexenta CD.

Load Nexenta.

On the console, do the text based registration and preliminary set up.

Use the GUI to do remaining set up as prompted.

Log in as root. Run the command 'setup appliance upgrade'. This will download the latest packages and install them as available. This may take a while. Reboot. Do this sequence again, and make sure that everything is the current shipping version.

Log in as root. Try the command 'setup appliance upgrade nms'. My understanding is that 'setup appliance upgrade' (without nms) should upgrade nms as well, but sometimes it doesn't appear to. If needed, this will download the latest packages and install them as available. This may take a while. Reboot. Do this sequence again, and make sure that everything is the current shipping version.

Configure the Host

Configure the test drive(s) into a testing zpool. You could use the GUI or the command line. I chose to call mine testpool.

Make a ZFS file system in the testing pool to store the source tarball(s) in. You need at least one tarball to unpack. You could use the GUI (Data Management/Shares) to make the ZFS file system or use the command line. I used the command line. I called mine tarballs. So far mine looks like:

NAME                     USED  AVAIL  REFER  MOUNTPOINT
syspool                 5.66G  67.2G    36K  legacy
syspool/dump            2.80G  67.2G  2.80G  -
syspool/rootfs-nmu-000  1.83G  67.2G  1.23G  legacy
syspool/rootfs-nmu-001  33.5K  67.2G  1.15G  legacy
syspool/rootfs-nmu-002  33.5K  67.2G  1.21G  legacy
syspool/rootfs-nmu-003  62.5K  67.2G  1.22G  legacy
syspool/swap            1.03G  68.2G    16K  -
testpool                 142K   147G    32K  /volumes/testpool
testpool/tarballs         31K   147G    31K  /volumes/testpool/tarballs
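For the command line route, the two steps above could look something like the sketch below. The device name c1t1d0 is a placeholder (an assumption, not taken from these notes), and the -m option is only there to mimic the /volumes mount point that the Nexenta tools normally set up for you. The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: create the testing pool and the tarball source file system.
# DISK is a placeholder device name; substitute one of your own test drives.
DISK=${DISK:-c1t1d0}
POOL=${POOL:-testpool}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

make_test_pool() {
    # -m sets the pool's mount point explicitly (assumed layout: /volumes/POOL)
    run zpool create -m "/volumes/$POOL" "$POOL" "$DISK"
    run zfs create "$POOL/tarballs"
    run zfs list -r "$POOL"
}
```

Calling make_test_pool with the defaults prints the three commands so you can check them first. Calling it with DRY_RUN=0 actually runs them.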

Put a tarball in the tarball source directory. I used an OpenSolaris source tarball, about 8MB in size. Any old tarball will work. You could use multiple ones if desired.

Some source tarballs you may consider are compressed as well, usually with bzip2. They typically have a file extension of tar.bz2. If you use tarballs like this, then each untar process will have to spend some CPU time dealing with the bz2 compression. If you do tarmark against many, many bz2 tarballs, then you will see the system get CPU bound, and lots of bzip2 action in prstat. I suggest uncompressing the tarball first with a command like ' bunzip2 /volumes/testpool/tarballs/*bz2 '. This will leave an uncompressed tarball for tarmark to deal with.

Getting Ready to run tarmark against a remote host via NFS

TBP

The basic concept is:

Follow the instructions above. Make a testpool and tarball source directory on the filer.

On the filer, do zfs commands to allow NFS clients to mount and write to /volumes/testpool (and all subdirectories).

Do tarmark --prep on the filer, to make the testing ZFS file systems.

On the client, mount the filer's exported /volumes/testpool filesystem. Maybe you could mount it as /volumes/testpool on the client, just to keep things simple.

Then do tarmark --run on the client.

When you are done running untars, then do tarmark --cleanup on the filer.
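Until this section is filled in properly, here is a rough sketch of the sharing and mounting steps as commands. It is untested against a real filer. The client host name client1 is a made-up placeholder, the root= option is an assumption about what is needed for root on the client to write, and the mount syntax shown is the Solaris form (Linux clients would use mount -t nfs). The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: NFS share and mount steps. FILER and CLIENT are placeholder
# host names; adjust them for your own network.
FILER=${FILER:-filer6}
CLIENT=${CLIENT:-client1}
POOL=${POOL:-testpool}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the filer: share the pool read-write and let root on the client
# keep root access, so untars run as root on the client can write.
filer_share() {
    run zfs set "sharenfs=rw,root=$CLIENT" "$POOL"
}

# Run on the client: mount the filer's export at the same path.
client_mount() {
    run mkdir -p "/volumes/$POOL"
    run mount -F nfs "$FILER:/volumes/$POOL" "/volumes/$POOL"
}
```

One thing to watch for: each test ZFS file system is its own file system and its own share, so depending on the NFS version the client may need to mount /volumes/testpool/0, /volumes/testpool/1, and so forth separately before tarmark --run can see them.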

Getting Ready to run tarmark against a remote host via iSCSI

TBP

Getting tarmark

Hmm. I need some cool sourceforge site or something. For now, I guess email me at mailto:greg@greg.porter.name and I'll email you the script.

Get tarmark. Put it somewhere the user root can get to. I made the directory /root/scripts/tarmark, and I ran mine from there.

Configure tarmark. I'm not a scripting guru, so tarmark is relatively unsophisticated. A better script would actually have some error checking, and interactively prompt you for configuration information. For now, you will have to insert your configuration details into the script yourself with an editor. There are three basic settings you must provide:

  • MountPointCount

This specifies how many testing ZFS file systems tarmark will make. For simple runs, change this to a smaller number, like 10. For more realistic runs, make this larger, say 100. tarmark will hang for however long it takes to make the testing ZFS file systems, so don't make this 10000 unless you don't mind waiting a while. Mine looks like:

MountPointCount=12;

  • Zpool

This specifies the name of the testing zpool, the parent zpool that tarmark will create its test ZFS file systems in. I used testpool, which Nexenta automatically mounts for me under /volumes, so the full path is /volumes/testpool.

Zpool="testpool";

  • TestSource

This specifies the name of the ZFS file system that tarmark copies the tarballs from. This needs to be in the parent zpool, and needs to have a tarball in it. I used tarballs, which Nexenta automatically mounts for me under /volumes/testpool, so the full path is /volumes/testpool/tarballs.

TestSource="tarballs";

Think about overall size. The testing zpool will need enough space to hold $MountPointCount times ( <size of tarball(s)> + <size of unpacked tarballs> ). So a $MountPointCount of 10 with an 8MB tarball that unpacks to roughly another 8MB will need something like 10 X (8MB + 8MB), or about 160MB of space. A better script would do some error checking and not let you specify parameters that the zpool couldn't actually handle.
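The size formula above is easy to check with a couple of shell functions. This is a sketch of the kind of sanity check a better script might do, not something tarmark does today. The function names are made up for illustration and all sizes are whole megabytes.

```shell
#!/bin/sh
# required_mb COUNT TARBALL_MB UNPACKED_MB
# Prints the space, in MB, needed for COUNT test file systems that each
# hold one copy of the tarball plus its unpacked contents.
required_mb() {
    echo $(( $1 * ($2 + $3) ))
}

# fits_in_pool COUNT TARBALL_MB UNPACKED_MB AVAIL_MB
# Succeeds (exit status 0) when the requirement fits in AVAIL_MB.
fits_in_pool() {
    [ "$(required_mb "$1" "$2" "$3")" -le "$4" ]
}
```

For example, required_mb 10 8 8 prints 160. The available space to compare against is the AVAIL column of zfs list for the testing zpool.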

Using tarmark in preparation mode

More soon.

Basically, once the host is ready as specified above, then run

./tarmark --prep

This will make however many test ZFS file systems that you specified, and copy the tarball you specified to each. If you do this against thousands of test directories, it may take a while.
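To give an idea of what the prep phase amounts to, here is a minimal sketch of such a loop. It is not the actual tarmark script, just an illustration built from the three settings described earlier, and it assumes the tarballs end in .tar. The commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only, not the real tarmark script: what a prep phase could do.
MountPointCount=${MountPointCount:-10}
Zpool=${Zpool:-testpool}
TestSource=${TestSource:-tarballs}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prep() {
    echo "Preparing $MountPointCount test ZFS file systems."
    i=0
    while [ "$i" -lt "$MountPointCount" ]; do
        echo "Making test file system /volumes/$Zpool/$i"
        run zfs create "$Zpool/$i"
        echo "Copying tarball(s) to   /volumes/$Zpool/$i"
        # Assumes the source tarballs are uncompressed and named *.tar
        run cp "/volumes/$Zpool/$TestSource"/*.tar "/volumes/$Zpool/$i/"
        i=$((i + 1))
    done
    echo "Prep complete."
}
```

Each test file system gets its own zfs create followed by a copy, which is why prep time grows with MountPointCount.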

Using tarmark in run mode

More soon.

Once you have run tarmark --prep, then you should be ready. You have lots of test file systems with one (or more?) tarball(s) in them.

When ready for a run, then do

./tarmark --run

This will spawn however many untars you configured. These go into the background, so you will see many separate tar processes if you look at ps or prstat.

Each untar is timed with ptime. Output from each untar is redirected into the directory it is working against and goes into tarmark.log. So stderr and stdout for the untar working on /volumes/testpool/0 will go into /volumes/testpool/0/tarmark.log, the untar working against /volumes/testpool/1 will go into /volumes/testpool/1/tarmark.log and so forth.

You can run tarmark --run multiple times. This will just make more and more untar processes. So if you run it 5 times in quick succession, then you might have as many as 5 untars working against each tarball. Since they are all appending to the same log file, then you should see ptime dump timing statistics from all 5 untars into the same file.
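The spawning and logging described above can be sketched roughly as follows. Again, this is an illustration and not the actual tarmark script. The function names are made up, the tarballs are assumed to end in .tar, and on hosts without ptime the sketch falls back to an untimed untar bracketed by date stamps.

```shell
#!/bin/sh
# Sketch only, not the real tarmark script: spawn one background untar per
# test directory and log its output next to the tarball.

# Untar every *.tar in directory $1, appending stdout and stderr to
# $1/tarmark.log. ptime is used when present (Solaris and relatives);
# elsewhere the untar runs untimed, followed by a second date stamp.
untar_one() {
    (
        cd "$1" || exit 1
        date
        for tarball in *.tar; do
            [ -e "$tarball" ] || continue    # no tarballs: glob stayed literal
            if command -v ptime >/dev/null 2>&1; then
                ptime tar xf "$tarball"
            else
                tar xf "$tarball"
                date
            fi
        done
    ) >> "$1/tarmark.log" 2>&1
}

# spawn_untars PARENT COUNT
# Start COUNT background untars, one for each of PARENT/0 .. PARENT/(COUNT-1),
# and return without waiting for any of them to finish.
spawn_untars() {
    echo "There will be $2 background processes spawned."
    n=0
    while [ "$n" -lt "$2" ]; do
        echo "Spawning bg proc $n"
        untar_one "$1/$n" &
        n=$((n + 1))
    done
}
```

Something like spawn_untars /volumes/testpool 10 would kick off ten untars and return right away. Because each log is opened for append, calling it again before the first batch finishes just adds more entries to the same tarmark.log files.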

Using tarmark in cleanup mode

More soon.

Once you have run tarmark and are tired of playing with it for now, then you can use the cleanup option to get rid of all the testing ZFS file systems.

Run

./tarmark --cleanup

This will destroy all of the testing ZFS file systems, like /volumes/testpool/0, /volumes/testpool/1, and so forth.

It leaves the parent zpool (say /volumes/testpool) and the tarball source (say /volumes/testpool/tarballs) for you to destroy by hand, if desired.
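A cleanup phase along these lines could be as simple as the sketch below. It is an illustration only, not the actual tarmark script, and it only prints the commands unless DRY_RUN=0 is set, since zfs destroy is not something to run by accident.

```shell
#!/bin/sh
# Sketch only, not the real tarmark script: what a cleanup phase could do.
MountPointCount=${MountPointCount:-10}
Zpool=${Zpool:-testpool}

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cleanup() {
    echo "Cleaning up test ZFS file systems."
    i=0
    while [ "$i" -lt "$MountPointCount" ]; do
        echo "Destroying test file system /volumes/$Zpool/$i"
        run zfs destroy "$Zpool/$i"
        i=$((i + 1))
    done
    echo "Clean up complete."
}

# Removing the rest by hand, if desired, would be along the lines of:
#   zfs destroy testpool/tarballs
#   zpool destroy testpool
```

Only the numbered test file systems are touched. The tarball source and the pool itself stay until you remove them yourself.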

Example local run

More soon.

Configure tarmark. Usually all you have to do is change MountPointCount to the desired number of test file systems. I'll use 10 in this example.

root@filer6:~/scripts/tarmark# vi tarmark
...
MountPointCount=10;

Before running prep, if you are following the notes above, you'll have a parent zpool and a tarball source ZFS file system. Mine looks like:

root@filer6:~/scripts/tarmark# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
syspool                 5.68G  67.2G    36K  legacy
syspool/dump            2.80G  67.2G  2.80G  -
syspool/rootfs-nmu-000  1.85G  67.2G  1.24G  legacy
syspool/rootfs-nmu-001  33.5K  67.2G  1.15G  legacy
syspool/rootfs-nmu-002  33.5K  67.2G  1.21G  legacy
syspool/rootfs-nmu-003  62.5K  67.2G  1.22G  legacy
syspool/swap            1.03G  68.2G    82K  -
testpool                50.4M   147G    36K  /volumes/testpool
testpool/tarballs       31.1M   147G  31.1M  /volumes/testpool/tarballs

root@filer6:~/scripts/tarmark# ls -al /volumes/testpool/tarballs/
total 23836
drwxr-xr-x 2 root root        3 Jul  5 14:00 .
drwxr-xr-x 3 root root        3 Jul  5 13:36 ..
-rw-r--r-- 1 root root 24300032 Jul  4 22:48 on-closed-bins-nd.i386.tar

Run prep. This looks like:

root@filer6:~/scripts/tarmark# ./tarmark --prep

You have specified /volumes/testpool as the parent zpool for testing.
You have specified /volumes/testpool/tarballs as the tarball source.

Preparing 10 test ZFS file systems.
Making test file system /volumes/testpool/0
Copying tarball(s) to   /volumes/testpool/0
Making test file system /volumes/testpool/1
Copying tarball(s) to   /volumes/testpool/1
Making test file system /volumes/testpool/2
Copying tarball(s) to   /volumes/testpool/2
Making test file system /volumes/testpool/3
Copying tarball(s) to   /volumes/testpool/3
Making test file system /volumes/testpool/4
Copying tarball(s) to   /volumes/testpool/4
Making test file system /volumes/testpool/5
Copying tarball(s) to   /volumes/testpool/5
Making test file system /volumes/testpool/6
Copying tarball(s) to   /volumes/testpool/6
Making test file system /volumes/testpool/7
Copying tarball(s) to   /volumes/testpool/7
Making test file system /volumes/testpool/8
Copying tarball(s) to   /volumes/testpool/8
Making test file system /volumes/testpool/9
Copying tarball(s) to   /volumes/testpool/9
Prep complete.

After prep is run, you should see the file systems prep just built. They should have tarball(s) in them.

root@filer6:~/scripts/tarmark# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
syspool                 5.68G  67.2G    36K  legacy
syspool/dump            2.80G  67.2G  2.80G  -
syspool/rootfs-nmu-000  1.85G  67.2G  1.24G  legacy
syspool/rootfs-nmu-001  33.5K  67.2G  1.15G  legacy
syspool/rootfs-nmu-002  33.5K  67.2G  1.21G  legacy
syspool/rootfs-nmu-003  62.5K  67.2G  1.22G  legacy
syspool/swap            1.03G  68.2G    82K  -
testpool                 252M   146G    48K  /volumes/testpool
testpool/0              23.3M   146G  23.3M  /volumes/testpool/0
testpool/1              23.3M   146G  23.3M  /volumes/testpool/1
testpool/2              23.3M   146G  23.3M  /volumes/testpool/2
testpool/3              23.3M   146G  23.3M  /volumes/testpool/3
testpool/4              23.3M   146G  23.3M  /volumes/testpool/4
testpool/5              23.3M   146G  23.3M  /volumes/testpool/5
testpool/6              23.3M   146G  23.3M  /volumes/testpool/6
testpool/7              23.3M   146G  23.3M  /volumes/testpool/7
testpool/8              23.3M   146G  23.3M  /volumes/testpool/8
testpool/9                31K   146G    31K  /volumes/testpool/9
testpool/tarballs       23.3M   146G  23.3M  /volumes/testpool/tarballs

root@filer6:~/scripts/tarmark# ls -al /volumes/testpool/0        
total 23836
drwxr-xr-x  2 root root        3 Jul  5 14:03 .
drwxr-xr-x 13 root root       13 Jul  5 14:03 ..
-rw-r--r--  1 root root 24300032 Jul  5 14:03 on-closed-bins-nd.i386.tar

Run tarmark. This will spawn however many untars you configured. The run looks like:

root@filer6:~/scripts/tarmark# ./tarmark --run

You have specified /volumes/testpool as the parent zpool for testing.
You have specified /volumes/testpool/tarballs as the tarball source.

Spawning background processes to untar all tarball(s) 
in each test ZFS mount point.

There will be 10 background processes spawned.
Spawning bg proc 0
Spawning bg proc 1
Spawning bg proc 2
Spawning bg proc 3
Spawning bg proc 4
Spawning bg proc 5
Spawning bg proc 6
Spawning bg proc 7
Spawning bg proc 8
Spawning bg proc 9

If you do a ps immediately after the run, while untars are still working you'll see:

root@filer6:~/scripts/tarmark# ps
  PID TTY         TIME CMD
 2453 pts/1       0:00 bash
 2461 pts/1       0:02 nmc
 2514 pts/1       0:01 bash
 1391 pts/1       0:00 ptime
 1392 pts/1       0:00 tar
 1399 pts/1       0:00 tar
 1401 pts/1       0:00 ps
 1398 pts/1       0:00 ptime
 1400 pts/1       0:00 tar
 1395 pts/1       0:00 ptime

Each untar dumps a log file. If you look at one, you'll see the output from ptime in it.

root@filer6:~/scripts/tarmark# ls -al /volumes/testpool/0
total 23838
drwxr-xr-x  3 root root        5 Jul  5 14:04 .
drwxr-xr-x 13 root root       13 Jul  5 14:03 ..
drwxr-xr-x  3 root 8190        6 Jun 26 02:41 closed
-rw-r--r--  1 root root 24300032 Jul  5 14:03 on-closed-bins-nd.i386.tar
-rw-r--r--  1 root root      204 Jul  5 14:04 tarmark.log

root@filer6:~/scripts/tarmark# cat  /volumes/testpool/0/tarmark.log 
Mon Jul  5 14:04:39 PDT 2010

real        1.843788676
user        0.007886822
sys         0.148401620

You can immediately run it again, or immediately run it multiple times (no need to clean up after every one...) Subsequent run results are appended to the end of tarmark.log, so you'll see multiple ptime results.
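With many test file systems and several runs, reading each tarmark.log by hand gets tedious. The sketch below pulls the real lines out of every log under the parent directory and prints a count, average and maximum. It is not part of tarmark, the function name is made up, and it assumes ptime reports real as plain seconds like in the log shown above.

```shell
#!/bin/sh
# summarize_real PARENT
# Collect the "real" lines from PARENT/*/tarmark.log and print how many
# untars were logged, plus their average and maximum elapsed time.
# Assumes "real" values are plain seconds, as in the sample log.
summarize_real() {
    cat "$1"/*/tarmark.log 2>/dev/null |
    awk '$1 == "real" { n++; sum += $2; if ($2 > max) max = $2 }
         END {
             if (n) printf "runs=%d avg=%.3f max=%.3f\n", n, sum / n, max
             else print "runs=0"
         }'
}
```

Run it as summarize_real /volumes/testpool once the untars have finished.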

When you are finished, to get rid of the test file systems, do:

root@filer6:~/scripts/tarmark# ./tarmark --cleanup

You have specified /volumes/testpool as the parent zpool for testing.
You have specified /volumes/testpool/tarballs as the tarball source.

Cleaning up test ZFS file systems.
Destroying test file system /volumes/testpool/0
Destroying test file system /volumes/testpool/1
Destroying test file system /volumes/testpool/2
Destroying test file system /volumes/testpool/3
Destroying test file system /volumes/testpool/4
Destroying test file system /volumes/testpool/5
Destroying test file system /volumes/testpool/6
Destroying test file system /volumes/testpool/7
Destroying test file system /volumes/testpool/8
Destroying test file system /volumes/testpool/9
Clean up complete.
If desired, you must manually remove the parent zpool and tarball source.

Now, you should see just the 2 file systems you started with.

root@filer6:~/scripts/tarmark# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
syspool                 5.68G  67.2G    36K  legacy
syspool/dump            2.80G  67.2G  2.80G  -
syspool/rootfs-nmu-000  1.85G  67.2G  1.24G  legacy
syspool/rootfs-nmu-001  33.5K  67.2G  1.15G  legacy
syspool/rootfs-nmu-002  33.5K  67.2G  1.21G  legacy
syspool/rootfs-nmu-003  62.5K  67.2G  1.22G  legacy
syspool/swap            1.03G  68.2G    82K  -
testpool                42.3M   147G    39K  /volumes/testpool
testpool/tarballs       23.3M   147G  23.3M  /volumes/testpool/tarballs

Interesting Results To Date

Don't use a tarball that is also compressed with bzip2. These have a .tar.bz2 file extension. If you try a tarmark run against 10 or more like this, the host will immediately become CPU bound as it deals with the compression.

I suggest you unzip it first before the prep phase by hand like so:

bunzip2 /volumes/testpool/tarballs/tarball.tar.bz2

Maybe you should try bigger tarballs. I used an 8MB OpenSolaris tarball from http://dlc.sun.com/osol/on/downloads/20100704/. (I bunzip2-ed it first.) Running it against 10 file systems (I set MountPointCount to 10) is over in the blink of an eye. Running it against 100 only takes a few seconds, like 10 seconds. Running it against 1000 takes a while, like an hour or two.

For some strange reason, I started out this whole process thinking that untar was inherently a writing process. When you do this against 10 or 100, it does appear that way: the filer gets busy with lots of writes. But on larger runs, like 1000, it isn't busy writing, it's busy reading. I guess that makes sense, since it has to read the tarball before it can do anything with it.

I don't understand how the background processes are spawned. I think it is memory bound somehow. If I kick off 1000, then something like 750 are immediately made. Then subsequent ones take like minutes for each one to spawn. Strange.
