
Posts Tagged ‘Fractional Reserve’

NetApp commands for Volume / LUN management

February 19, 2012


vol options <volname> fractional_reserve 0

This command sets the fractional reserve to zero percent, down from the default of 100 percent. Note that fractional reserve only applies to LUNs, not to NAS storage presented via CIFS or NFS.

snap autodelete <volname> trigger snap_reserve

This sets the trigger at which Data ONTAP will begin deleting Snapshots. In this case, Snapshots will start getting deleted when the snap reserve for the volume gets nearly full. The current size of the snap reserve can be viewed for a particular volume with the “snap reserve” command.
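
For reference, here is a rough sketch of checking and then setting the snap reserve on a hypothetical volume called vol1 (the volume name and the 20 percent value are just placeholders):

snap reserve vol1

snap reserve vol1 20

The first form should display the current snapshot reserve for vol1; the second would set it to 20 percent.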

snap autodelete <volname> defer_delete none

This command instructs Data ONTAP not to exhibit any preference in the order in which Snapshots are deleted. Other options include “user_created” (delete user-created Snapshot copies last) or “prefix” (delete Snapshot copies with a specified prefix string last).

snap autodelete <volname> target_free_space 10

With this setting in place, Snapshots will be deleted until there is 10% free space in the volume.

snap autodelete <volname> on

Now that the Snapshot autodelete options have been configured, this command will actually turn the functionality on.

vol options <volname> try_first snap_delete

When a FlexVol runs into an issue with space, this option tells Data ONTAP to first try to delete Snapshots in order to free up space. This command works in conjunction with the next command:

vol autosize <volname> on

This enables Data ONTAP to automatically grow the size of a FlexVol if the need arises. This command works hand-in-hand with the previous command; Data ONTAP will first try to delete Snapshots to free up space, then grow the FlexVol according to the autosize configuration options. Between these two options—Snapshot autodelete and volume autogrow—you can reduce the fractional reserve from the default of 100 and still make sure that you don’t run into problems taking Snapshots of your LUNs.
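
Pulling the whole sequence together, here is a sketch of these commands applied to a hypothetical volume called vol1 (the name and the 10% target are just placeholders; tune the values to your own environment):

vol options vol1 fractional_reserve 0

snap autodelete vol1 trigger snap_reserve

snap autodelete vol1 defer_delete none

snap autodelete vol1 target_free_space 10

snap autodelete vol1 on

vol options vol1 try_first snap_delete

vol autosize vol1 on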

Fractional Reservation – LUN Overwrite (Continued)

May 2, 2011

I seem to get questioned about Fractional Reservation at least once a week, and find myself explaining it over and over. I have found quite a simple way of explaining it now, and I have a much better understanding of what it actually is; unfortunately, much of the documentation doesn’t make it so simple to understand. It also makes more sense now that NetApp has changed its description in some places: in Operations Manager 3.7 and above it is referred to as “Overwrite Reserved Space”.
This is easiest to explain with pictures. We should have all seen a standard snapshot graphic. When we snapshot the filesystem, we take a copy of the file allocation tables and this locks the data blocks in place. Any new or changed data is written to a new location, and the changed blocks are preserved on disk.

[Image: snap00285.bmp]
So basically a snapshot locks in place the data blocks it references. This means that any new or changed blocks (D1 in the above graphic) in the active file system are written to a different location. Fractional Reservation is built on fundamentally the same concept.

As the LUN fills up with data, we take a snapshot and that data is locked in place. Potentially all of this data could change, so we need to guarantee space not only for overwrites of the existing data but also for entirely new data blocks. Any changed data gets written into the Fractional Reservation area rather than into the area holding the existing LUN data. (I know that in reality this is spread across all the disks and these areas don’t physically exist, but it makes it easier to visualise and explain this way.) As changed data blocks are written, the old data blocks are preserved in the snapshot reservation area. Fractional Reservation is reserving space for the maximum rate of change we could potentially see.

[Image: snap00286.bmp]

Don’t confuse this with the snapshot reservation area. The snapshot reservation holds saved data blocks from previous snapshots, whereas the Fractional Reservation protects your Active File System (AFS in the above graphic) from its own potential rate of change.
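
As a rough worked example (the numbers are hypothetical): with a 100GB space-reserved LUN and fractional_reserve left at the default of 100%, the filer sets aside another 100GB of overwrite space as soon as a snapshot exists, so the volume needs roughly 200GB plus whatever you keep for snapshot retention. Dropping fractional_reserve to 50% would shrink the overwrite area to about 50GB, but it then only covers the case where up to half of the LUN’s blocks change while snapshots are held.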

So the reason a LUN may be taken offline when the fractional reservation is set to 0 is that the filer needs to protect the existing data locked between the active file system and the most recent snapshot, plus any additional changes that happen to the active file system. If the volume, LUN, fractional reservation and snap reserve are all full, then there is no space available and the filer has to take action to prevent writes from failing. The filer guarantees no data loss, so with no free space and nowhere to write the new data, it has to offline the LUN.

So the fractional reservation is in constant use by the filer as an overwrite area for the LUN. Without it, you need to make sure that sufficient free space is available to allow for the maximum rate of change you expect. The defaults are safe, but if you trim them down you need to monitor the rate of change and make sure the worst-case scenario fits within the buffer of free space you allow. If you reduce the Fractional Reservation to 0, you need to make sure the rate of change fits within the volume size, or that the volume can auto-grow when required, or even use snap auto-delete to remove reserved blocks and free up space (although I am not a huge fan of snap auto-delete for various reasons).

And that is Fractional Reservation!

Quick last thoughts… A-SIS won’t make any difference to the Fractional Reservation area as such. It can help in that the data blocks within the LUN will get de-duped, but the Fractional Reservation area per se is still required because you need this LUN overwrite area for changing data. If you reduce the footprint of the non-changing data with A-SIS, you reduce the potential reservation area required. Space savings aren’t apparent when you have things thick provisioned. Reducing Fractional Reservation combined with thin provisioning can be a dangerous game.

This changed when SnapManager (SnapDrive) was released. If you use SnapManager (SnapDrive) to create volumes, it will automatically apply the best practices to that volume when it is created; for SAN the best practice is a 0% snap reserve on the volume. When using CIFS or NFS the snapshot reserve is needed because the filer manages the share and presents it out to the users. For example, if you have a 100GB CIFS share with a 10% snapshot reserve, then the space actually available for users to write to is 90GB (as 10% is reserved).
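
To make the distinction concrete, a hedged example using the snap reserve command shown elsewhere in these posts (the volume names nasvol and sanvol are just placeholders):

snap reserve nasvol 10

snap reserve sanvol 0

On a 100GB nasvol, the 10% reserve leaves roughly 90GB writable by users; sanvol keeps no reserve and instead relies on the volume being sized around the LUN and its snapshots.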

However, this is different when creating a volume which has a LUN inside it. You create the volume and then place the LUN inside it; the LUN is now managed via iSCSI (in this case) by a server, with the filer only taking care of presenting the LUN. In this case you must size your LUN so as to leave enough space in the volume for snapshots; you reserve that space through the volume sizing. If you were to also set a snapshot reserve on this volume, you would effectively be reserving space twice. Think of it this way:

One thing to note, by the way: when you create a LUN using SnapDrive, it will ask you how large you want the LUN to be and how many snapshots you wish to keep, and it will then create the volume with the LUN inside for you. Looking at this volume on the filer, you will see that the volume size is equal to the LUN size plus the space for however many snapshots you wish to retain, but it has a 0% snap reserve.

The most important rule is to monitor and understand your data. If you understand your rate of change, you can tweak a lot of areas of the storage system.

NetApp Fractional Reserve

April 25, 2011

Fractional Space Reservation
I see a lot of posts (good ones) about FSR (fractional space reservation) for LUN-based volumes, and while they do a great job of explaining the concepts, it might be nice to see a worked example. That way IT staff can test this capability in their own environments and see how it behaves.

The first thing we need to do is create a 200MB volume to hold a LUN. Let’s call it voltest and set it up for LUN use:

fas> vol create voltest aggr1 200m

The new language mappings will be available after reboot
Creation of volume ‘voltest’ with size 200m on containing aggregate ‘aggr1’ has completed.
fas> snap reserve voltest 0
fas> snap sched voltest 0
Now that we’ve configured a new volume, let’s set the FSR to 0%. This ensures that when snapshots are taken in a LUN-based volume, no additional space is set aside. The FSR default is 100% — meaning, if you try to take a snapshot, ONTAP will attempt to set aside 100% of the used space inside the LUNs in the volume to ensure you have space if all the blocks change. This is a great feature, but definitely takes up a lot more space in your volumes:

fas> vol options voltest fractional_reserve 0
Okay, so now when we take a snapshot, no additional space will be reserved in the volume. But we aren’t done yet.

There are two additional options we should consider using with this volume, both of which can be immensely useful for letting ONTAP dynamically handle situations where snapshots use more space than anticipated. Sometimes users change more data and our snapshots require more space. Sometimes we don’t delete manually created snapshots. And sometimes we just grow faster than expected.

The first option to configure is vol autosize. This option lets us automatically increase the size of a volume if we start to use more snapshot space:

fas> vol autosize voltest -i 20m -m 300m
vol autosize: Flexible volume ‘voltest’ autosize settings UPDATED.
This command tells ONTAP that it may increase the size of the volume 20MB at a time, but only up to 300MB. If for some reason we need more space than expected, ONTAP can grow the volume as needed.

The second option to configure is snap autodelete. This feature tells ONTAP that it can start to delete snapshots if it finds that it needs even more space in the volume:

fas> snap autodelete voltest on
snap autodelete: snap autodelete enabled
fas> snap autodelete voltest
snapshot autodelete settings for voltest:
state : on
commitment : try
trigger : volume
target_free_space : 20%
delete_order : oldest_first
defer_delete : user_created
prefix : (not specified)
destroy_list : none
So when we print out the default parameters for snap autodelete, we see that there are a lot of tunable parameters to choose from. For now, let’s leave these at the default settings, although we can always change them later if we want to test other behaviors. For example, if we change target_free_space from 20% to 10%, fewer snapshots will be deleted when autodelete is triggered, because deletion stops once 10% of the volume is free instead of 20%. The key here is that ONTAP is saying: if space gets really tight, I’ll start deleting snapshots for you automatically to make sure the LUN stays online. Good stuff.
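
For example, lowering the target would look something like this (the same syntax pattern as the other snap autodelete settings; 10% is just an illustrative value):

fas> snap autodelete voltest target_free_space 10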

Finally, let’s check the order in which these two features are used. Both are great, but we want to make sure the volume is grown before we start deleting snapshots. That way we use space up to a certain limit and, only once we hit the volume growth maximum, start deleting snapshots:

fas> vol options voltest
nosnap=off, nosnapdir=off, minra=off, no_atime_update=off, nvfail=off,
ignore_inconsistent=off, snapmirrored=off, create_ucode=on,
convert_ucode=on, maxdirsize=31457, schedsnapname=ordinal,
fs_size_fixed=off, compression=off, guarantee=volume, svo_enable=off,
svo_checksum=off, svo_allow_rman=off, svo_reject_errors=off,
no_i2p=off, fractional_reserve=0, extent=off, try_first=volume_grow,
read_realloc=off, snapshot_clone_dependency=off
You can see the try_first parameter is set to volume_grow (the default). This ensures we try to use the volume growth feature first before autodeleting snapshots.
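
For completeness, if you wanted the opposite order (delete snapshots before growing the volume), the option could be flipped with the same vol options syntax used earlier; snap_delete is the other documented value:

fas> vol options voltest try_first snap_delete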

Now that we’ve created a volume, let’s create a LUN. I’m going to manually create a LUN, but it’s just as easy to do this with SnapDrive (for UNIX or Windows) if you want. I’ll create the LUN in ONTAP and manually map it to a Windows server (you’ll have to manually rescan the disks, create a partition and set a drive letter in Disk Management through the MMC, or you can just use the Create Disk feature with SnapDrive for Windows):

fas> lun create -s 100m -t windows /vol/voltest/luntest.lun
lun create: created a LUN of size: 102.0m (106928640)
fas> igroup create -i viaRPC.iqn.1991-05.com.microsoft:w2k3srvr.microsoft.com iqn.1991-05.com.microsoft:w2k3srvr.microsoft.com
fas> lun map /vol/voltest/luntest.lun viaRPC.iqn.1991-05.com.microsoft:w2k3srvr.microsoft.com

If you have SnapDrive, I highly recommend using it to create your LUNs instead of doing it manually, but I also wanted to show that you can do this without SnapDrive if you really want to.

Now that we’ve created our volume, set up snapshot autodeletion and automatic volume growth, and created and mapped a LUN, let’s actually test how this functionality works.

First, we’ll manually create a snapshot for the voltest volume and review how much disk space is taken up:

fas> snap create voltest testsnap.1
fas> df -hr voltest
Filesystem total used avail reserved Mounted on
/vol/voltest/ 200MB 102MB 97MB 0MB /vol/voltest/
/vol/voltest/.snapshot 0GB 0GB 0GB 0GB /vol/voltest/.snapshot
fas> snap list voltest
Volume voltest
working…

%/used %/total date name
———- ———- ———— ——–
0% ( 0%) 0% ( 0%) Mar 23 10:16 testsnap.1

Okay, great. Snapshot is taken, no space is used (snapshots don’t really take up space until there are changes in the volume), and there is no reserved space.

The next thing we do is get a copy of dd for Windows to create large random files quickly and painlessly. This is a very handy tool for the purposes of testing the behavior of ONTAP — of course, you can use anything you’d like, even your own files. In this example, we’ve set the drive letter for the new LUN as D:, so all of our new files will be written to that drive. Also note our input device is /dev/random so we’re writing lots of random data to the files:

C:\Temp> dd.exe of=d:80mbfile.txt bs=1M count=80 if=/dev/random
rawwrite dd for windows version 0.5.

80+0 records in
80+0 records out
C:\Temp> dir d:
Volume in drive D is New Volume
Volume Serial Number is 5261-7E0F
Directory of D:\
03/23/2009 11:23 AM 83,886,080 80mbfile.txt
1 File(s) 83,886,080 bytes
0 Dir(s) 19,428,864 bytes free
Okay, so we’ve written an 80MB file to a 100MB LUN. What happened to the volume? Let’s take a look:

fas> snap list voltest
Volume voltest
working…
%/used %/total date name
———- ———- ———— ——–
44% (44%) 37% (37%) Mar 23 10:16 testsnap.1
fas> vol size voltest
vol size: Flexible volume ‘voltest’ has size 220m.
fas> df -hr voltest
Filesystem total used avail reserved Mounted on
/vol/voltest/ 220MB 182MB 37MB 0MB /vol/voltest/
/vol/voltest/.snapshot 0MB 80MB 0MB 0MB /vol/voltest/.snapshot
So it looks like our volume grew by 20MB, and we’re holding a whole lot of space inside that snapshot (80MB, specifically). Remember, we took the first snapshot, testsnap.1, with nothing in the volume or in the LUN, and now that snapshot is holding 80MB of blocks overwritten by the new file we created! So far, everything looks great. There was even a system message in ONTAP to tell us the volume grew:

Mon Mar 23 10:23:46 EST [wafl.vol.autoSize.done:info]: Automatic increase size of volume ‘voltest’ by 20480 kbytes done.

Now that we’ve seen the volume automatically grow, let’s make another 30MB file to grow it some more (to capacity):

C:\Temp> dd.exe of=d:30mbfile.txt bs=1M count=30 if=/dev/random
rawwrite dd for windows version 0.5.

Error writing file: 112 There is not enough space on the disk
19+0 records in
18+0 records out
C:\Temp> dir d:
Volume in drive D is New Volume
Volume Serial Number is 5261-7E0F
Directory of D:\
03/23/2009 01:40 PM 18,874,368 30mbfile.txt
03/23/2009 11:23 AM 83,886,080 80mbfile.txt
2 File(s) 102,760,448 bytes
0 Dir(s) 554,496 bytes free
As you can see, we’ve filled up the LUN at this point. Let’s see how ONTAP reacts:

Mon Mar 23 12:41:01 EST [wafl.vol.autoSize.done:info]: Automatic increase size of volume ‘voltest’ by 20480 kbytes done.

The volume has grown again by a bit more. Now let’s get a little aggressive. We’ll take another snapshot, delete all the files on the LUN, and then fill it up again. When we do this, we’ll see a number of things happen:

1. The snapshot growth will be more than the space in the volume, so the size of the volume will have to grow;
2. The volume will not be able to grow beyond its maximum (300MB), and so snapshots will start having to be deleted;
3. The number of snapshots to be deleted will depend on the target free space (20%), so we may end up losing more than one snapshot.

So let’s make a new snapshot in ONTAP for the volume voltest:

fas> snap create voltest testsnap.2
fas> snap list voltest
Volume voltest
working…
%/used %/total date name
———- ———- ———— ——–
0% ( 0%) 0% ( 0%) Mar 23 12:44 testsnap.2
49% (49%) 41% (41%) Mar 23 10:16 testsnap.1
Then we’ll delete the files on the Windows server, and make another really big file:

C:\Temp> del d:\*mbfile.txt
C:\Temp> dd.exe of=d:95mbfile.txt bs=1M count=95 if=/dev/random
rawwrite dd for windows version 0.5.

This program is covered by the GPL. See copying.txt for details
95+0 records in
95+0 records out
C:\Temp> dir d:
Volume in drive D is New Volume
Volume Serial Number is 5261-7E0F
Directory of D:\
03/23/2009 01:46 PM 99,614,720 95mbfile.txt
1 File(s) 99,614,720 bytes
0 Dir(s) 3,700,224 bytes free
Here are the logs that appear in ONTAP on the storage controller when we start making the final large file:

Mon Mar 23 12:46:19 EST [wafl.vol.full:notice]: file system on volume voltest is full
Mon Mar 23 12:46:22 EST [wafl.vol.autoSize.done:info]: Automatic increase size of volume ‘voltest’ by 20480 kbytes done.
Mon Mar 23 12:46:26 EST [wafl.vol.autoSize.done:info]: Automatic increase size of volume ‘voltest’ by 20480 kbytes done.
Mon Mar 23 12:46:32 EST [wafl.vol.autoSize.done:info]: Automatic increase size of volume ‘voltest’ by 20480 kbytes done.
Mon Mar 23 12:46:45 EST [wafl.vol.autoSize.fail:info]: Unable to grow volume ‘voltest’ to recover space: Volume cannot be grown beyond maximum growth limit
Mon Mar 23 12:46:46 EST [wafl.volume.snap.autoDelete:info]: Deleting snapshot ‘testsnap.1’ in volume ‘voltest’ to recover storage
Once we hit our maximum volume growth, we can no longer take space from the aggregate for the volume. At that point, ONTAP moves to the next option which is to delete snapshots to ensure sufficient space in the volume. It looks at what it requires from a free space standpoint and starts deleting from the oldest snapshot to recover storage space within the volume. As the log shows, the oldest snapshot testsnap.1 is deleted, leaving testsnap.2:

fas> snap list voltest
Volume voltest
working…
%/used %/total date name
———- ———- ———— ——–
49% (49%) 32% (32%) Mar 23 12:44 testsnap.2
While this example is great, there are a lot of other things you can do with a FSR value less than 100%. Here are a few other data points:

• In some Exchange environments, it may be ideal to set the FSR to a small percentage and use similar options as above, but set the defer_delete option to prefix and set the prefix value to exchsnap, sqlsnap, etc., depending on the name of the snapshots created by the SnapManager product (if used); see the sketch after this list. Check out TR-3578 as a starting point, as there are settings in that document for base Exchange environments, and then modify as needed for your environment.
• It may be good to provision even more thinly by turning off LUN space reservations. This post doesn’t go into that detail, but again, test it out in your environment and watch how space is consumed as you make changes in the LUN on the server and how ONTAP reacts.
• While it’s great to see these options in action, it’s always best to size your volumes according to the LUN size(s) and the snapshot growth you expect. That way you always provision your volumes to match up with your expectations and only have to trigger volume autosize or snapshot autodelete if absolutely necessary. The goal here is to maintain your SLAs for snapshot retention and not have to autodelete anything.
• Remember you can always keep FSR at 100%; if you are simply sizing your aggregates for spindle performance, you may have plenty of capacity and never need to change it.
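
As a sketch of the defer_delete idea from the first bullet above (voltest and the exchsnap prefix are purely illustrative, borrowing the SnapManager naming mentioned in that bullet):

fas> snap autodelete voltest defer_delete prefix
fas> snap autodelete voltest prefix exchsnap

With settings along these lines, autodelete should remove snapshots whose names start with exchsnap last, protecting the SnapManager copies for as long as possible.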

Categories: netapp