Archive

Posts Tagged ‘netapp’

NetApp Cluster-Mode Snapshots

February 28, 2012

NetApp Snapshot technology is famous for its uniqueness and ease of use. For example, unlike most competing snapshot technologies, you can create a volume snapshot in seconds, regardless of the size of the volume. It is very efficient in its use of storage space, and you can create hundreds of snapshot copies based on your needs. These excellent features are there for you in Data ONTAP 8.1 operating in cluster-mode.

The familiar 7-mode commands, such as snap reserve, snap sched, and snap list, are still operational in cluster-mode. But cluster-mode also has a new set of commands (see Fig. 1), which you can explore by simply typing a command (e.g., snap create) and hitting return (see Fig. 2).


Figure 1: Cluster-mode snapshot commands


Figure 2: Cluster-mode snap create’s usage

One thing I did observe is that the cluster-mode snapshot policy seems to take precedence over the 7-mode snap sched command. The default snapshot policy in cluster-mode enables the hourly, daily, and weekly snapshot schedules, retaining the following numbers of copies:

  • Hourly: 6
  • Daily: 2
  • Weekly: 2

If you try to set the snapshot schedule using the command snap sched 0 0 0, meaning take no scheduled snapshots, you may be surprised to find that the command is ignored and that hourly, daily, and weekly snapshot copies are still taken.

There are several ways to change the default snapshot policy in cluster-mode. Here are some examples (a command-line sketch follows the list):

a) Use the snap policy modify command to disable the policy

b) Within the snapshot policy scope, use add-schedule, modify-schedule, or remove-schedule to change it to your liking (see Fig. 3)

c) Use snap policy create to create a new snapshot policy
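
For instance, to stop all scheduled copies on a volume, you could disable the default policy outright or attach the built-in none policy to just that volume. A rough sketch from the clustershell (Vserver and volume names are hypothetical, and parameter spellings vary by release, so check the built-in help):

snap policy modify -policy default -enabled false
volume modify -vserver vs1 -volume test_fv -snapshot-policy none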


Figure 3: Cluster-mode snapshot policy commands

In summary, the 7-mode commands are, by and large, still valid for snapshot management, but be aware of the new cluster-mode snapshot policy, which may take precedence.

Categories: netapp

NetApp Powershell with Snaps & Cluster-Mode

February 28, 2012

Many PowerShell cmdlets have been developed for NetApp Data ONTAP, for both 7-mode and cluster-mode. Since the cluster-mode cmdlets are relatively new, we'll take a close look at them here, using a couple of cluster-mode cmdlets to demonstrate how to create a volume snapshot and then restore it.

First, two prerequisites:

  • PowerShell v2.0, which you can download and install from the Microsoft website.

  • Data ONTAP PowerShell Toolkit v1.7 (DataONTAP.zip), which you can download from the NetApp Community site (see Fig. 1). You need to log in with your NOW credentials to download it.


Figure 1: Download DataONTAP.zip from NetApp Community


After you have downloaded the ONTAP PowerShell Toolkit v1.7, open a command prompt on your Windows host and create a directory C:\psontap. Unzip the DataONTAP.zip kit to C:\psontap\DataONTAP. Fig. 2 shows the contents after unzipping the toolkit.


Figure 2: Unzip DataONTAP.zip

Next, open a PowerShell prompt by clicking on the icon (see Fig. 3).


Figure 3: Click on the Powershell icon to open the Powershell command prompt

Then, initialize the ONTAP PowerShell Toolkit using import-module dataontap, as shown in Fig. 4.


Figure 4: Import the DataONTAP module
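
Import-Module finds the module by name only if its folder is on $env:PSModulePath; otherwise you can point it at the unzip location directly. A minimal sketch, assuming the path used above:

# load the toolkit from the directory it was unzipped to
Import-Module C:\psontap\DataONTAP

# confirm the cmdlets are loaded
Get-Command -Module DataONTAP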

To distinguish cluster-mode cmdlets from 7-mode ones, the mnemonic 'Nc' is used. For example, to create a snapshot, you use New-NaSnapshot in 7-mode but New-NcSnapshot in cluster-mode. Therefore, to discover all the snapshot cmdlets in cluster-mode, you can simply do get-help *NcSnapshot*, as shown in Fig. 5. Note that wildcards are allowed in cmdlet names.


Figure 5: Cluster-mode snapshot cmdlets

In order to take a volume snapshot (or manage the FAS controller at all), you first need to use the cmdlet Connect-NcController to establish communication with the NetApp FAS controller (operating in cluster-mode) from your Windows box (see Fig. 6). When prompted, type in the admin password and hit OK. Note again that the cmdlet is cluster-mode because of the presence of 'Nc'.


Figure 6: Establish connection to a FAS controller operating in cluster-mode
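
In script form the connection looks something like the sketch below; the controller name is hypothetical, and Get-Credential pops the same password prompt shown in Fig. 6:

# connect to the cluster management interface as admin
Connect-NcController cluster1 -Credential (Get-Credential admin)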

Create a Snapshot

Figure 7 shows how to use the PowerShell cmdlet New-NcSnapshot to create a volume snapshot called mysnap. Note that here we assume that the FlexVol volume test_fv and the Vserver vc1 already exist on the controller. The VserverContext parameter is useful because it uniquely identifies the volume belonging to a specific Vserver when multiple Vservers each have a volume named test_fv.


Figure 7: Create a snapshot in cluster-mode using Powershell cmdlet
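
In text form, the command in Fig. 7 looks roughly like this (names as above; a hedged transcription, so verify the parameter spellings with get-help New-NcSnapshot):

# create snapshot mysnap of volume test_fv owned by Vserver vc1
New-NcSnapshot -VserverContext vc1 -Volume test_fv -Snapshot mysnap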

Restore a Snapshot

Suppose that after some time you want to restore the snapshot mysnap. You can do that with the cmdlet Restore-NcSnapshotVolume, as shown in Fig. 8. The PreserveLunIds parameter allows the LUNs within the volume being restored to stay mapped, with their identities preserved.


Figure 8: Restore a volume snapshot
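
And a hedged transcription of Fig. 8 (again, verify with get-help Restore-NcSnapshotVolume):

# revert test_fv to snapshot mysnap, keeping LUN mappings intact
Restore-NcSnapshotVolume -VserverContext vc1 -Volume test_fv -Snapshot mysnap -PreserveLunIds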

You can explore the other cluster-mode snapshot cmdlets by doing get-help for each cmdlet shown in Fig. 5. I found these cmdlets quite straightforward to use, although a little bit verbose. And of course, if you have many volumes and snapshot copies, you can write your own scripts based on these cmdlets to streamline the operations.

Categories: netapp

Space Reclaimer for NetApp SnapDrive

October 18, 2011

SnapDrive for Windows 6.3, which was released last year, introduced support for VMDKs on NFS and VMFS datastores.


A couple of quick notes: you need Data ONTAP 7.3.4 or later to use block reclamation with SnapDrive for VMDKs.

You need to have VSC 2.0.1 or later installed with the Backup and Recovery feature, and SnapDrive (within the VM) must have network connectivity to the VSC (port 8043 by default) as well as to Virtual Center.

Also, SnapDrive cannot create VMDKs for you the way it can create RDM LUNs. Instead, you have to create VMDKs the old-fashioned way, but once they're attached to the VM, SnapDrive will be able to manage them.

Okay, so I've got a VMDK (my C: drive), which is in an NFS datastore.


I copied 5GB worth of data into the C: drive, then deleted it. This left my VMDK at 10GB in size.


So, Windows took up about 5GB, and the data (which is now deleted) was about 5GB. Let's kick off space reclamation and see how much space I get back.

Right-click on the VMDK and select "Start Space Reclaimer".


It will do a quick check to see if it actually can reclaim any space for you.


The confirmation box reckons I can reclaim up to 3GB? Hmm, I was hoping for a bit more than that. Well, let's run it anyway and see how well it does.


It's worth noting the warning here: while the VSC requires VMs to be shut down in order to reclaim space, SnapDrive runs space reclamation while the VM is powered up. It will, however, take a back seat to any actual I/O that's going on, so you might want to run it in a low-usage period.

 

So, I clicked okay and it kicked off space reclamation; it even gives me a nice little progress bar.


In my lab, it took about 3 minutes, and when it was done, it had shrunk my VMDK down to 5.6GB.


 

So it was just being modest earlier when it said I could free up to 3GB!

In total, it has reclaimed 5.2GB, which is actually a little more than the data I copied in and deleted to start with!
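
One last note: if you would rather script the reclamation than click through the GUI, SnapDrive's sdcli utility exposes the same operation from inside the guest. A hedged sketch (I have not verified the switches on every SnapDrive release, so check sdcli spacereclaimer help first):

sdcli spacereclaimer start -d C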

Categories: netapp

NetApp SAN Boot with Windows

October 6, 2011

Thoughts and ideas concerning booting from SAN, gathered when we attempted this with our NetApp array.

  1. SAN Boot is fully supported by MSFT.  The first thing that happened is that we were told SAN boot is not supported and that we could not get Microsoft support for this configuration.  It turns out that this is not correct.  SAN boot is fully supported by Microsoft along with HW partners like NetApp.  This Microsoft KB article fully outlines MSFT's support for SAN Boot:  http://support.microsoft.com/kb/305547
  2. Zoning is the #1 issue with SAN Boot on FC. In talking with the NetApp support team (who were a HUGE help on this issue), the most common issue in SAN boot over Fibre Channel is zoning.  Because zoning can be complex, it is the most likely cause of error.  We strongly recommend you check and then double-check your zoning before opening a support ticket.   In our case, the zoning for the server was correct, but we did make a zoning error on another server that we were able to correct on our own.
  3. Windows MPIO support at install time is limited. Because WinPE is not MPIO-aware, there can be strange results when deploying against a LUN that is visible via multiple paths.  Keep in mind that at install time, Windows boots to boot.wim, which runs WinPE rather than a full Windows install.  After the bits are copied locally, Windows reboots to complete the install, and at that point Windows proper is running.  Because of this, the NetApp support team recommends having only one path to the LUN at install time, then adding paths later, once Windows is up and running and you can enable Windows MPIO.
  4. AND YET…  MPIO is strongly recommended for SAN Boot.  Because a Windows host will blue screen if its boot LUN dies, MPIO is strongly recommended for boot LUNs.  This is documented here:  http://technet.microsoft.com/en-us/library/cc786214(WS.10).aspx.  This can seem contradictory at first, but the bottom line is that MPIO is good; just add it later, once Windows is up and running correctly (a command sketch follows this list).
  5. Yes, but what about Sysprep?  It turns out that MPIO is not supported for Sysprep-based deployments:  http://support.microsoft.com/kb/2504934.  So, again, you need to configure MPIO post-deployment when you are deploying from sysprep'd images.  In the case of NetApp, we strongly recommend using Sysprep'd boot LUNs, which you can then clone for new machines.  This significantly shortens deployment time compared to doing a full Windows install for each new host.
  6. It's all about BIOS. Actually installing Windows on a boot LUN requires that Windows Setup see your target LUN as a valid install path.  This means the server must report the drive as a valid install target or Setup will not let you select the disk.  For FC, you will need to enable the BIOS setting and select your target LUN in the HBA setup screen; this process varies by vendor.  Then you need to make this disk the #1 boot target in your server's BIOS; again, this varies by manufacturer.  As noted above, you should only establish one path.  This includes dual-port HBAs: configure only one of the ports.
  7. Where's my disk? Once you do all of the above correctly, Setup may still refuse to show you the disk.  This could be because the correct driver is not present on the install media.  One way to fix this is to inject drivers into your Boot.WIM and Install.WIM.  This process is required if you are using WDS, but optional if you are hand-building a server from DVD or ISO.  In our case, we were building a single master image that we were going to Sysprep, so we simply inserted the media and added the drivers manually during Setup.
  8. OK, the disk is there, but I can't install! One funny thing about Windows Setup is that if you are installing from DVD, that DVD must be present to install (duh).  This is fine, unless you used the process above to load a driver from a separate disc: you swap out the install DVD, load the driver, the target disk appears, and you click Install, at which point Windows fails with a fairly obtuse error because the install DVD is no longer in the drive.  Put the install DVD back in at that point.  Seems obvious, but it took me a few minutes to figure out what was wrong the first time I tried it.
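
On the MPIO point (items 3 and 4), once Windows is installed and booting cleanly you can add multipathing from an elevated prompt. A hedged sketch for Server 2008 R2; verify the feature name and mpclaim switches on your build:

rem enable the inbox MPIO feature, then claim all attached storage (-r reboots)
dism /online /enable-feature:MultipathIo
mpclaim -r -i -a ""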

The Windows Host Utilities Installation and Setup Guide (http://now.netapp.com/NOW/knowledge/docs/hba/win/relwinhu53/pdfs/setup.pdf) has a very detailed description of Windows support for FC LUNs, and there is a step-by-step process in this guide for configuring boot LUNs.

Categories: netapp

Set NetApp NFS Export Permissions for vSphere NFS Mounts

October 3, 2011

One of the things missing from the NetApp VSC is the ability to set permissions on NFS exports when you add a host to an existing cluster.  If you have a lot of NFS datastores and don't feel like setting permissions across NetApp arrays every time you add a new host, this should ease the pain.  Here are a few other use cases:

  1. You change a VMkernel IP for NFS traffic on a host
  2. You add a VMkernel IP for NFS traffic on a host
  3. You add a new host to a cluster
  4. You remove a host from a cluster

You'll see that removing a host is also a reason to run this script.  This is an important thing to note: running this script will replace the existing NFS export permissions with those associated with the vCenter you run it against.  If you have any additional IP addresses assigned to the export, they will get blown away by this script!  I also thought it would be cool to turn this into a form, so I used PrimalForms to design the very simple front-end you can see below.

The DataONTAP PowerShell Toolkit 1.2 has support for networking, but we don't have any systems running 7.3.3 or greater, so I wasn't able to make use of those cmdlets in this script.  Because of that, I hard-code the NetApp VIFs.  Additionally, the way I parse the data depends on the length of the VIF address, and there is no support for VIFs of different lengths.  The VMkernel ports for NFS are found using a wildcard search for "NFS" in the port group name.

Don't be intimidated by all this code; 99% of it was generated by PrimalForms in order to build the GUI.  Modify the 5 variables up front to add your NetApp VIFs and controller names.  You can make a simple batch file to call the script and run it from a desktop icon to get a nice, easy way to modify your NFS permissions on vSphere.  Thanks to @jasemccarty and @glnsize for help with finding the NFS mount in vSphere!

 

$array1VIF = "10.1.1.40", "10.1.1.41", "10.1.1.42", "10.1.1.43"
$array2VIF = "10.1.1.44", "10.1.1.45", "10.1.1.46", "10.1.1.47"

$array1Name = "netapp1"
$array2Name = "netapp2"

$vCenters = "server1", "server2"

$vifLength = $array1VIF[0].Length
$volStart = $vifLength + 9

#Generated Form Function
function GenerateForm {
##############################################################
# Code Generated By: SAPIEN Technologies PrimalForms
#(Community Edition) v1.0.8.0
# Generated On: 10/24/2010 9:34 PM
# Generated By: theselights.com
##############################################################

#region Import the Assemblies
[reflection.assembly]::loadwithpartialname("System.Windows.Forms") | Out-Null
[reflection.assembly]::loadwithpartialname("System.Drawing") | Out-Null
#endregion

#region Generated Form Objects
$form1 = New-Object System.Windows.Forms.Form
$cancelButton = New-Object System.Windows.Forms.Button
$okButton = New-Object System.Windows.Forms.Button
$groupBox1 = New-Object System.Windows.Forms.GroupBox
$vcenter = New-Object System.Windows.Forms.ComboBox
$groupBox2 = New-Object System.Windows.Forms.GroupBox
$nfsDatastores = New-Object System.Windows.Forms.ListBox
$InitialFormWindowState = New-Object System.Windows.Forms.FormWindowState
#endregion Generated Form Objects

#----------------------------------------------
#Generated Event Script Blocks
#----------------------------------------------
#Provide Custom Code for events specified in PrimalForms.
$handler_vcenter_DropDownClosed=
{

# Connect to the selected vCenter and list its NFS datastores.
Connect-VIServer $vcenter.SelectedItem

# Script scope so the other event handlers can see the datastore list
# (a plain $nfsDS assignment would be local to this handler).
$script:nfsDS = get-datastore | where {$_.Type -eq "NFS"} | get-view | select Name,@{n="url";e={$_.summary.url}}
$script:nfsDS | % {$nfsDatastores.Items.Add($_.url.substring($volStart)) | Out-Null }

}

$handler_vcenter_DropDown=
{

# Clear any previously listed mounts before a new vCenter is chosen.
if ($script:nfsDS) { $script:nfsDS | % {$nfsDatastores.Items.Remove($_.url.substring($volStart)) | Out-Null } }
$nfsDatastores.Items.Remove("Select a Virtual Center to gather NFS mounts.")|Out-Null

}

$okButton_OnClick=
{

# Gather the unique VMkernel IPs on NFS-tagged port groups across all hosts.
$esxNFSIP = Get-VMHostNetworkAdapter -VMKernel | where {$_.PortGroupName -like "*NFS*"} | select IP -Unique
$esxNFSIP = $esxNFSIP | % {$_.IP}

Foreach ($ds in $script:nfsDS) {

 # Datastore URLs look like netfs://<VIF IP>//vol/<volume>/; carve out the
 # VIF address and the export path based on the fixed VIF length.
 $nfsVIF = $ds.url.substring(8,$vifLength)
 $nfsMount = $ds.url.substring($volStart)
 $nfsName = $ds.name

 #//// Set permissions on source NFS exports

 $array1VIF | % { If ($_ -eq $nfsVIF) { $storageArray = $array1Name } }
 $array2VIF | % { If ($_ -eq $nfsVIF) { $storageArray = $array2Name } }

 Connect-NaController $storageArray

 Set-NaNfsExport $nfsMount -Persistent -ReadWrite $esxNFSIP -Root $esxNFSIP

 }

}

$cancelButton_OnClick= 
{

$form1.close()

}

$OnLoadForm_StateCorrection=
{#Correct the initial state of the form to prevent the .Net maximized form issue
 $form1.WindowState = $InitialFormWindowState
}

#----------------------------------------------
#region Generated Form Code
$form1.Text = "Set VMware NFS Permissions"
$form1.Name = "form1"
$form1.DataBindings.DefaultDataSourceUpdateMode = 0
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 344
$System_Drawing_Size.Height = 379
$form1.ClientSize = $System_Drawing_Size

$cancelButton.TabIndex = 5
$cancelButton.Name = "cancelButton"
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 103
$System_Drawing_Size.Height = 23
$cancelButton.Size = $System_Drawing_Size
$cancelButton.UseVisualStyleBackColor = $True

$cancelButton.Text = "Cancel"

$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 204
$System_Drawing_Point.Y = 328
$cancelButton.Location = $System_Drawing_Point
$cancelButton.DataBindings.DefaultDataSourceUpdateMode = 0
$cancelButton.add_Click($cancelButton_OnClick)

$form1.Controls.Add($cancelButton)

$okButton.TabIndex = 4
$okButton.Name = "okButton"
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 103
$System_Drawing_Size.Height = 23
$okButton.Size = $System_Drawing_Size
$okButton.UseVisualStyleBackColor = $True

$okButton.Text = "Set Permissions"

$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 45
$System_Drawing_Point.Y = 328
$okButton.Location = $System_Drawing_Point
$okButton.DataBindings.DefaultDataSourceUpdateMode = 0
$okButton.add_Click($okButton_OnClick)

$form1.Controls.Add($okButton)

$groupBox1.Name = "groupBox1"

$groupBox1.Text = "Virtual Center"
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 265
$System_Drawing_Size.Height = 94
$groupBox1.Size = $System_Drawing_Size
$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 42
$System_Drawing_Point.Y = 26
$groupBox1.Location = $System_Drawing_Point
$groupBox1.TabStop = $False
$groupBox1.TabIndex = 2
$groupBox1.DataBindings.DefaultDataSourceUpdateMode = 0

$form1.Controls.Add($groupBox1)
$vcenter.FormattingEnabled = $True
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 226
$System_Drawing_Size.Height = 21
$vcenter.Size = $System_Drawing_Size
$vcenter.DataBindings.DefaultDataSourceUpdateMode = 0
$vcenter.Name = "vcenter"
$vCenters | % {$vcenter.Items.Add($_) | out-null}
$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 19
$System_Drawing_Point.Y = 35
$vcenter.Location = $System_Drawing_Point
$vcenter.TabIndex = 0
$vcenter.add_DropDownClosed($handler_vcenter_DropDownClosed)
$vcenter.add_DropDown($handler_vcenter_DropDown)

$groupBox1.Controls.Add($vcenter)

$groupBox2.Name = "groupBox2"

$groupBox2.Text = "NFS Mounts"
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 262
$System_Drawing_Size.Height = 167
$groupBox2.Size = $System_Drawing_Size
$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 45
$System_Drawing_Point.Y = 141
$groupBox2.Location = $System_Drawing_Point
$groupBox2.TabStop = $False
$groupBox2.TabIndex = 3
$groupBox2.DataBindings.DefaultDataSourceUpdateMode = 0

$form1.Controls.Add($groupBox2)
$nfsDatastores.FormattingEnabled = $True
$System_Drawing_Size = New-Object System.Drawing.Size
$System_Drawing_Size.Width = 226
$System_Drawing_Size.Height = 134
$nfsDatastores.Size = $System_Drawing_Size
$nfsDatastores.DataBindings.DefaultDataSourceUpdateMode = 0
$nfsDatastores.Items.Add("Select a Virtual Center to gather NFS mounts.")|Out-Null
$nfsDatastores.HorizontalScrollbar = $True
$nfsDatastores.Name = "nfsDatastores"
$System_Drawing_Point = New-Object System.Drawing.Point
$System_Drawing_Point.X = 16
$System_Drawing_Point.Y = 24
$nfsDatastores.Location = $System_Drawing_Point
$nfsDatastores.TabIndex = 0

$groupBox2.Controls.Add($nfsDatastores)

#endregion Generated Form Code

#Save the initial state of the form
$InitialFormWindowState = $form1.WindowState
#Init the OnLoad event to correct the initial state of the form
$form1.add_Load($OnLoadForm_StateCorrection)
#Show the Form
$form1.ShowDialog()| Out-Null

} #End Function

#Call the Function
GenerateForm
Categories: netapp, VMware

Safely Virtualize Oracle on NetApp, VMware, and UCS

October 3, 2011

Virtualizing your Tier 1 applications is one of the last hurdles on the way to a truly dynamic and flexible datacenter, and large Oracle databases almost always fall into that category. In the past much of the concern revolved around performance, but with faster hardware and support for larger and larger virtual machines this worry is starting to fade away. The lingering issue is what is and isn't supported in a virtual environment by your software vendor.

Although Oracle has relaxed its stance on virtualization, it takes the same approach most vendors do when it comes to support in virtual environments. Take, for example, the following excerpt from Oracle's database support matrix: "Oracle will provide support for issues that are known to occur on the native OS, or can be demonstrated not to be a result of running on the server virtualization software. Oracle may request that the problem be reproduced on the native hardware."

That last part is the killer for most companies. How could you quickly re-create a multi-terabyte database on physical hardware once it is virtualized, if there is a problem? Luckily, NetApp, VMware, and Cisco UCS provide a very elegant solution to this issue. Let's take a look at a simple diagram depicting a 10TB virtualized Oracle DB instance connected via 10GbE and utilizing Oracle's Direct NFS client.

The guest OS has been virtualized and resides on a VMFS datastore, the vSphere host boots from SAN, and the database is hosted and accessed directly on the NetApp array using NFS. Each data volume in the picture is connected using a different technology to illustrate protocol independence (outside of Oracle, where NFS is used for simplicity of setup).

As you can see from the diagram, the real challenge is re-creating that 10TB database in a way that is cost-effective and fast. NetApp's FlexClone technology allows the instant creation of zero-space virtual copies. The process is similar to VMware's linked clones, but NetApp does it with LUNs or file data, and with no performance hit.
To build your safety net, follow the steps below (a scripted sketch of the storage steps follows the list).
  1. Create a LUN on the NetApp array
  2. Create a UCS Service Profile Template
  3. Configure the Service Profile Template and set it to boot from the LUN in step 1
  4. Deploy a Service Profile from the template
  5. Install the same OS as the virtualized instance (OEL 5.5 in this case)
  6. Create a FlexClone of the Oracle files/volumes
  7. Create exports and set permissions for the newly created server
  8. Configure the OS with mount points for the FlexCloned files/volumes
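
Steps 6 and 7 are the pieces the Data ONTAP PowerShell Toolkit can script. Here is a hedged sketch with hypothetical volume, host, and controller names; check the FlexClone cmdlet's exact parameters with get-help New-NaVolClone in your toolkit version:

# clone the 10TB Oracle data volume; the FlexClone consumes no new space
Connect-NaController netapp1
New-NaVolClone oradata_clone oradata

# export the clone read-write to the physical standby host (10.1.1.99)
Set-NaNfsExport /vol/oradata_clone -Persistent -ReadWrite 10.1.1.99 -Root 10.1.1.99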
At this point you have a full physical environment for that 10TB virtualized Oracle database. The diagram below shows what this looks like.

The next step is to clean this up, since you don't want this UCS blade occupied by the test environment.
  1. Shut down the OS
  2. Delete the Service Profile (not the template)
  3. Delete the FlexClone(s)
Now, in the event you hit some nasty database issue and Oracle tells you to reproduce it on physical hardware, you can listen on the phone as the support guy's jaw hits the floor when you tell him to give you 5 minutes. The entire process can be scripted easily using the Data ONTAP and UCS PowerShell Toolkits, or an orchestration tool of your choice.

Reserving a blade or two for this unlikely scenario may seem wasteful to some, but because of the flexibility of UCS you can quickly spin that hardware up into production for things like hardware maintenance without a performance hit, or capacity on demand for your vSphere environment. With NetApp, VMware, and Cisco you can safely and efficiently take your company to a 100% virtualized private cloud environment.
Categories: netapp, VMware

NetApp and VMware View 5,000-Seat Performance Report

September 28, 2011

This report is a follow-up to the 50,000-Seat VMware View Deployment white paper, in which NetApp analyzed a single pod of 5,000 virtual desktops.  The report is an in-depth storage performance analysis of what VDI really is.  VDI is not only about steady state, login, or boot; it's about all the phases in the life span of the virtual desktop.  Below is one of the many charts and graphs that help demonstrate this fact.  The chart shows that each phase has its own unique characteristics and, as such, impacts storage very differently.

Figure: the phases in the life span of a virtual desktop and their storage characteristics

For simplicity, NetApp takes a unique approach in this document and overlays the performance tests on top of a calendar.  This way each of the different events in "2 weeks in the life" of a virtual desktop can be easily analyzed and explained.

NetApp measured the deployment of 2,500 virtual desktops using the NetApp Virtual Storage Console. We then look at first login, where the user has never logged into the virtual machine before; this simulates a scenario where the desktop has been re-provisioned after patching or something similar.  We look at a cached login, for example "a Tuesday," where the user has already logged onto the desktop and this is the second time they log in; here the user logs in and starts working, which is probably the most common login workload.  We then look at a boot storm, where the environment has to be shut down and rebooted, to demonstrate that with NetApp and VST, rebooting an entire VDI environment can be done quite rapidly (5,000 VMs in 18 minutes, to be exact).  Booting or rebooting an entire environment doesn't have to take forever!

Figure: the performance tests overlaid on a two-week calendar of virtual-desktop events

So what does all this mean, and what do we look at in this paper?  We dive deep into read/write ratios, IO sizes, and the sequential/random mix, and demonstrate that it's not just all about IOPS.

Customers are often asked by their partners, virtualization vendors, and storage vendors, "How many IOPS are your desktops doing?"  They often reply with a number like 16 IOPS, or maybe an even more generic response like "we have a percentage of task workers, a percentage of power users, and a percentage of developers."  If the response is along these lines, the environment will be sized wrong, almost guaranteed.

Let's take the simplest sizing approach…

Vendor: “Mr Customer, how many IOPS do you need each of your desktops to do?”

Customer: "Great question, I need my desktops to each do 16 IOPS!"

Vendor: “Thanks for the info!  I’ll get you a sizing right away!”

OK, does anyone else see the significant flaw in this methodology of sizing?  Let's do some simple math to figure out how this could go wrong…

If my IO size is 4K, then: 16 IOPS x 4KB/IO = 64KB/sec

If my IO size is 32K, then: 16 IOPS x 32KB/IO = 512KB/sec

So 16 IOPS != 16 IOPS.  There is a difference of 448KB/sec between the two calculations.
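
In PowerShell terms, the arithmetic a vendor should be doing looks something like this toy sketch (the numbers are the made-up ones from above):

# throughput implied by an IOPS figure at a given IO size
function Get-ThroughputKBps([int]$Iops, [int]$IoSizeKB) {
    $Iops * $IoSizeKB
}

Get-ThroughputKBps -Iops 16 -IoSizeKB 4    # 64 KB/sec
Get-ThroughputKBps -Iops 16 -IoSizeKB 32   # 512 KB/sec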

Why, then, does everyone size only for IOPS and not ask the more difficult questions?  There are so many other questions that MUST BE ASKED!!!!

Are the IOPS 4K or 32K, or a blend of all sizes? Are they reads or writes? Are they sequential or random?  Each of these has a SIGNIFICANT impact on storage, as you can see from the example above!

This is why it is so important to perform an assessment with a product like Liquidware Labs Stratusphere Fit.  Then, and only then, are you able to get the sizing right the first time!

Here are a couple of key takeaways from the paper:

  1. Assessments are the only way to get VDI right!
  2. VDI is not all small-block random reads.
  3. There's more to VDI than steady state.
  4. Login storms are the hardest workloads, since they mix reads and writes.
  5. IOPS is only one part of a much larger story.
    1. Saying "my desktops NEED 16 IOPS" is useless!!!
    2. Saying "16 IOPS, 80% read / 20% write, 20K read / 8K write sizes, 50% sequential / 50% random reads" gets you a correct sizing!!!!!
  6. Memory overcommitment hurts really badly… The answer: buy more memory for your hosts, or buy more storage!

http://media.netapp.com/documents/tr-3949.pdf

Categories: netapp, VMware

The 4 Most Common Misconfigurations with NetApp Deduplication

August 2, 2011

Being a field engineer, I work with customers from all industries. When I tell customers that the usual deduplication ratio I see on production VMware workloads is 60-70%, I am often met with skepticism. "But my VM workload is different" is usually the response I get, followed by "I'll believe it when I see it." I also get the occasional "That's not what your competitor tells me I will see." I love those ones.

Consistently, though, when the customer does a proof of concept or simply buys our gear and begins the implementation, this is exactly the savings they tend to see in their VMware environment. Quite recently, one of my clients moved 600+ VMs, which were using 11.9TB of disk on their incumbent array, to a new NetApp array. Those 600 VMs of varied application, OS type, and configuration deduped down to 3.2TB, a 73% savings!

Once in a while, though, I get the call from a customer saying "Hey, I only got 5% dedupe! What gives?" These low dedupe numbers are almost always caused by one of the following configuration mistakes.

Misconfiguration #1 – Not turning on dedupe right away (or forgetting the -s or scan option)

As Dr. Dedupe pointed out in a recent blog, NetApp recommends deduplication on all VMware workloads. You may have noticed that if you use our Virtual Storage Console (VSC) plugin for vCenter, creating a VMware datastore with the plugin results in dedupe being turned on. We recommend enabling dedupe right away for a number of reasons, but here is the primary one:

Enabling dedupe on a NetApp volume (ASIS) starts the controller tracking the new blocks that are written to that volume. During the scheduled deduplication pass, the controller then examines those new blocks and eliminates any duplicates. What if, however, you already had some VMs in the volume before you enabled deduplication? Unless you told the NetApp specifically to scan the existing data, those VMs are never examined or deduped, and that is what produces the low results. The good news: this is a very easy fix. Simply start a deduplication pass from the VSC with the "scan" option enabled, or from the command line with the "-s" switch.


Above, where to enable a deduplication volume scan in VSC.

Below, how to do one in System Manager:


For you command-line guys, it's "sis start -s /vol/myvol". Note the -s; amazing what two characters can do!
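
The full sequence for a volume that already holds VMs, with a hypothetical volume name:

sis on /vol/myvol        # enable dedupe; from here on, new writes are tracked
sis start -s /vol/myvol  # -s also scans the blocks written before dedupe was on
df -s /vol/myvol         # report the space saved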

This is by far the most common mistake I come across, but thanks to more customers provisioning their VMware storage with the free VSC plugin, it is becoming less common.

Misconfiguration #2 – LUN reservations

Thin provisioning has gotten a bad reputation in the last few years. Storage admins who have been burned by thin provisioning in the past tend to get a bit reservation-happy. On a NetApp controller we have multiple levels of reservations depending on your needs, but with regard to VMware two stand out. First there is the volume reservation. This reserves space away from the large storage pool (the aggregate) and ensures that whatever object you place into that volume has space. Inside the volume we now create the LUN for VMware. Again, you can choose to reserve the space for the LUN, which removes that space from the available space in the volume. There are two problems with this. First, there is no need to do it: you have already reserved the space with the volume reservation, so there is no need to reserve the space AGAIN with a LUN reservation. Second, the LUN reservation means that the unused space in the LUN will always consume the space reserved. That is, a 600GB LUN with space reservation turned on will consume 600GB of space with no data in it. Deduping a space-reserved LUN will yield some space back from the used data, but any unused space will remain reserved.

For example, say I had a 90GB LUN in a 100GB volume and the LUN was reserved. With no data in the LUN, the volume will show 90GB used: the unused but reserved LUN. Now I place 37GB of data in the LUN. The volume will still show 90GB used; no change. Next I dedupe that 37GB, and say it dedupes down to 10GB. The volume will now report 63GB used, since I reclaimed 27GB from deduping. However, when I remove the LUN reservation, I can see the data is actually taking up only 10GB, with the volume now reporting 90GB free. [I updated this section from my original post. Thanks to Svetlana for pointing out my error here.]

On these occasions, simply deselecting the LUN reservation reveals the actual savings from dedupe (yes, this can be done live with the VMs running). Once the actual dedupe savings are displayed (likely back in that 60-70% range), we can adjust the size of the volume to suit the size of the actual data in the LUN (yes, this too can be done live).
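
Both operations have simple command-line equivalents; a sketch with hypothetical names (the new volume size is a placeholder you would pick to fit the actual data plus headroom):

lun set reservation /vol/myvol/mylun disable   # reveal the true usage, live
vol size myvol <new-size>                      # then right-size the volume, also live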


Misconfiguration #3 – Misaligned VMs

The problem of some guest operating systems being misaligned with the underlying storage architecture has been well documented. In some cases this misalignment can also cause lower-than-expected deduplication numbers. Clients are often surprised (I know I was) at how many blocks we can dedupe between unlike operating systems; that is, between, say, Windows 2003 and 2008, or Windows XP and 2003. However, if the starting offset of one of the OS types is different from the starting offset of the other, then almost none of the blocks will align.

In addition to lowering your dedupe savings and using more disk space than required, misalignment can also place more load on your storage controller (any storage controller; this is not a NetApp-specific problem). Thus it is a great idea to fix this situation. There are a number of tools on the market that can correct it, including the MBRalign tool, which is free for NetApp customers and included as part of the VSC. As you align the misaligned VMs, you will see your dedupe savings rise and your controller load decrease. Goodness!

Misconfiguration #4 – Large amounts of data in the VMs

Now this one isn't really a misconfiguration; it's more of a design option. You see, most of my customers do not separate their data from their boot VMDK files. The simplicity of having your entire VM in a single folder is just too good to mess with. Customers are normally still able to achieve very high deduplication ratios even with the application data mixed in with the OS data blocks. Sometimes, though, customers have very large data files, such as large database files, large image repositories, or large message datastores, mixed in with the VM. These large data files tend not to deduplicate well and as such drive down the percentage seen. No harm is done, though, since the NetApp will deduplicate all the OS and other data around these large sections. However, the customer can also move these VMDKs off to other datastores, which exposes the higher dedupe ratios on the remaining application and OS data. Either option is fine.

So there it is: the 4 most common misconfigurations I see with deduplication on NetApp in the field. Please feel free to post and share your savings; we always love to hear from our customers directly.

Categories: netapp

Firewall usage with SnapMirror

May 24, 2011

SnapMirror uses the typical socket/bind/listen/accept sequence on a TCP socket.

The SnapMirror source binds on port 10566. The destination storage system contacts the SnapMirror source at port 10566, using any available port assigned by the system, so the firewall must allow requests to port 10566 on the SnapMirror source. Synchronous SnapMirror requires additional TCP ports to be open: the source storage system listens on TCP ports 10566 and 10569, and the destination storage system listens on TCP ports 10565, 10567, and 10568. Therefore, you should ensure that the firewall allows the range of TCP ports from 10565 to 10569.
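
As an illustration only, here is what that could look like on a Linux-based firewall sitting between the two controllers (addresses are hypothetical; translate to your own firewall's syntax):

# asynchronous SnapMirror: destination (10.0.2.20) -> source (10.0.1.10) on 10566
iptables -A FORWARD -p tcp -s 10.0.2.20 -d 10.0.1.10 --dport 10566 -j ACCEPT

# synchronous SnapMirror: allow the full 10565-10569 range in both directions
iptables -A FORWARD -p tcp -s 10.0.1.10 -d 10.0.2.20 --dport 10565:10569 -j ACCEPT
iptables -A FORWARD -p tcp -s 10.0.2.20 -d 10.0.1.10 --dport 10565:10569 -j ACCEPT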
Categories: netapp

Removing broken SnapMirror relationships

May 24, 2011

If you have managed SnapMirror relationships on a NetApp SAN, you have no doubt encountered problems deleting them after they have been broken off. One command I have found that resolves this when FilerView will not work is:

snapmirror release source { filer:volume | filer:qtree }

This tells SnapMirror that a certain direct mirror is no longer going to request updates. If a certain destination is no longer going to request updates, you must tell SnapMirror so that it will no longer retain a snapshot for that destination. The command removes snapshots that are no longer needed for replication and can be used to clean up SnapMirror-created snapshots after snapmirror break is issued on the destination side.
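
A concrete run on the source controller, with hypothetical names, releasing the broken mirror of vol1 that used to replicate to filer2:

filer1> snapmirror release vol1 filer2:vol1_mirror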

I find I have to use this command every so often to clean up my SnapMirror configs.