
Summary: File system misalignment is an industry-wide storage problem that generates an unoptimized workload on a storage system.

Misalignment occurs when a file system is laid down on a logical or physical storage device in such a way that the file system's blocks do not align with the underlying device's block boundaries. This problem is especially apparent in virtual environments because of the extra layer between the guest file system and the underlying storage. In addition, the virtual disk formats from the major hypervisor vendors are empty disks with no built-in offset to account for the default file system layouts of common guest operating systems. The detailed history of why these default file system layouts result in misalignment is beyond the scope of this document; suffice it to say that doing nothing to account for it will result in misaligned I/O in your virtual environment, and in some cases serious performance problems. This document details a process to create quick, temporary relief in SAN environments by migrating virtual machines to specially designated LUNs that account for this file system misalignment. The special LUNs are configured to align the guest partition to the storage controller, thereby optimizing the guest I/O workload. Once performance on the storage system returns to an acceptable level, alignment of the virtual machines can then be corrected on a less urgent schedule.

What this does: The process outlined below performs the following actions:
1) Create a specially offset LUN to account for the default file system alignment of the Windows Server 2003 and Windows XP operating systems.
2) Mount the LUN as a Datastore in the vSphere environment.
3) Migrate VMs to the new LUN.
4) Capture performance data before, during, and after the migration to measure the effect.
5) Maintain the environment while correcting alignment of the VMs.

Caveats: The solution documented here is intended for temporary relief, not for permanent storage of virtual machines. While we will give the Datastore a descriptive name to help identify it as the repository for VMs with misaligned file systems, it is ultimately the responsibility of the vSphere administrator to ensure that only such VMs are placed in that Datastore. It is therefore NetApp's recommendation that once storage system performance relief has been realized, the process of aligning the virtual machines' VMDKs begin. The solution trades aligned hypervisor I/O (zeroing a VMDK, copying a VMDK, Storage vMotion of a VMDK) for aligned guest I/O (work the hypervisor does on behalf of a virtual machine). Since the guest does the majority of I/O in a virtualized environment, this results in better performance. The effect on hypervisor I/O is as follows: 1) Hypervisor I/O will place more load on the controller. 2) Certain VAAI offload commands will no longer be supported for VMs in offset Datastores.

This procedure assumes that all VMDKs contain either a single partition, or that all partitions on a given VMDK are offset by the same amount. If any VMDK has multiple partitions and those partitions are at differing offsets, consider which partition receives the bulk of the I/O. Based on that partition's offset, the administrator can then move the VM to the properly offset VMFS Datastore.
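Where many VMDKs need to be checked, the offset scan described later in this document can be scripted. A minimal sketch, assuming mbralign is installed at /var/tmp/santools/mbralign (the path used in the example later in this section), that it runs from the ESX service console, and that the Datastore is mounted under /vmfs/volumes/vmfs1 (substitute your own Datastore path):

# Report the partition offset of every VMDK descriptor in a Datastore.
for vmdk in /vmfs/volumes/vmfs1/*/*.vmdk; do
    case "$vmdk" in *-flat.vmdk) continue ;; esac   # skip the flat data files
    echo "== $vmdk =="
    /var/tmp/santools/mbralign --scan "$vmdk"
done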

Collecting Preliminary Performance Data: Before beginning the process of creating a special Datastore to house the misaligned VMs, it is important to capture baseline performance data. This will help quantify the overall improvement in the storage system's performance as VMs are migrated to the new Datastore. In this example, we will capture a 30-minute sample of data (which should be taken during a normal peak period):
1) Download the perfstat tool from the NOW site: http://now.netapp.com/NOW/download/tools/perfstat/perfstat7.shtml
2) Ensure either RSH or SSH is enabled on the storage system so that perfstat can run from the selected host.
a. RSH: https://kb.netapp.com/support/index?page=content&id=1010082&actp=search&viewlocale=en_US&searchid=1306246472741
b. SSH (requires PKI setup):
i. ESX: https://kb.netapp.com/support/index?page=content&id=1010082&actp=search&viewlocale=en_US&searchid=1306246472741
ii. Windows & Linux: https://kb.netapp.com/support/index?page=content&id=1010841&actp=search&viewlocale=en_US&searchid=1306246472741
iii. Note that Windows 7/2008 also requires plink.exe: https://kb.netapp.com/support/index?page=content&id=2011414&actp=LIST_RECENT&viewlocale=en_US&searchid=1306246472741
3) Here is an example of the perfstat syntax to use. It assumes RSH has been configured for this purpose (note that the options are case sensitive):
Host1> perfstat.sh -f <storage system hostname or IP address> -p -F -I -t 5 -i 6 > pre_baseline.perfstat.out
The utility will compile all of its statistics into a large text file. The pertinent LUN statistics can be found in the section with this header:
=-=-=-=-=-= PERF <value used with the -f option> POSTSTATS =-=-=-=-=-=
stats stop -I perfstat_lun
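Because the capture file can run to many megabytes, standard text tools are the quickest way to reach these counters. A minimal sketch using grep (the file and volume names are the ones used in this example):

# Jump to the post-stats LUN section, then pull only the alignment histograms.
grep -n 'perfstat_lun' pre_baseline.perfstat.out
grep 'align_histo' pre_baseline.perfstat.out | grep '/vol/vmfs1/'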

If you search the file for the string perfstat_lun you will find this section. Then navigate to the LUN in question. This is what you should see for a default installation of Windows XP or 2003 (unaligned) in terms of I/O to the VMFS LUN:
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:display_name:/vol/vmfs1/lun
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_ops:0/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_ops:336/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:other_ops:0/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_data:2396b/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_data:11004277b/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:queue_full:0/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:avg_latency:2.19ms

lun:/vol/vmfs1/lun-HnT9lZcqHWF9:total_ops:337/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:scsi_partner_ops:0/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:scsi_partner_data:0b/s
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.0:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.1:8%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.2:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.3:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.4:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.5:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.6:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_align_histo.7:56%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.0:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.1:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.2:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.3:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.4:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.5:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.6:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_align_histo.7:99%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:read_partial_blocks:34%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:write_partial_blocks:0%
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:queue_depth_lun:22
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:avg_read_latency:1.67ms
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:avg_write_latency:2.19ms
lun:/vol/vmfs1/lun-HnT9lZcqHWF9:avg_other_latency:0.00ms
Note the write_align_histo.7 line: almost all 4K-divisible writes are landing on the 7th logical 512-byte sector of the WAFL 4K block. This VMFS Datastore almost certainly contains perfect virtual machine candidates for migration to our special Datastore. You can further validate that the VMs in this Datastore are misaligned using the mbralign tool, as described in TR-3747.
Using this information, you can determine the prefix we will use for the new Datastore. In this case, most of the I/O falls in logical block 7. Take this number and multiply it by 512 to get the prefix in bytes: 7 * 512 = 3584.
If you find varying values spread across multiple logical blocks, chances are you have VMs with different partition offsets. In that case, you will need to determine each VM's partition offset in order to understand how many of these special LUNs must be created with non-standard offsets. For example, using the mbralign tool to check a VMDK, you may see the following output:
[root@esx1 SMSQL]# /var/tmp/santools/mbralign --scan SMSQL.vmdk
./SMSQL-flat.vmdk P1 lba:63 Aligned: No
You can see that the logical block offset is 63. Here, 63 refers to sectors, which are 512 bytes in size. To find the prefix for our new LUN from this output, divide the lba offset by 8, take the whole-number remainder, and multiply that remainder by 512 bytes: 63/8 = 7 with a remainder of 7. Taking 7, we multiply by 512 to arrive at 3584 bytes. Some further examples:
Offset 31: 31/8 = 3 with a remainder of 7. The prefix is 7 logical blocks, which again equates to 3584 bytes.

Offset 64: 64/8 = 8 with a remainder of 0. The prefix is 0, so in this case the VM is already aligned to a 4K-divisible offset and does not require placement in a special Datastore.
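The same arithmetic works for any offset reported by the scan. A minimal shell sketch of the calculation (the starting value is the one from the mbralign output above):

# The remainder of (start sector / 8) is the misalignment within a 4K WAFL
# block, in 512-byte sectors; multiplying by 512 converts it to the prefix
# size in bytes that the LUN must be created with.
offset_sectors=63                          # P1 lba from mbralign --scan
prefix_sectors=$(( offset_sectors % 8 ))   # 63 % 8 = 7
prefix_bytes=$(( prefix_sectors * 512 ))   # 7 * 512 = 3584
echo "LUN prefix: ${prefix_bytes} bytes"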

Steps to Create the Datastore: This section outlines the steps required to create a LUN that corrects alignment based on the default partition layout of Windows 2000/XP/2003 MBR-partitioned VMDKs (as well as some flavors of Linux; check your Linux distribution's documentation). This default offset is 63 sectors (as we saw in the previous section), which equates to 31.5KB; thus, this procedure will work for any VM that has an MBR partition whose offset is either 31.5K or some multiple thereof. Before you begin, ensure you have enough space on the storage system to accommodate a LUN large enough to hold the VMs that will be migrated. For this example, we will call the volume tempalign and configure it to hold a LUN of roughly 500GB. We will also assume that there is already an initiator group (here called esxigroup) that contains the initiators of the ESX hosts in the vSphere cluster (this can also be an iSCSI igroup). Note that if you wish to take snapshots of the volume, you should make the volume large enough to account for this; for more information, see Technical Report TR-3483. Although the procedure here is not intended to use thin provisioning, the concepts described in that Technical Report will help you make the right decision about the volume size for this LUN.
1) Create a volume of at least the required size of the new LUN:
Storage1> vol create tempalign aggr1 500g
2) Turn off snap reserve and scheduled snapshots:
Storage1> snap reserve tempalign 0
Storage1> snap sched tempalign 0 0 0
3) Enter diagnostic mode:
Storage1> priv set diag
Warning: These diagnostic commands are for use by NetApp personnel only.
(Entering diagnostic mode for the purposes of this procedure is required and acceptable. However, perform only the commands indicated here while in diagnostic mode, as many commands available in this mode can be harmful if not used appropriately.)
4) Create the LUN (where our -P value is the prefix in bytes). We will call it lun_prefix_3584 so we can quickly tell what the offset is:
Storage1*> lun create -s 500g -t vmware -o noreserve -P 3584 /vol/tempalign/lun_prefix_3584
5) Verify that the LUN is set up correctly:

Storage1*> lun show -v /vol/tempalign/lun_prefix_3584
/vol/tempalign/lun_prefix_3584 500.1g (536952700928) (r/w, online)
Serial#: HnaRi4cbSNsj
Share: none
Space Reservation: disabled
Multiprotocol Type: vmware
Prefix Size: 3.5k (3584)
Suffix Size: 0 (0)
Extent Boundary: 0 (0)
6) Map the LUN to your ESX hosts:
Storage1*> lun map /vol/tempalign/lun_prefix_3584 esxigroup
The LUN will be assigned the next available LUN ID for the given igroup. If you wish to use a specific LUN ID, specify it at the end of the command. In this example, we will use LUN ID 20:

Storage1*> lun map /vol/tempalign/lun_prefix_3584 esxigroup 20
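Before moving to the vSphere side, the mapping can optionally be confirmed from the controller with lun show -m, which lists LUN-to-igroup mappings. The output below is illustrative; the exact layout varies by Data ONTAP release, and the protocol column will show FCP or iSCSI as appropriate for your igroup:

Storage1*> lun show -m
LUN path                            Mapped to    LUN ID  Protocol
-----------------------------------------------------------------
/vol/tempalign/lun_prefix_3584      esxigroup        20       FCP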

7) Open the vSphere Client and select the first ESX host in the cluster. Click the Configuration tab, then click the Storage Adapters link under Hardware. In the upper right corner, click Rescan All to rescan the HBAs and discover the new LUN.

Click OK on the pop-up box asking whether you want to scan for new Datastores. You can uncheck the Scan for New VMFS Volumes box if you wish.
8) Confirm that the LUN is now visible by highlighting the appropriate HBAs and ensuring that the LUN appears under the Details section:

9) Click on the Storage link under Hardware and then choose Add Storage:

10) The Add Storage wizard will begin. On the first screen, choose Disk/LUN and click Next:

11) On the next screen, ensure you select the proper LUN and click Next:

12) On the next screen you can further verify that you are choosing the correct LUN in a couple of ways. First, you can follow the steps in NetApp KB article 1012613. Second, if the LUN is empty, the bottom of the dialog box will simply state that a partition will be created:

In the event that you have multiple available LUNs and you select one that already has a partition on it, you will see a warning that the LUN contents will be deleted.
13) On the next screen, choose a name for the Datastore that clearly indicates it is temporary storage for misaligned Windows 2003/XP VMs:

14) On the next screen, choose a block size. Typically, this should match what is used on the other production Datastores. If multiple block sizes are in use on other Datastores, choose a block size matching the largest in use, since on VMFS the block size determines the maximum file size and the new Datastore must accommodate the largest VMDKs in the environment. Here, we will choose the default:

15) On the next screen you will see a summary. Verify that the contents are correct and click Finish to complete the Datastore creation. You should now see a new Datastore under your Storage section:

16) Repeat steps 7 and 8 for each ESX host in the cluster to ensure that they all see the new Datastore.
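With many hosts, the rescan can also be driven from the service console rather than the GUI. A sketch, assuming SSH access to each ESX host and that vmhba1 is the adapter that reaches the new LUN (substitute your own host names and adapter):

# Rescan the storage adapter on every ESX host in the cluster.
for host in esx1 esx2 esx3; do
    ssh root@"$host" esxcfg-rescan vmhba1
done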

Migrating the Virtual Machines

1) Before the migration begins, start a new perfstat capture to run during the process. Using the syntax above, modify the -t and -i options to cover the time the migrations will take: the -t option controls the length of each iteration, while -i controls the number of iterations. At a minimum, capture performance during the initial migration of several VMs to get an idea of the load on the storage system during the period of highest activity. During this initial migration, performance will likely get worse, because all of the Storage vMotion I/O will itself be misaligned. However, once the bulk of the VMs have been migrated to this specially configured Datastore, and all other workload being equal, the storage system's workload should begin to ease.
2) At this point, you can begin migrating the Windows Server 2003 and Windows XP VMs that have misaligned partitions on their VMDKs to this Datastore. For documentation, see page 213 of VMware's vSphere Datacenter Administration Guide, under the heading Migration with Storage vMotion. It is possible to migrate more than one VM at a time, but be aware that each migration is an I/O-intensive workload. Depending on what is still tolerable in the environment from a performance perspective, it may not be feasible to migrate more than one VM at a time.
3) Upon completion of the migration, capture a new baseline perfstat using the same syntax as the preliminary capture, but with a file name like post_baseline.perfstat.out. The pre and post perfstats can then be used to measure the approximate difference in the workload once it has been aligned. Assuming that the newly created VMFS Datastore contains only VMs with the 63-sector offset, the LUN stats from the perfstat should look something like this:
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:display_name:/vol/tempalign/lun_prefix_3584
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_ops:1/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_ops:856/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:other_ops:0/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_data:31153b/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_data:28200197b/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:queue_full:0/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:avg_latency:0.44ms
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:total_ops:858/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:scsi_partner_ops:858/s

lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:scsi_partner_data:28231350b/s
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.0:43%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.1:9%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.2:7%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.3:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.4:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.5:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.6:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_align_histo.7:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.0:99%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.1:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.2:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.3:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.4:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.5:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.6:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_align_histo.7:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:read_partial_blocks:38%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:write_partial_blocks:0%
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:queue_depth_lun:10
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:avg_read_latency:5.36ms
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:avg_write_latency:0.43ms
lun:/vol/tempalign/lun_prefix_3584-HnT9lZd1sF8o:avg_other_latency:0.00ms
The writes are now going to the first logical 512-byte sector of the WAFL 4K block, indicating that the I/O is properly aligned.

Permanently Correcting the VM Alignment

Now that the storage system has gained measurable and observable relief, the process of permanently correcting the alignment of the VMs can begin. The previously mentioned TR-3747: Best Practices for File System Alignment in Virtual Environments explains in detail how to do this. While a VM must be powered down to complete alignment using the NetApp mbralign tool, you can specify a destination Datastore different from the source, thereby combining the alignment procedure and the migration back to a normal Datastore in one step. It remains important for the vSphere administrator to monitor the environment to ensure that no VMs that do not belong there are placed in the specially designated Datastore. If a VM that has already been properly aligned, or one that does not share the same file system offset, is placed in this Datastore, it will generate misaligned I/O, thus defeating the purpose of the Datastore.
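A quick way to quantify the improvement is to pull the write-alignment histograms out of both perfstat captures and compare them side by side; a minimal grep sketch (file and LUN names from the examples above):

# All writes should shift from histogram bucket 7 (misaligned) to bucket 0.
grep 'write_align_histo' pre_baseline.perfstat.out  | grep '/vol/vmfs1/'
grep 'write_align_histo' post_baseline.perfstat.out | grep 'lun_prefix_3584'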
