
Virtual Networks

By user12616590 on Jan 05, 2011

Network virtualization is one of the industry's hot topics. The potential to reduce cost while increasing network flexibility easily justifies the investment in time to understand the possibilities. This blog entry describes network virtualization and introduces its basic concepts. Future entries will show the steps to create a virtual network.

Introduction to Network Virtualization


Network virtualization can be described as the process of creating a computer network which does not match the topology of an underlying physical network. Usually this is achieved with software running on general-purpose computers, or with features of network hardware. A defining characteristic of a virtual network is the ability to re-configure the topology without manipulating any physical objects: devices or cables. Such a virtual network mimics a physical network. Some types of virtual networks, for example virtual LANs (VLANs), can be implemented using features of network switches and computers. Other implementations do not require traditional network hardware such as routers and switches at all: the functionality of that hardware is re-implemented in software, perhaps in the operating system.

Benefits of network virtualization (NV) include increased architectural flexibility, better bandwidth and latency characteristics, the ability to prioritize network traffic to meet desired performance goals, and lower cost from fewer devices, reduced total power consumption, etc.

The remainder of this blog entry focuses on a software-only implementation of NV. A few years ago, networking engineers at Sun began working on a software project named "Crossbow." The goal was to create a comprehensive set of NV features within Solaris. Just like Solaris Zones, Crossbow would provide integrated features for creating and monitoring general-purpose virtual network elements that could be deployed in limitless configurations. Because these features are integrated into the operating system, they automatically take advantage of, and smoothly interoperate with, existing features. This is most noticeable in the integration of Solaris NV features and Solaris Zones. Also, because these NV features are a part of Solaris, future Solaris enhancements will be integrated with Solaris NV where appropriate.

The core NV features were first released in OpenSolaris 2009.06. Since then, those core features have matured and more detail has been added. The result is the ability to reimplement entire networks as virtual networks using Solaris 11 Express. Here is an example of a virtual network architecture:

As you can guess from that example, you can create virtually any network topology as a virtual network.

Oracle Solaris NV does more than is described here. This content focuses on the key features which might be used to consolidate workloads or entire networks into a Solaris system, using zones and NV features.

Virtual Network Elements


Solaris 11 Express implements the following virtual network elements.

NIC: OK, this isn't a virtual element; it's just on the list as a starting point. For a very long time, Solaris has managed Network Interface Cards (NICs). Solaris offers tools to manage NICs, including bringing them up and down and assigning various characteristics to them, such as IP addresses, assignment to IP Multipathing (IPMP) groups, etc. Note that up through Solaris 10, most of those configuration tasks were accomplished with the ifconfig(1M) command, but in Solaris 11 Express the dladm(1M) and ipadm(1M) commands perform those tasks, and a few more. You can monitor the use of NICs with dlstat(1M). The term "datalink" is now used consistently to refer to NICs and things like NICs, such as the elements described next.

VNIC: A VNIC is a pseudo interface created on a datalink (a NIC or an etherstub, described next). Each VNIC has its own MAC address, which is generated automatically by default but can be specified manually. For almost all purposes, a VNIC can be managed like a NIC. The dladm command creates, lists, deletes, and modifies VNICs; the dlstat command displays statistics about VNICs; and the ipadm(1M) command configures IP interfaces on VNICs. Like NICs, VNICs have a number of properties that can be modified with dladm. These include the ability to force network processing for a VNIC onto a certain set of CPUs, a cap (maximum) on the bandwidth permitted for the VNIC, the relative priority of the VNIC versus other VNICs on the same NIC, and other properties.

Etherstub: Etherstubs are pseudo NICs, making internal networks possible. For a general understanding, think of them as virtual switches. The dladm command manages etherstubs.

Flow: A flow is a stream of packets that share particular attributes, such as source IP address or TCP port number. Once defined, a flow can be managed as an entity, including capping its bandwidth usage, setting relative priorities, etc. The new flowadm(1M) command enables you to create and manage flows. Even if you don't set resource controls, flows benefit from dedicated kernel resources and more predictable, consistent performance. Further, you can directly observe detailed statistics on each flow, improving your ability to understand these streams of packets and set proper resource controls. Flows are managed with flowadm(1M) and monitored with flowstat(1M).

VLAN: VLANs (Virtual LANs) have been around for a long time. For consistency, the commands dladm, dlstat and ipadm now manage VLANs.

InfiniBand partition: InfiniBand partitions are virtual networks that use an InfiniBand fabric. They are managed with the same commands as VNICs and VLANs: dladm, dlstat, ipadm and others.
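To make those elements concrete, here is a minimal sketch that builds an internal network and defines a flow on it. The names (stub0, vnic0, httpflow) and the port and bandwidth values are hypothetical; only the commands come from the toolset described above:

# dladm create-etherstub stub0                                          (create a virtual switch)
# dladm create-vnic -l stub0 vnic0                                      (attach a VNIC to it)
# flowadm add-flow -l vnic0 -a transport=tcp,local_port=80 httpflow     (define a flow for TCP port 80)
# flowadm set-flowprop -p maxbw=100M httpflow                           (cap the flow at 100 Mbps)
# flowstat -i 5                                                         (watch flow statistics every 5 seconds)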

Summary
Solaris 11 Express provides a complete set of virtual network components which can be used to deploy virtual networks within a Solaris instance. The next few blog entries will describe how to use these components

This is the second in a series of blog entries that discuss the network virtualization features in Solaris 11 Express. The first entry discussed the basic concepts and the virtual network elements, including virtual NICs, VLANs, virtual switches, and InfiniBand datalinks. This entry adds to that list the resource controls and security features that are necessary for a well-managed virtual network.

Virtual Networks, Real Resource Controls

In Oracle Solaris 11 Express, there are four main datalink resource controls:
1. a bandwidth cap, which limits the amount of traffic passing through a datalink in a small amount of elapsed time
2. assignment of packet processing tasks to a subset of the system's CPUs
3. flows, which were introduced in the previous blog post
4. rings, which are hardware or software resources that can be dedicated to a single purpose

Let's take them one at a time. By default, datalinks such as VNICs can consume as much of the physical NIC's bandwidth as they want. That might be the desired behavior, but if it isn't, you can apply the property "maxbw" to a datalink. The maximum permitted bandwidth can be specified in Kbps, Mbps or Gbps. This value can be changed dynamically, so if you set it too low, you can change it later without disrupting the traffic flowing over the link. Solaris will not allow traffic to flow over that datalink at a rate faster than you specify. You can "over-subscribe" this bandwidth cap: the sum of the bandwidth caps on the VNICs assigned to a NIC can exceed the rated bandwidth of the NIC. If that happens, the bandwidth caps become less effective.

In addition to the bandwidth cap, packet processing computation can be constrained to the CPUs associated with a workload. First, some background. When Solaris boots, it assigns interrupt handler threads to the CPUs in the system. (See Solaris CPUs for an explanation of the meaning of "CPU".) Solaris attempts to spread the interrupt handlers out evenly so that one CPU does not become a bottleneck for interrupt handling. If you create non-default CPU pools, the interrupt handlers will retain their CPU assignments. One unintended side effect of this is a situation where the CPUs intended for one workload end up handling interrupts caused by another workload. This can occur even with simple configurations of Solaris Zones. In extreme cases, network packet processing for one zone can severely impact the performance of another zone.
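Both the bandwidth cap and the CPU assignment are ordinary datalink properties, set with dladm(1M). A minimal sketch, with a hypothetical VNIC name and hypothetical values:

# dladm set-linkprop -p maxbw=300M vnic0     (cap vnic0 at 300 Mbps)
# dladm set-linkprop -p cpus=8,9 vnic0       (constrain vnic0's packet processing to CPUs 8 and 9)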

To prevent this behavior, Solaris 11 Express offers the ability to assign a datalink's interrupt handler to a set of CPUs or a pool of CPUs. To simplify this further, the obvious choice is made for you, by default, for a zone which is assigned its own resource pool. When such a zone boots, a resource pool is created for the zone, a sufficient quantity of CPUs is moved from the default pool to the zone's pool, and interrupt handlers for that zone's datalink(s) are automatically reassigned to that resource pool.

Network flows enable you to create multiple lanes of traffic, allowing network traffic to be parallelized. You can assign a bandwidth cap to a flow. Flows were introduced in the previous post and will be discussed further in future posts.

Finally, the newest high-speed NICs support hardware rings: memory resources that can be dedicated to a particular set of network traffic. For inbound packets, this is the first resource control that separates network traffic based on packet information such as destination MAC address. By assigning one or more rings to a stream of traffic, you can commit sufficient hardware resources to it and ensure a greater relative priority for those packets, even if another stream of traffic on the same NIC would otherwise cause congestion and impact the packet latency of all streams. If you are using a NIC that does not support hardware rings, Solaris 11 Express supports software rings, which have a similar effect.

Virtual Networks, Real Security

In addition to resource controls, Solaris 11 Express offers datalink protection controls. These controls are intended to prevent a user from creating improper packets that would cause mischief on the network. The mac-nospoof property requires that outgoing packets have a MAC address which matches the link's MAC address. The ip-nospoof property implements a similar restriction, but for IP addresses. The dhcp-nospoof property prevents improper DHCP address assignment.

Summary (so far)

The network virtualization features in Solaris 11 Express enable the creation of virtual network devices, leading to the implementation of an entire network inside one Solaris system. Associated resource control features give you the ability to manage network bandwidth as a resource and reduce the potential for one workload to cause network performance problems for another workload. Finally, security features help you minimize the impact of an intruder.

With all of the introduction out of the way, next time I'll show some actual uses of these concepts.

This is the third in a series of blog entries that discuss the network virtualization features in Oracle Solaris 11 Express. Part 1 introduced the concept of network virtualization and listed the basic virtual network elements that Solaris 11 Express (S11E) provides. Part 2 expanded on the concepts and discussed the resource management features which can be applied to those virtual network elements (VNEs).

This blog entry assumes that you have some experience with Solaris Zones. If you don't, you can read my earlier blog entries, buy the book "Oracle Solaris 10 System Virtualization Essentials", or read the documentation.

This entry will demonstrate the creation of some of these VNEs. For today's examples, I will use an old Sun Fire T2000 that has one SPARC CMT (T1) chip and 32GB RAM. I will pretend that I am implementing a 3-tier architecture in this one system, where each tier is represented by one Solaris zone. The mythical example provides access to an employee database. The 3-tier service is named 'emp' and VNEs will use 'emp' in their names to reduce confusion regarding the dozens of VNEs we expect to create for the services this system will deliver.

The commands shown below use the prompt "GZ#" to indicate that the command is entered in the global zone by someone with sufficient privileges. Similarly, the prompt "emp-web1#" indicates a command which is entered in the zone "emp-web1" as a sufficiently privileged user.

Fortunately, Solaris network engineers gathered all of the actions regarding the management of network elements (virtual or physical) into one command: dladm(1M). You use dladm to create, destroy, and configure datalinks such as VNICs. You can also use it to list physical NICs:

GZ# dladm show-link
LINK      CLASS   MTU    STATE     BRIDGE   OVER
e1000g0   phys    1500   up        --       --
e1000g2   phys    1500   unknown   --       --
e1000g1   phys    1500   down      --       --
e1000g3   phys    1500   unknown   --       --

We need three VNICs for our three zones, one VNIC per zone. They will also have useful names, one for each of the tiers, and will share e1000g0:

GZ# dladm create-vnic -l e1000g0 emp_web1
GZ# dladm create-vnic -l e1000g0 emp_app1
GZ# dladm create-vnic -l e1000g0 emp_db1
GZ# dladm show-link
LINK      CLASS   MTU    STATE     BRIDGE   OVER
e1000g0   phys    1500   up        --       --
e1000g2   phys    1500   unknown   --       --
e1000g1   phys    1500   down      --       --
e1000g3   phys    1500   unknown   --       --
emp_web1  vnic    1500   up        --       e1000g0
emp_app1  vnic    1500   up        --       e1000g0
emp_db1   vnic    1500   up        --       e1000g0
GZ# dladm show-vnic
LINK      OVER      SPEED   MACADDRESS        MACADDRTYPE   VID
emp_web1  e1000g0   0       2:8:20:3a:43:c8   random        0
emp_app1  e1000g0   0       2:8:20:36:a1:17   random        0
emp_db1   e1000g0   0       2:8:20:b4:5b:d3   random        0

The system has four NICs and three VNICs. Note that the name of a VNIC may not include a hyphen (-) but may include an underscore (_). VNICs that share a NIC appear to be attached together via a virtual switch. That virtual switch is created automatically by Solaris. This diagram represents the NIC and VNEs we have created.

Now that these datalinks, the VNICs, exist, we can assign them to our zones. I'll assume that the zones already exist, and just need network assignment.

GZ# zonecfg -z emp-web1 info
zonename: emp-web1
zonepath: /zones/emp-web1
brand: ipkg
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
hostid:
fs-allowed:
GZ# zonecfg -z emp-web1
zonecfg:emp-web1> add net
zonecfg:emp-web1:net> set physical=emp_web1
zonecfg:emp-web1:net> end
zonecfg:emp-web1> exit

Those steps can be followed for the other two zones and matching VNICs, as sketched below. After those steps are completed, each zone has its own VNIC on e1000g0, extending our earlier diagram.
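For completeness, here is the same pattern applied to the other two zones; nothing changes except the names:

GZ# zonecfg -z emp-app1
zonecfg:emp-app1> add net
zonecfg:emp-app1:net> set physical=emp_app1
zonecfg:emp-app1:net> end
zonecfg:emp-app1> exit
GZ# zonecfg -z emp-db1
zonecfg:emp-db1> add net
zonecfg:emp-db1:net> set physical=emp_db1
zonecfg:emp-db1:net> end
zonecfg:emp-db1> exit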

Packets passing from one zone to another within a Solaris instance do not leave the computer if they are in the same subnet and use the same datalink. This greatly improves network bandwidth and latency. Otherwise, the packets will head for the zone's default router. Therefore, in the above diagram, packets sent from emp-web1 destined for emp-app1 would traverse the virtual switch, but not pass through e1000g0. The zone emp-web1 is an "exclusive-IP" zone, meaning that it "owns" its own networking. What is its view of networking? That's easy to determine: the zlogin(1M) command runs a complete command line inside the zone. By default, the command is run as the root user.

GZ# zoneadm -z emp-web1 boot
GZ# zlogin emp-web1 dladm show-link
LINK      CLASS   MTU    STATE   BRIDGE   OVER
emp_web1  vnic    1500   up      --       ?
GZ# zlogin emp-web1 dladm show-vnic
LINK      OVER   SPEED   MACADDRESS        MACADDRTYPE   VID
emp_web1  ?      0       2:8:20:3a:43:c8   random        0

Notice that the zone sees its own VNEs, but cannot see NEs or VNEs in the global zone, or in any other zone.

The other important new networking command in Solaris 11 Express is ipadm(1M). That command creates IP address assignments, enables and disables them, displays IP address configuration information, and performs other actions. The following example shows the global zone's view before configuring IP in the zone:

GZ# ipadm show-if
IFNAME    STATE   CURRENT        PERSISTENT
lo0       ok      -m-v------46   ---
e1000g0   ok      bm--------4-   ---
GZ# ipadm show-addr
ADDROBJ      TYPE     STATE   ADDR
lo0/v4       static   ok      127.0.0.1/8
lo0/?        static   ok      127.0.0.1/8
lo0/?        static   ok      127.0.0.1/8
lo0/?        static   ok      127.0.0.1/8
e1000g0/_a   static   ok      10.140.204.69/24
lo0/v6       static   ok      ::1/128
lo0/?        static   ok      ::1/128
lo0/?        static   ok      ::1/128
lo0/?        static   ok      ::1/128

At this point, not only does the zone know it has a datalink (which we saw above), but the IP tools show that it is there, ready for use. The next example shows this:

GZ# zlogin emp-web1 ipadm show-if
IFNAME   STATE   CURRENT        PERSISTENT
lo0      ok      -m-v------46   ---
GZ# zlogin emp-web1 ipadm show-addr
ADDROBJ   TYPE     STATE   ADDR
lo0/v4    static   ok      127.0.0.1/8
lo0/v6    static   ok      ::1/128

An Ethernet datalink without an IP address isn't very useful, so let's configure an IP interface and apply an IP address to it:

GZ# zlogin emp-web1 ipadm create-if emp_web1
GZ# zlogin emp-web1 ipadm show-if
IFNAME     STATE   CURRENT        PERSISTENT
lo0        ok      -m-v------46   ---
emp_web1   down    bm--------46   -46
GZ# zlogin emp-web1 ipadm create-addr -T static -a local=10.140.205.82/24 emp_web1/v4static
GZ# zlogin emp-web1 ipadm show-addr
ADDROBJ             TYPE     STATE   ADDR
lo0/v4              static   ok      127.0.0.1/8
emp_web1/v4static   static   ok      10.140.205.82/24
lo0/v6              static   ok      ::1/128
GZ# zlogin emp-web1 ifconfig emp_web1
emp_web1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.140.205.82 netmask ffffff00 broadcast 10.140.205.255
        ether 2:8:20:3a:43:c8

The last command above shows the "old" way of displaying IP address configuration. The command ifconfig(1M) is still there, but the new tools dladm and ipadm provide a more consistent interface, with a well-defined separation between datalink management and IP management. Of course, if you want the zone's outbound packets to be routed to other networks, you must use the route(1M) command, the /etc/defaultrouter file, or both. Next time, I'll show a new network measurement tool and the ability to control the amount of network bandwidth consumed.

Resource Controls
This is the fourth part of a series of blog entries about Solaris network virtualization. Part 1 introduced network virtualization, Part 2 discussed network resource management capabilities available in Solaris 11 Express, and Part 3 demonstrated the use of virtual NICs and virtual switches.

This entry shows the use of a bandwidth cap on Virtual Network Elements (VNEs). This form of network resource control can effectively limit the amount of bandwidth consumed by a particular stream of packets. In our context, we will restrict the amount of bandwidth that a zone can use. As a reminder, we have the following network topology, with three zones and three VNICs, one VNIC per zone.

All three VNICs were created on one Ethernet interface in Part 3 of this series.

Capping VNIC Bandwidth
Using a T2000 server in a lab environment, we can measure network throughput with the new dlstat(1M) command. This command reports various statistics about datalinks, including the quantity of packets, bytes, interrupts, polls, drops, blocks, and other data. Because I am trying to illustrate the use of commands, not optimize performance, the network workload will be a simple file transfer using ftp(1). This method of measuring network bandwidth is reasonable for this purpose, but says nothing about the performance of this platform. For example, this method reads data from a disk; some of that data may be cached, but disk performance may still limit the network bandwidth measured here. However, we can still achieve the basic goal: demonstrating the effectiveness of a bandwidth cap.
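Concretely, the measurement setup might look like the following sketch; the remote host name and file name are hypothetical, and the zone must already be booted and configured as shown in Part 3:

GZ# dlstat -i 10 emp_app1     (in one window, sample the zone's VNIC every 10 seconds)
emp-app1# ftp homesys         (in the zone, connect to the remote system)
ftp> put bigfile.tar          (transfer a large file to generate traffic)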

With that background out of the way, first let's check the current status of our links.
GZ# dladm show-link
LINK      CLASS   MTU    STATE     BRIDGE   OVER
e1000g0   phys    1500   up        --       --
e1000g2   phys    1500   unknown   --       --
e1000g1   phys    1500   down      --       --
e1000g3   phys    1500   unknown   --       --
emp_web1  vnic    1500   up        --       e1000g0
emp_app1  vnic    1500   up        --       e1000g0
emp_db1   vnic    1500   up        --       e1000g0
GZ# dladm show-linkprop emp_app1
LINK      PROPERTY         PERM   VALUE              DEFAULT    POSSIBLE
emp_app1  autopush         rw     --                 --         --
emp_app1  zone             rw     emp-app1           --         --
emp_app1  state            r-     unknown            up         up,down
emp_app1  mtu              rw     1500               1500       1500
emp_app1  maxbw            rw     --                 --         --
emp_app1  cpus             rw     --                 --         --
emp_app1  cpus-effective   r-     1-9                --         --
emp_app1  pool             rw     SUNWtmp_emp-app1   --         --
emp_app1  pool-effective   r-     SUNWtmp_emp-app1   --         --
emp_app1  priority         rw     high               high       low,medium,high
emp_app1  tagmode          rw     vlanonly           vlanonly   normal,vlanonly
emp_app1  protection       rw     --                 --         mac-nospoof,
                                                                restricted,
                                                                ip-nospoof,
                                                                dhcp-nospoof
<some lines deleted>
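As an aside, the output above also shows the priority property (possible values low, medium and high), which ranks a VNIC's traffic against the other VNICs on the same NIC. We won't use it here, but setting it would be a one-liner like this hypothetical example:

GZ# dladm set-linkprop -p priority=low emp_db1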

Before setting any bandwidth caps, let's determine the transfer rates between a zone on this system and a remote system.

It's easy to use dlstat to determine the data rate to my home system while transferring a file from a zone:
GZ# dlstat -i 10 emp_app1
LINK      IPKTS    RBYTES    OPKTS    OBYTES
emp_app1  27.99M   2.11G     54.18M   77.34G
emp_app1  83       6.72K     0        0
emp_app1  339      23.73K    1.36K    1.68M
emp_app1  1.79K    120.09K   6.78K    8.38M
emp_app1  2.27K    153.60K   8.49K    10.50M
emp_app1  2.35K    156.27K   8.88K    10.98M
emp_app1  2.65K    182.81K   5.09K    6.30M
emp_app1  600      44.10K    935      1.15M
emp_app1  112      8.43K     0        0

The OBYTES column is simply the number of bytes transferred during that data sample. (The first line reports cumulative totals rather than a 10-second sample.) I'll ignore the 1.68MB and 1.15MB data points because the file transfer began and ended during those samples. The average of the other values leads to a bandwidth of 7.6 Mbps (megabits per second), which is typical for my broadband connection.
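As a worked example of that conversion: the 10.50M sample represents 10.50 MB sent in one 10-second interval, or 10.50 * 8 / 10 ≈ 8.4 Mbps.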

Let's pretend that we want to constrain the bandwidth consumed by that workload to 2 Mbps. Perhaps we want to leave all of the rest for a higher-priority workload. Perhaps we're an ISP and charge for different levels of available bandwidth. Regardless of the situation, capping bandwidth is easy:
GZ# dladm set-linkprop -p maxbw=2000k emp_app1
GZ# dladm show-linkprop -p maxbw emp_app1
LINK      PROPERTY   PERM   VALUE   DEFAULT   POSSIBLE
emp_app1  maxbw      rw     2       --        --
GZ# dlstat -i 20 emp_app1
LINK      IPKTS    RBYTES    OPKTS    OBYTES
emp_app1  18.21M   1.43G     10.22M   14.56G
emp_app1  186      13.98K    0        0
emp_app1  613      51.98K    1.09K    1.34M
emp_app1  1.51K    107.85K   3.94K    4.87M
emp_app1  1.88K    131.19K   3.12K    3.86M
emp_app1  2.07K    143.17K   3.65K    4.51M
emp_app1  1.84K    136.03K   3.03K    3.75M
emp_app1  2.10K    145.69K   3.70K    4.57M
emp_app1  2.24K    154.95K   3.89K    4.81M
emp_app1  2.43K    166.01K   4.33K    5.35M
emp_app1  2.48K    168.63K   4.29K    5.30M
emp_app1  2.36K    164.55K   4.32K    5.34M
emp_app1  519      42.91K    643      793.01K
emp_app1  200      18.59K    0        0

Note that for dladm, the default unit for maxbw is Mbps. The average of the full samples is 1.97 Mbps.

Between zones, the uncapped data rate is higher:


GZ# dladm reset-linkprop -p maxbw emp_app1
GZ# dladm show-linkprop -p maxbw emp_app1
LINK      PROPERTY   PERM   VALUE   DEFAULT   POSSIBLE
emp_app1  maxbw      rw     --      --        --
GZ# dlstat -i 20 emp_app1
LINK      IPKTS     RBYTES   OPKTS     OBYTES
emp_app1  20.80M    1.62G    23.36M    33.25G
emp_app1  208       16.59K   0         0
emp_app1  24.48K    1.63M    193.94K   277.50M
emp_app1  265.68K   17.54M   2.05M     2.93G
emp_app1  266.87K   17.62M   2.06M     2.94G
emp_app1  255.78K   16.88M   1.98M     2.83G
emp_app1  206.20K   13.62M   1.34M     1.92G
emp_app1  18.87K    1.25M    79.81K    114.23M
emp_app1  246       17.08K   0         0

This five-year-old T2000 can move at least 1.2 Gbps of data internally, but that took five simultaneous ftp sessions. (A better measurement method, one that doesn't include the limits of disk drives, would yield better results, and newer systems, either x86 or SPARC, have higher internal bandwidth characteristics.) In any case, the maximum data rate is not interesting for our purpose, which is demonstrating the ability to cap that rate.

You can often resolve a network bottleneck while maintaining workload isolation by moving two separate workloads onto the same system, within separate zones. However, you might choose to limit their bandwidth consumption. Fortunately, the NV tools in Solaris 11 Express enable you to accomplish that:
GZ# dladm set-linkprop -t -p maxbw=25m emp_app1
GZ# dladm show-linkprop -p maxbw emp_app1
LINK      PROPERTY   PERM   VALUE   DEFAULT   POSSIBLE
emp_app1  maxbw      rw     25      --        --

Note that the change to the bandwidth cap was made while the zone was running, potentially while network traffic was flowing. Also, changes made by dladm are persistent across reboots of Solaris unless you specify "-t" on the command line, as the sketch below contrasts.
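To make the distinction concrete, here are the two forms side by side, reusing the 25 Mbps value from above:

GZ# dladm set-linkprop -t -p maxbw=25m emp_app1    (temporary: the cap disappears at the next reboot)
GZ# dladm set-linkprop -p maxbw=25m emp_app1       (persistent: the cap survives reboots)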

Data moves much more slowly now:


GZ# dlstat -i 20 emp_app1
LINK      IPKTS    RBYTES   OPKTS    OBYTES
emp_app1  23.84M   1.82G    46.44M   66.28G
emp_app1  192      16.10K   0        0
emp_app1  1.15K    79.21K   5.77K    8.24M
emp_app1  18.16K   1.20M    40.24K   57.60M
emp_app1  17.99K   1.20M    39.46K   56.48M
emp_app1  17.85K   1.19M    39.11K   55.97M
emp_app1  17.39K   1.15M    38.16K   54.62M
emp_app1  18.02K   1.19M    39.53K   56.58M
emp_app1  18.66K   1.24M    39.60K   56.68M
emp_app1  18.56K   1.23M    39.24K   56.17M
<many lines deleted>

The data show an aggregate bandwidth of 24 Mbps.

Conclusion
The network virtualization tools in Solaris 11 Express include various resource controls. The simplest of these is the bandwidth cap, which you can use to effectively limit the amount of bandwidth that a workload can consume. Both physical NICs and virtual NICs may be capped using this simple method. This also applies to workloads in Solaris Zones, both default zones and Solaris 10 Zones, which mimic Solaris 10 systems.

Next time we'll explore some other virtual network architectures.
