http://www14.software.ibm.com/webapp/set2/sas/f/vios/documentation/perf.html
Reproduced from the link posted above:
The VIOS online pubs in InfoCenter include sections on sizing for both Virtual SCSI and SEA.
http://publib.boulder.ibm.com/infocenter/eserver/v1r2s/en_US/index.htm
For Virtual SCSI, please see the section titled "Virtual SCSI Sizing Considerations". For SEA, please see the section titled "Planning for shared Ethernet adapters."
QoS considerations
The Virtual I/O server is a shared resource, used concurrently by Virtual SCSI and by Virtual Ethernet / Shared Ethernet.
Depending on the specific configuration of a Virtual I/O server, quality of service issues (long response times) may be encountered if the I/O server partition has insufficient CPU resources for the required I/O load. Recommendations for sizing and tuning the Virtual I/O server are discussed in the following paragraphs.
The Virtual Ethernet and Shared Ethernet drivers (the LAN environment), like most network device drivers, typically drive high interrupt rates and are CPU intensive. Streaming workloads with large packets, such as file transfers or data backup/restore, generate a lower interrupt load than workloads that generate many small packets. Virtual SCSI has similar characteristics: large I/Os require lower CPU interrupt rates than small I/Os. The Virtual SCSI environment runs at a low interrupt priority, while the LAN environment runs at a high interrupt priority. Because a LAN can be driven to high rates of activity with relatively little hardware, and because pacing LAN traffic for throughput means keeping up with the adapter speed, the LAN runs at a high interrupt priority compared to disks, which tend to be paced by disk latency.
If the combined workload of the LAN and SCSI in an I/O server partition is such that the available CPU capacity of the partition becomes exhausted, response times will increase and the quality of service will be reduced, with the Virtual SCSI workloads being impacted the most. The possible tuning options are discussed below:
- Proper sizing of the Virtual I/O server
- Threading or non-threading of the Shared Ethernet
- Separate micro-partitions for the Virtual I/O server
The Shared Ethernet environment requires about 63% of a single-CPU Virtual I/O server for simplex (one-direction) streaming of data, or about 80% of a CPU for full-duplex (two-direction) streaming, over a single Shared Ethernet adapter and one Virtual Ethernet. Virtual SCSI requires about 20% of a CPU for large-block streaming, or as much as 70% for small-block, high-transaction-rate workloads, in a dual disk adapter environment. Given normal workloads, a single shared Gigabit Ethernet and a single disk adapter can coexist in a dedicated single-CPU partition without processor capacity limitations. If dual Gigabit Ethernet adapters are configured on the same server, however, the processing cycles can be totally consumed by the LAN environment, and disk response time may become poor at high processor utilization. The simplest solution is to configure a second CPU into the partition to provide the CPU resources needed to process the workload.
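As a rough, back-of-envelope illustration of that arithmetic (the percentages are the figures quoted above; the script is only an illustrative sketch, not an IBM-supplied sizing tool):
# Worst-case combination from the figures above, as percent of one CPU.
SEA_DUPLEX=80          # full-duplex streaming over one Shared Ethernet
VSCSI_SMALL_BLOCK=70   # small-block, high transaction rate Virtual SCSI
TOTAL=$((SEA_DUPLEX + VSCSI_SMALL_BLOCK))
echo "Combined peak demand: ${TOTAL}% of one CPU"   # 150% > 100%, so a second CPU is indicated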
Another option is to tune the I/O server based on the workload and hardware configuration. One tuning variable is a "threading" option on the Shared Ethernet. Without threading enabled, the Virtual Ethernet and real Ethernet drivers run at interrupt level and forward their incoming packets through the Shared Ethernet driver and the destination driver, also at interrupt level. This is the most efficient in terms of CPU cycles consumed (minimum path length), but it is also the most disruptive for the SCSI software, which runs at a lower interrupt priority. Threading can be enabled on a per-Shared-Ethernet-device basis. With threading enabled, incoming packets are queued to a thread; one of 10 special kernel threads is then dispatched to process the packet in the Shared Ethernet driver and the associated outbound driver. This moves more of the processing into a thread context and allows it to be shared more equally with the Virtual SCSI I/O. However, threading does increase the total CPU cycles required by the LAN (longer path length). The trade-off is a more consistent quality of service at a somewhat lower overall LAN throughput.
APAR IY62264 changes the default behavior of the Shared Ethernet devices to have "threading" enabled. Prior to this APAR, the default was non-threaded.
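To check whether this APAR is installed on a particular Virtual I/O server, one option (assuming root access to the underlying AIX image via oem_setup_env) is a standard instfix query:
$ oem_setup_env
# instfix -ik IY62264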
Depending on the customer environment, the threading option on the Shared Ethernet can be tuned to provide higher LAN performance (non-threaded) or improved shared performance when running with Virtual SCSI (threaded). The following table shows example CPU utilization and throughput rates for a Virtual I/O server running one Shared Ethernet with one Gigabit Ethernet adapter and one Virtual Ethernet, with and without threading.
P5, L4 (1.65 GHz), 1-CPU partition with SMT enabled, Virtual I/O server
Test | Threaded=1 | Threaded=0 |
TCP, Simplex streaming | 72% CPU at 940 Mbits/sec | 63% CPU at 940 Mbits/sec |
TCP, Duplex streaming | 87% CPU at 1420 Mbits/sec | 80% CPU at 1420 Mbits/sec |
The Virtual I/O server commands can be used to determine whether a Shared Ethernet device is threaded or non-threaded; an example is shown at the end of this article. If threaded mode is enabled, the "Thread queue overflow packets:" statistic can be used as an indicator of whether packets are being dropped by the Shared Ethernet threads. The threads can queue up to 8192 input packets before the queue overflows and input packets are discarded. If overflows are occurring, the threads are not getting to run frequently enough, indicating a CPU-overloaded condition. A small number of overflows might be normal behavior in a bursty network, while a large number indicates the need for more CPU resources.
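One simple way to watch for this condition is to sample the counter periodically and see whether it keeps growing; the loop below is only an illustrative sketch, with ent5 standing in for the actual SEA device name:
while true
do
    date
    entstat -all ent5 | grep "Thread queue overflow packets"
    sleep 60
done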
The third option is to use micro-partitioning to create two I/O server partitions, one for the Virtual SCSI server and one for the Virtual Ethernet / Shared Ethernet (LAN). An example might be two tenths of a CPU dedicated to Virtual SCSI and eight tenths of a CPU dedicated to the Shared Ethernet. With this configuration, the hypervisor controls the CPU resource and ensures that about 20% of the cycles go to the Virtual SCSI server and 80% to the Shared Ethernet server. Because the two I/O servers run in different partitions, there is minimal impact of one stealing cycles from the other; the hypervisor ensures that each I/O server partition gets its allocated quota.
Even with this configuration, undersizing the CPUs relative to the number of adapters in either server could still give low throughput; proper sizing of the CPU resource for the workload is still required. With the disk and LAN servers in separate partitions, however, it is easier to troubleshoot performance issues.
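From the HMC command line, the split of processing units between the two I/O server partitions can be inspected with lshwres; the line below is a sketch only, the managed-system name is a placeholder, and the exact attribute names can vary with the HMC level:
lshwres -r proc -m <managed-system> --level lpar -F lpar_name,curr_proc_units,curr_procs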
Here are the commands to enable or disable threading on Shared Ethernet Adapter (SEA) referred to earlier.
To enable or disable threading on the SEA adapter, set the "thread" attribute to "1" (enabled) or "0" (disabled):
$ chdev -dev entX -attr thread=1
$ chdev -dev entX -attr thread=0
There are two ways to check the state of threading on the SEA adapter.
The direct method
$ lsdev -dev ent5 -attr thread
value
0
The indirect method
$ entstat -all ent5
ETHERNET STATISTICS (ent5) :
Device Type: Shared Ethernet Adapter
Hardware Address: 00:02:55:53:bc:d1
Elapsed Time: 0 days 0 hours 0 minutes 0 seconds
Transmit Statistics: Receive Statistics:
-------------------- -------------------
Packets: 11180525 Packets: 11180525
Bytes: 1893260976 Bytes: 1893260976
Interrupts: 0 Interrupts: 1653132
Transmit Errors: 0 Receive Errors: 0
Packets Dropped: 0 Packets Dropped: 0
Bad Packets: 0
Max Packets on S/W Transmit Queue: 122
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 0
Broadcast Packets: 2 Broadcast Packets: 2
Multicast Packets: 3452 Multicast Packets: 0
No Carrier Sense: 0 CRC Errors: 0
DMA Underrun: 0 DMA Overrun: 0
Lost CTS Errors: 0 Alignment Errors: 0
Max Collision Errors: 0 No Resource Errors: 0
Late Collision Errors: 0 Receive Collision Errors: 0
Deferred: 1706 Packet Too Short Errors: 0
SQE Test: 0 Packet Too Long Errors: 0
Timeout Errors: 0 Packets Discarded by Adapter: 0
Single Collision Count: 0 Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 0
General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Driver Flags: Up Broadcast Running
Simplex 64BitSupport
--------------------------------------------------------------
Statistics for adapters in the Shared Ethernet Adapter ent5
--------------------------------------------------------------
Number of adapters: 2
SEA Flags: 00000000
<>   <== This field will read "< THREAD >" if threading is enabled.
VLAN Ids :
ent4: 0 1
Real Side Statistics:
Packets received: 2079245
Packets bridged: 2079245
Packets consumed: 0
Packets fragmented: 0
Packets transmitted: 9101280
Packets dropped: 0
Virtual Side Statistics:
Packets received: 9101280
Packets bridged: 9101280
Packets consumed: 0
Packets fragmented: 0
Packets transmitted: 2079245
Packets dropped: 0
Other Statistics:
Output packets generated: 0
Output packets dropped: 0
Device output failures: 0
Memory allocation failures: 0
ICMP error packets sent: 0
Non IP packets larger than MTU: 0
Thread queue overflow packets: 0