Welcome to End Point’s blog

Ongoing observations by End Point people

Slow Xen virtualization of RHEL 3 i386 guest on RHEL 5 x86_64

It seems somehow appropriate that this post so closely follows Ethan's recent note about patches vs. complaints in free software. Here's the situation and the complaint (no patch, I'm sorry to say):

We're migrating an old server into a virtual machine on a new server, because our client needs to get rid of the old server very soon. Then afterwards we will migrate the services piecemeal to run natively on RHEL 5 x86_64 with current versions of each piece of the software stack, so we have time to test compatibility and make adjustments without being in a big hurry.

The old server is running RHEL 3 i386 on 2 Xeon @ 2.8 GHz CPUs (hyperthreaded), 4 GB RAM, 2 SCSI hard disks in RAID 1 on MegaRAID, running Red Hat's old 2.4.21-4.0.1.ELsmp kernel.

The new server is running RHEL 5 x86_64 on 2 Xeon quad-core L5410 @ 2.33GHz CPUs, 16 GB RAM, 6 SAS hard disks in RAID 10 on LSI MegaRAID, running Red Hat's recent 2.6.18-92.1.22.el5xen kernel.

The virtual machine is using Xen full virtualization, with 4 virtual CPUs and 4 GB RAM allocated, with a nearly identical copy of the operating system and applications from the old server. And it is bog-slow. Agonizingly slow.

Under the load of even a single repeated web request to web server (Apache) + app server (Interchange) + database server (MySQL), it breathes heavy, and takes 1-2 seconds per request (wildly varying). The old physical machine takes 0.5-0.7 seconds per request under 2 concurrent users. Under heavier load (just a boring day of regular web traffic) the new VM groans and plods along.

The most noticeable metric is that the CPUs get pegged from 50%-90% to more system usage, with under 40% user usage. This is nearly the opposite of the physical machine where system usage was always in the low teens %, and user usage was around 50% per CPU. In both cases there's almost no I/O wait.

First, I'm really surprised it's this bad. We've done Xen full virtualization of RHEL 5 x86_64 and i386 guests on RHEL 5 x86_64, with no special handling, and it's always worked quite well with little performance degradation.

So, we know there are paravirtualized drivers you can use to speed up network and disk devices even of otherwise fully virtualized guests. However, apparently you can't use the paravirtualized drivers in 32-bit RHEL 3 guest on a 64-bit RHEL 5 host. That's really painful, since in my mind a very common use case for virtualization is loading a bunch of old 32-bit machines on a big 64-bit machine with a lot of RAM. But ... not if you want it to even match the speed of the old servers!

We increased the number of virtual CPUs to 8. That took the edge off the worst slowdowns a bit, but only barely.

We tried upgrading the RHEL 3 guest to the very latest versions of everything from Red Hat Network (Update 9 IIRC) and upgrading the RHEL 5 host to the very latest RHEL 5.3, and saw the wild variation in performance from request to request moderate a lot. Also, performance under heavier concurrency was stable: 2.4-2.6 seconds per request in that scenario.

But that's still really slow. I hope we're just missing something obvious here. I'd love to know what the really stupid mistake we're making is. So far, the search has been fruitless and this seemingly ideal use can for Xen virtualization is barely usable.


kabewm said...

You can try using the paravirt kernel from

laclasse said...

Immediate guesses are degradated I/O performance on the guest due to using file-backed disk storage (tap:io).

If this is the case, much better performance can be archieved buy creating the new LV on a dom0 VG, and pointing the new lv to the VM.

Did you contact support?

Donald said...

Without pv drivers in the guest, FV guests exhibit poor block & net performance, and can use significant host resources (in qemu-dm) to trap & emulate the FV IO devices (realtek nic, ide disk).

So, the solution for all FV guests (be it Linux or Windows) is to provide & install pv drivers for nic & block.

For rhel3, RH provides the kmod-xenpv pkg as mentioned in this articles. To use it:

First, ensure you are running rhel3-u9.

second, make sure your (xen) guest config file has type=ioemu *removed* from your (nic) config after you have installed the xen pv drivers.
Note: the driver's don't autoload under rhel3, so ensure you follow the doc instructions to ensure the drivers are loaded on boot.

Note, also, that you can't use the vbd
driver for boot disk (unless you rebuild initrd & set ide=disable), so you want to split
your applic. disk/partition from boot disk/partition/image, and configure the
applic disk/partition/image to xvdb
(but <= xvdg).

We (RedHat) just tested the above,
rhel3-u9-i386 + vnif + vbd on rhel5.3 x86_64 host and it all worked.

The documentation about rhel3 i386 guests not working on x86_64 rhel5 hosts is incorrect. We will rev that document to eliminate that confusion.

Jon Jensen said...

Donald, thanks for your update. We'll give it a shot. For the VM of most interest we'll have to do some surgery since it's one large disk for boot + app and data.

I appreciate your reply.