Nathan Evans' Nemesis of the Moment

Solved: Hyper-V’s VMConnect tool sporadically losing connection to VM

Posted in Windows Environment by Nathan B. Evans on March 5, 2011

Last night we performed a big switch-over in our data centre. We moved everything onto a new managed switch and Sonicwall firewall, re-pointed and re-addressed lots and lots of servers, and basically did a bunch of stuff we should have done yonks ago! Everything seemed to go really well except for one thing: our Hyper-V hosts were now throwing really annoying, random disconnection errors when connected straight into a VM using its “Connect…” menu item, otherwise known as VMConnect.exe. The connection would work for at least a couple of seconds, sometimes for as long as a minute or two. But then it would barf and display the following error message dialog.

Hyper-V's VMConnect tool displaying the disconnection error message

The full description of the error was as follows:

The connection to the virtual machine was lost. This can occur when a virtual machine stops unexpectedly, or when network problems occur. Try to connect again. If the problem persists, contact your system administrator.

Would you like to try to reconnect?

This was really annoying because we were connecting to local VMs, hosted on the very same VM host from which we were connecting. So presumably no packets would be hitting the network at all, which should rule out the new hardware and the network changes we had just made.

After racking my brains on it for a bit (which included firing up Wireshark to perform a sanity check), I loaded up TCPView. This is a really great little tool from Mark Russinovich’s stable, Windows Sysinternals. With this tool running I then retried the VMConnect, so that I could see what socket activity it was performing.

TCPView shows the VMConnect appearing to use TCP/IP V6 even for localhost

What this showed is that even when connecting to the local VM host using “localhost” or “127.0.0.1” as the address (i.e. IPv4), the VMConnect tool was seemingly transforming this into an IPv6 address and then forming a TCP connection over IPv6. This was interesting.
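As a rough cross-check of what TCPView showed, the sketch below lists which address families a host name resolves to, in the order the resolver returns them. It is only an illustration, not what VMConnect does internally: the function name `address_families` is made up for this example, and the port number used in the usage note is an assumption.

```python
import socket

# Human-readable names for the two families we care about.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def address_families(host, port, resolver=socket.getaddrinfo):
    """Return the address families `host` resolves to, in resolver order.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    that the function can be exercised without touching the real resolver.
    """
    families = []
    for family, _socktype, _proto, _canonname, _sockaddr in resolver(
        host, port, type=socket.SOCK_STREAM
    ):
        name = FAMILY_NAMES.get(family, str(family))
        if name not in families:  # keep first-seen order, drop duplicates
            families.append(name)
    return families
```

Calling `address_families("localhost", 2179)` on the VM host (2179 being an assumed port for the VMConnect connection) shows whether “localhost” resolves to an IPv6 address as well as an IPv4 one. The result depends entirely on the machine's own configuration.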

I immediately went to check whether IPv6 was actually enabled on the VM host’s network adapters. Lo and behold, it was not. It turns out that when we flicked the gateway IP over to point at the new firewall, we also subconsciously turned off the IPv6 protocol in the adapter’s protocol list! A fairly innocuous thing to do, one would think, especially on an internal LAN!

Hyper-V's virtual network adapter TCP/IP settings

So there you have it. If you come across this problem with Hyper-V, I would recommend you immediately check that you have not inadvertently disabled the IPv6 protocol on your virtual network adapter for Hyper-V.

The very moment we re-enabled IPv6, the problem with VMConnect disconnecting every few seconds went away completely!

Not many problems get more obscure than this.

CentOS 5.5 losing time synchronisation on Hyper-V R2

Posted in Unix Environment, Windows Environment by Nathan B. Evans on February 21, 2011

Recently I deployed a CentOS 5.5 x64 guest on a Server 2008 R2 Hyper-V host for running a basic mail server. However, I quickly noticed that the guest’s clock was drifting very rapidly, running fast by several minutes per hour. This was a surprise, as I had already installed the Hyper-V Linux Integration Components from Microsoft, which I had assumed would just take care of everything for me. Not so, apparently.

I might also add that Dovecot (an IMAP/POP3 server) kept crashing, because the clock was so far out by the time of each periodic NTP sync that the correction made the guest’s clock actually go BACK in time! This appears to be an acknowledged bug in Dovecot, but it is understandable why they don’t feel a pressing need to fix it, though I understand the 2.0 release has done so. For the record, I was running the 1.0.7.7.el5 release.

Anyway, it turned out that the solution was very simple.

Modify the /boot/grub/grub.conf as follows:

Editing the /boot/grub/grub.conf file.

Essentially you need to modify the lines that start with the word “kernel” and add the following extra options onto the end:

  • clock=pit

    This sets the clock source to the Programmable Interval Timer (PIT). This is a fairly low-level way for the kernel to track time, and it works best for Linux guests on Hyper-V.

  • notsc

    This is included more as belt-and-braces than anything, because setting the PIT clock source (above) should really already imply it. But I include it for pure expressiveness 🙂

  • divider=10

    This adjusts the PIT frequency so that its resolution is 10 milliseconds (which is perfectly sufficient for most applications). It isn’t strictly required, but it will reduce some of the CPU load caused by the VM. If the VM will be doing a lot of time-sensitive work (say, as a VoIP server or gaming server) then you probably shouldn’t include this option.
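For anyone who would rather script the edit than make it by hand, here is a minimal sketch of the change described above: it appends the three options to every line of a grub.conf that starts with “kernel”, skipping any option that is already there. The function name `add_kernel_options` is made up for this example, and the kernel line in the usage note is purely illustrative, so check it against your own file.

```python
DEFAULT_OPTIONS = ("clock=pit", "notsc", "divider=10")


def add_kernel_options(grub_conf_text, options=DEFAULT_OPTIONS):
    """Return grub.conf text with `options` appended to each kernel line.

    Only lines whose first word is "kernel" are touched. Options already
    present on a line are not added twice, so the function can safely be
    run more than once on the same file.
    """
    new_lines = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] == "kernel":
            missing = [opt for opt in options if opt not in words]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        new_lines.append(line)
    return "\n".join(new_lines) + "\n"
```

Given an illustrative line such as `kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/VolGroup00/LogVol00`, the function returns the same line with `clock=pit notsc divider=10` added on the end, and leaves the `title`, `root` and `initrd` lines alone. Take a backup of /boot/grub/grub.conf before writing anything back to it.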

Once you’ve done this, save the file and reboot the box.

The time should now be synchronized precisely with the Hyper-V host!

PS: I’ve not tested whether this solution will work without the Hyper-V Linux Integration Components installed, but I believe that it will, as it operates independently of Hyper-V’s clock synchronisation mechanism and relies purely on the virtualised Programmable Interval Timer that Hyper-V exposes. The PIT clock source is the same whether you have the Linux Integration Components installed or not.

PPS: The VMware knowledge base has a great article on this subject: http://kb.vmware.com/kb/1006427. Take it with a slight pinch of salt (it is written for VMware, whereas the subject here is Hyper-V), but it certainly gives several more options and ideas to try for different Linux distributions, 32-bit vs 64-bit environments, etc.
