VMware, Linux Folding and Clock Drift

Credit for this entry goes to Folding Forum user weedacres. The most recent version can be found in the forum at VMware, Linux Folding and Clock Drift

Anyone who has been running the Linux folding client under VMware for very long has probably run into clock drift. From what I can see it doesn't affect folding at all, but it creates all kinds of problems with monitoring software like FAHMon or HFM.NET (my favorite).

I'll start out by saying that I'm a noob in the Linux world. I've learned to get Ubuntu, notfreds and recently the Slackware bigadv client running under VMware, but my lack of knowledge about the Linux world is astonishing. I'd like to thank chrisretusn for his help in getting me pointed in the right direction to mitigate this problem. I thought I'd share a beginner's approach on how to do that after MtM suggested I do so.

As I understand it, Linux has 2 clocks. The hardware clock is the CMOS clock on the host. The system clock is Linux's own clock, which is generally set from the hardware clock when Linux boots, and only that one time. Under VMware, Linux takes the hardware clock settings from VMware, not the host CMOS clock.

In my experience, if you've been changing clock speeds after VMware has been installed, then VMware gets confused and delivers the wrong clock ticks to its client (Linux). This causes the hardware clock to drift away from the host's CMOS clock. This is a pretty common problem on newly built systems, where you're trying to figure out your max overclock while also running a Linux folding client under Windows. I've seen many VMware forum fixes for this problem and have had no success with any of them. My best success has been to reinstall VMware. Generally that takes care of the hardware clock problem; at least it has for me. If it doesn't, we can still work around it, as long as we run a Linux client that has cron.

What this translates into is that if you're running a notfreds VMware Appliance then you're out of luck trying to use this fix, since it doesn't have the cron service.

At this point I'll put in a plug for the Slackware bigadv client that linuxfah has put together. It was designed for the bigadv client but runs 2 and 4 core smp clients just fine. You can find it at Folding@Home - VMWare Player 3.0 and Folding Bigadv Support - LinuxForge.net. It's a very fast way to get a more functional Linux client up and running under VMware. Since it's all text based (no GUI) it's not nearly as easy to configure as notfreds, but it offers a lot more functionality, including cron (the ability to schedule jobs), which is what I am using to manage the clock drift problem.

You'll know that you have clock drift when FAHMon or HFM.NET turns your Linux client blue after a while. Sometimes it'll take several hours or perhaps days, sometimes just minutes. My recent escapade into making changes to a perfectly running folding farm resulted in losing 1 minute in 7 on my 5 smp clients.
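
To put a figure like "1 minute in 7" in perspective, here is a rough sketch that scales a measured loss up to an hour and a day. The function name drift_per is made up for illustration, and it assumes the drift rate stays constant, which it may well not.

```shell
#!/bin/sh
# drift_per LOST_SECONDS ELAPSED_SECONDS PERIOD_SECONDS
# Scales a measured loss to a longer period, assuming a constant drift rate.
# Integer arithmetic only, so the result is rounded down.
drift_per() {
    echo $(( $1 * $3 / $2 ))
}

# Losing 60 seconds in every 420 seconds (1 minute in 7):
drift_per 60 420 3600     # seconds lost per hour
drift_per 60 420 86400    # seconds lost per day
```

At that rate the client falls behind by about eight and a half minutes every hour, which is why a monitor notices so quickly.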

If you're experiencing clock drift you'll want to see which clock is out of sync with the host CMOS clock. If the hardware clock is out of sync then the system clock will also be out. Obviously if you look just after Linux has booted you won't see much; you'll want to wait until the drift is apparent from your folding monitor.

To display the hardware clock, open a terminal and enter: hwclock

To display the system clock use: date
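
If you'd rather have a number than eyeball two readings, the sketch below subtracts one timestamp from another. The function name clock_diff is hypothetical, and it assumes GNU date (for the -d option) and timestamps in a form date can parse.

```shell
#!/bin/sh
# clock_diff TIME_A TIME_B
# Prints TIME_B minus TIME_A in seconds. Both arguments must be strings
# that GNU date can parse with -d; they are read as UTC so that the
# local time zone does not affect the result.
clock_diff() {
    a=$(date -u -d "$1" +%s) || return 1
    b=$(date -u -d "$2" +%s) || return 1
    echo $(( b - a ))
}

# Example with made-up readings: the second clock is 90 seconds ahead.
clock_diff "2009-12-23 15:04:00" "2009-12-23 15:05:30"
```

The output format of hwclock differs between versions, so a real reading may need trimming before it can be passed in.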

If the hardware clock is good and the system clock is bad then we'll just copy the hardware clock to the system clock, like we did during boot. We'll just do it on a regular basis using cron to schedule the job. If both clocks are bad then we'll set the clocks from an NTP server, either on your LAN or on the internet, again using cron.
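
The choice between the two fixes can be written down as a small decision function. This is only a sketch: the name pick_fix, the whole-second offsets and the 5 second tolerance are all assumptions made for illustration.

```shell
#!/bin/sh
# pick_fix HW_OFFSET SYS_OFFSET [TOLERANCE]
# Offsets are whole seconds relative to a trusted reference such as the
# host's clock. TOLERANCE defaults to 5 seconds (an arbitrary choice).
pick_fix() {
    hw=${1#-}; sys=${2#-}; tol=${3:-5}   # strip any minus sign
    if [ "$hw" -gt "$tol" ]; then
        echo "ntp"        # hardware clock is off, so sync from an NTP server
    elif [ "$sys" -gt "$tol" ]; then
        echo "hwclock"    # hardware clock is good, copy it to the system clock
    else
        echo "none"       # both clocks are within tolerance
    fi
}

pick_fix 0 -120    # hardware good, system two minutes slow
pick_fix 45 45     # both clocks off
```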

To set the system clock from the hardware clock we'll use: hwclock -s which is the short form of hwclock --hctosys

Apparently cron does not like the -- of the long form, so be sure to use the short form. You can run either from a terminal to see what happens. You can google hwclock for more details or look at hwclock(8): query/set hardware clock - Linux man page.

From here on out the locations are those of Slackware so you might have to search around to find the proper paths if you're using something else.

We're going to edit crontab and add a job to run hwclock every few minutes. I suppose if your clock was drifting slowly you could do it less often, but I run it every 1 or 2 minutes with no apparent problem. There are 2 ways to edit crontab. The official method is: crontab -e

I'll warn you right here before you do it that this will bring up the vi editor. If you know what you're doing with vi then you're in good shape. If you don't know vi then I'd spend a little time learning how to use it. It is not intuitive! The advantage of using this method is that it'll apply the changes on the fly (so I'm told), saving a restart of the client. chrisretusn explains:

You can change the editor used by crontab -e by changing the VISUAL environment variable. This will run crontab -e with nano temporarily: env VISUAL=nano crontab -e

To make it permanent, add the following to /etc/profile or to ~/.profile: export VISUAL=nano

In this case we have to use the pico editor, since nano isn't supported in V0.4 of the bigadv client: pico /var/spool/cron/crontabs/root



This is the unedited screen. You'll see several examples of cron jobs that are currently blank. We're going to add a line at the end that will run hwclock -s every 2 minutes:

    */2 * * * * /sbin/hwclock -s 1> /dev/null

You can change the 2 to whatever interval you want, from 1 to 59 minutes. Pico isn't a GUI editor, so you need to move the cursor with the arrow keys. When done, crontab should look like this:



Notice that I put a comment line starting with # before my entry. When complete, close with Ctrl-X, then Y to save and Enter to confirm the file name. For an explanation of the format you can look here: Newbie: Intro to cron or How-to for crontab - Ubuntu Forums. Reboot the Linux client and start folding. I've found the reboot command doesn't do as clean a job as shutdown -r now, which is what I use.
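
If you'd rather not hand-edit the file, the same entry can be appended from the shell. The sketch below is one way to do it without ending up with duplicates when it is run twice; the helper name add_cron_line is made up for illustration, and the example works on a scratch file rather than the real crontab.

```shell
#!/bin/sh
# add_cron_line FILE LINE
# Appends LINE to FILE unless an identical line is already there.
add_cron_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Try it on a scratch file first rather than the real crontab.
scratch=$(mktemp)
add_cron_line "$scratch" '*/2 * * * * /sbin/hwclock -s 1> /dev/null'
add_cron_line "$scratch" '*/2 * * * * /sbin/hwclock -s 1> /dev/null'
cat "$scratch"    # the entry appears once, not twice
# Remove the scratch file with: rm "$scratch"
```

On the Slackware appliance the real target would be /var/spool/cron/crontabs/root, edited as root.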

Here's some additional info from chrisretusn:

One of the things about editing /var/spool/cron/crontabs/root or any user file is that the edit will not take effect right away unless you do one of these things:

1. Reboot, using the reboot command or "shutdown -r now". Note: In Slackware, reboot called without options calls shutdown with the -r option.
2. Use "crontab -e", then quit the editor.
3. Restart the cron daemon. In Slackware you will have to find the pid, use the kill command, then restart crond:

    ~# ps -ef | grep crond
    root    10893     1  0 19:41 ? 00:00:00 /usr/sbin/crond -l10
    root    11774 11772  0 21:48 pts/0    00:00:00 grep crond
    ~# kill 10893
    ~# /usr/sbin/crond -l10

Note: The -l option is logging; -l10 is the default in Slackware.

* In Debian flavors like Mint use "service cron restart"
* In some others it could be "/etc/rc.d/rc.cron restart" or "/etc/init.d/cron restart"
* In Slackware, if nothing is done, the edit will take effect after about an hour when the cron daemon rereads the crontabs again.
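
Picking the pid out by eye works fine, but the lookup can also be scripted. The following is a minimal sketch, assuming the usual ps -ef column layout (PID in the second column, command in the eighth); the function name crond_pid is hypothetical.

```shell
#!/bin/sh
# crond_pid reads `ps -ef` output on stdin and prints the PID (second
# column) of each line whose command is crond, skipping the grep itself.
crond_pid() {
    awk '$8 ~ /(^|\/)crond$/ { print $2 }'
}

# Run against the sample output quoted above:
sample='root    10893     1  0 19:41 ? 00:00:00 /usr/sbin/crond -l10
root    11774 11772  0 21:48 pts/0    00:00:00 grep crond'
printf '%s\n' "$sample" | crond_pid

# On a live Slackware system, as root, the restart could then be:
#   kill "$(ps -ef | crond_pid)" && /usr/sbin/crond -l10
```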

If you see a sendmail error every 2 minutes then you either have a typo or a path problem. Double check the path to hwclock with whereis hwclock; it may be in a different path on your system.

If your hardware clock is drifting then we can't sync from that, so we'll sync from an NTP server out on the internet. There are thousands, so you won't have trouble finding one. We'll use ntp.nasa.gov. We'll do the same edit of crontab as before, except this time we'll insert:

    */2 * * * * /usr/bin/ntpdate -s ntp.nasa.gov 1> /dev/null

which will send us to one of the NASA time servers every 2 minutes.

I was a bit concerned with all of the overhead of having 5 systems doing this every 2 minutes, so I decided to install an NTP server on one of my Windows XP machines. I followed these directions, How to make a Windows XP machine an NTP server within a workgroup, followed by:

1. Click on Start -> Run -> type 'gpedit.msc' without quotes.
2. Click on OK.
3. In the Group Policy editor, navigate to the following: Computer Configuration -> Administrative Templates -> System -> Windows Time Service -> Time Providers
4. Double click on 'Enable Windows NTP Server', click on Enabled.
5. Click on Apply and OK.

To verify:

6. From the command prompt, type gpupdate /force and press Enter.
7. Next type net time and press Enter.
8. Your system should display its own time without any error message.

which I found in this forum: (local) NTP Service. If that all works you now have a functioning NTP time server running on your network. We can then replace the ntp.nasa.gov entry in crontab with:

    */2 * * * * /usr/bin/ntpdate -s 192.168.1.100 1> /dev/null

changing the IP address to that of your shiny new time server.
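
Since the two ntpdate entries differ only in the interval and the server, a small helper can generate them. This is a sketch rather than anything from the original post: the name ntp_cron_line is made up, and the 1 to 59 range check is an assumption about what makes a sensible minute step.

```shell
#!/bin/sh
# ntp_cron_line MINUTES SERVER
# Prints a crontab entry that runs ntpdate against SERVER every MINUTES
# minutes. The /usr/bin/ntpdate path is the Slackware one used above.
ntp_cron_line() {
    case $1 in
        ''|*[!0-9]*) echo "interval must be a whole number" >&2; return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 59 ]; then
        echo "interval must be between 1 and 59" >&2
        return 1
    fi
    printf '*/%s * * * * /usr/bin/ntpdate -s %s 1> /dev/null\n' "$1" "$2"
}

ntp_cron_line 2 ntp.nasa.gov
ntp_cron_line 5 192.168.1.100
```

The printed line still has to be added to crontab by hand or by script, and cron has to reread the file before it takes effect.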

I found that both methods worked for me, since my problem was a drifting system clock; the hardware clock was fine. I figured the ntpdate overhead was higher than the hwclock -s overhead, so I stuck with hwclock.

Hopefully this will help those that are frustrated by this problem.

--Chrisretusn 15:04, 23 December 2009 (UTC)

Many Thanks to Chrisretusn for his assistance in the original problem and getting this posted. Weedacres