This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] wget and Zope crashes on post-2.0.6 -testing

To: xen-devel@xxxxxxxxxxxxxxxxxxx
From: Osma Suominen <osma.suominen@xxxxxxxxxxxx>
Date: Thu, 2 Jun 2005 13:22:55 +0300 (EEST)
Delivery-date: Thu, 02 Jun 2005 10:22:12 +0000

I reported time-related problems some days ago but received no replies:

I have problems with programs such as wget and Zope crashing in a domU on a recent -testing build. This is on a Debian Sarge system, with the kernel and Xen taken from a -testing snapshot from two days ago (2005-05-31). The problems are not as easy to trigger as with earlier versions (e.g. the 2.0.5 demo CD), but they do happen.

The symptom is that under heavy load, wget crashes with the message "acalc_rate: Assertion `msecs >= 0' failed", which, judging by earlier xen-devel posts, probably means that time has stepped backwards.
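
A backward step of this kind could be checked for directly, independent of wget. The following is only a minimal sketch in present-day Python, not something taken from wget or Xen: the helper names find_backward_steps and sample_clock are hypothetical, and it assumes that reading time.time() in a tight loop is enough to expose the problem. Note that the wall clock can also be stepped back legitimately (for example by an NTP adjustment), so a hit is a hint rather than proof.

```python
import time


def find_backward_steps(samples):
    """Return (index, previous, current) for every sample smaller than its predecessor."""
    steps = []
    for i in range(1, len(samples)):
        if samples[i] < samples[i - 1]:
            steps.append((i, samples[i - 1], samples[i]))
    return steps


def sample_clock(count, clock=time.time):
    """Read the given clock `count` times in a tight loop and return the readings."""
    return [clock() for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical sample size; a larger value gives more chances to catch a step.
    readings = sample_clock(1_000_000)
    for index, previous, current in find_backward_steps(readings):
        print(f"sample {index}: clock went from {previous!r} to {current!r}")
```

Running this in the domU while the load generator is active, and again on native Linux, would show whether backward steps appear only under Xen.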

Also, Zope frequently dies with different time-related error messages. Here's the end of a typical traceback:

File "/usr/lib/zope2.7/lib/python/DateTime/DateTime.py", line 694, in _parse_args
    lt = safelocaltime(t)
File "/usr/lib/zope2.7/lib/python/DateTime/DateTime.py", line 437, in safelocaltime
    raise TimeError, 'The time %f is beyond the range ' \
TimeError: The time nan is beyond the range of this Python implementation.

It is fairly easy to crash Zope this way by pounding on it with a tool such as Apache's benchmarking utility ab/ab2 or wget. On an otherwise unloaded machine it usually takes a few minutes to bring Zope down. Note that Zope runs just fine on a similar native Linux system, and in more than a year of running production Zope systems I have never seen the kind of errors that Zope on Xen produces.

To trigger the wget error (which I think is a symptom of a very similar problem), it is easiest to run SETI@Home, which puts enough load on the system. It might take a few attempts, but I can always crash wget this way when SETI is running.

It is my impression that these problems occur during bursts of high timer interrupt activity, but I have not investigated this in detail.

Is there anything I can do to help sort this out? For example, would it be a good idea to test unstable to see whether it exhibits the same behavior? Any help is appreciated. Since I will soon need to run a production Zope system on several Xen hosts, I would like to find a solution to these frequent crashes.


*** Osma Suominen / MB Concert Ky *** osma.suominen@xxxxxxxxxxxx ***
