
Re: [Xen-devel] [PATCH] 3/3: MCA/MCE correctable error handling

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH] 3/3: MCA/MCE correctable error handling
From: "Christoph Egger" <Christoph.Egger@xxxxxxx>
Date: Wed, 22 Aug 2007 17:56:00 +0200
Cc: Gavin.Maltby@xxxxxxx, Keir Fraser <keir@xxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxxxx>
Delivery-date: Wed, 22 Aug 2007 08:56:54 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <46CC2785.76E4.0078.0@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <200708211531.44997.Christoph.Egger@xxxxxxx> <200708221100.34795.Christoph.Egger@xxxxxxx> <46CC2785.76E4.0078.0@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.9.6

On Wednesday 22 August 2007 12:09:41 Jan Beulich wrote:
> >>> "Christoph Egger" <Christoph.Egger@xxxxxxx> 22.08.07 11:00 >>>
> >
> >On Tuesday 21 August 2007 18:02:54 Jan Beulich wrote:
> >> >+         if (mc_global->mc_flags & MC_FLAG_UNCORRECTABLE)
> >> >+                 printk(KERN_EMERG);
> >> >+         else
> >> >+                 printk(KERN_INFO);
> >>
> >> KERN_INFO seems a gross understatement here - generally, correctable MCs
> >> are considered indicators that uncorrectable MCs might follow in the
> >> not too distant future, so this generally is a call for action
> >> (and hence shouldn't be hidden with default log level settings).
> >
> >Well, that is what the "old" code did. It used KERN_EMERG for fatal errors
> >and KERN_INFO in the polling service routine. What do you want me to
> >suggest?
> This should be at least KERN_WARNING, probably even KERN_ERR (note
> though that KERN_ERR and KERN_EMERG both resolve to XENLOG_ERR).

I changed it to KERN_WARNING, which makes the if block above
superfluous. Thanks.
I will re-submit this patch as well.
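
For readers following the thread, the effect of that change can be sketched as below. This is only an illustration, not the code from the patch: the enum, the flag value and both function names are stand-ins invented here, and the real code calls printk() with the KERN_* prefixes quoted above.

```c
/* Stand-in definitions for illustration only; not the Xen definitions. */
#define MC_FLAG_UNCORRECTABLE 0x1u

enum mc_log_level { LVL_EMERG, LVL_WARNING, LVL_INFO };

/* Before: the level depends on whether the error was correctable. */
enum mc_log_level mc_log_level_before(unsigned int mc_flags)
{
    if (mc_flags & MC_FLAG_UNCORRECTABLE)
        return LVL_EMERG;
    return LVL_INFO;
}

/* After: one level for both cases, so the branch is no longer needed. */
enum mc_log_level mc_log_level_after(unsigned int mc_flags)
{
    (void)mc_flags; /* the flag no longer influences the level */
    return LVL_WARNING;
}
```

With a single level, correctable errors stay visible under default log level settings, which was the concern raised in the review.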

> >> Also, I'm not sure adjusting the polling frequency makes much sense -
> >> 30s seems an awful lot of time to me.
> >
> >It's not clear to me what you are trying to tell me. Please
> >explain/elaborate.
> What I'm trying to say is that I'd think this should be polled at a much
> higher frequency (I'd suggest 1Hz), without adjustments. Typically, a
> healthy system will not encounter problems soon after boot, but after
> running for perhaps a very long time (and a system in bad condition is
> likely to encounter problems right away, so wouldn't be affected by
> changing the polling rate). Thus, in the general case, you'd have a
> comparatively long latency, during which some kind of (automated) action
> could already be taken to preserve data consistency.

The polling routine currently in the -unstable tree (the version taken from
Linux) runs every 15 seconds, without adjustments.
Polling at 1Hz causes too much load on a healthy system, IMO.
That's why I introduced the adjustments, using the hw threshold registers,
as a compromise.
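
To make the trade-off concrete, here is a minimal sketch of one way an adjustable polling interval could behave. It is not taken from the patch: the function name, the 1 s and 30 s bounds, and the halve/double policy are all assumptions chosen for illustration, and the sketch does not model the hw threshold registers at all.

```c
/* Assumed bounds for illustration: 1 s (the suggested 1Hz) and 30 s. */
#define POLL_MIN_MS  1000u
#define POLL_MAX_MS 30000u

/*
 * Hypothetical policy: poll twice as often after a pass that found new
 * correctable errors, and half as often after a clean pass, always
 * staying within [POLL_MIN_MS, POLL_MAX_MS].
 */
unsigned int next_poll_interval_ms(unsigned int cur_ms, unsigned int new_errors)
{
    unsigned int next;

    if (new_errors > 0)
        next = cur_ms / 2;          /* errors seen: look again sooner */
    else if (cur_ms > POLL_MAX_MS / 2)
        next = POLL_MAX_MS;         /* avoid overflow when doubling */
    else
        next = cur_ms * 2;          /* clean pass: back off */

    if (next < POLL_MIN_MS)
        next = POLL_MIN_MS;
    if (next > POLL_MAX_MS)
        next = POLL_MAX_MS;
    return next;
}
```

Under these assumptions a healthy system drifts toward the long interval and costs little, while a system that starts reporting correctable errors is polled at close to 1Hz within a few passes.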

AMD Saxony, Dresden, Germany
Operating System Research Center

Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Registered office (business address):
   Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register court Dresden: HRA 4896
General partner authorized to represent the company:
   AMD Saxony LLC (registered office Wilmington, Delaware, USA)
Managing directors of AMD Saxony LLC:
   Dr. Hans-R. Deppe, Thomas McCoy
