This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


Re: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling

To: Christoph Egger <Christoph.Egger@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling
From: Keir Fraser <keir@xxxxxxxxxxxxx>
Date: Wed, 22 Aug 2007 17:13:30 +0100
Cc: Gavin.Maltby@xxxxxxx, Jan Beulich <jbeulich@xxxxxxxxxx>
Delivery-date: Wed, 22 Aug 2007 09:14:13 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <200708221748.00072.Christoph.Egger@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acfk12ECn6ppdlDKEdyrvwAX8io7RQ==
Thread-topic: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling
User-agent: Microsoft-Entourage/
I'm not sure you should use shared_info at all. This interface should be
low-rate enough that you can add a sysctl (assuming this info is applicable
for dom0 only, which it looks to be). Actually, probably a platform_op
rather than a sysctl...

 -- Keir

On 22/8/07 16:47, "Christoph Egger" <Christoph.Egger@xxxxxxx> wrote:

> I fixed all this in my code and will re-submit the new patch.
> Christoph
> On Wednesday 22 August 2007 12:01:26 Jan Beulich wrote:
>>>>> "Christoph Egger" <Christoph.Egger@xxxxxxx> 22.08.07 10:47 >>>
>>> On Tuesday 21 August 2007 17:53:45 Jan Beulich wrote:
>>>>> +} __attribute__((packed));
>>>> I think it was a general agreement that it is not a good idea
>>>> (non-portable to non-GNU compilers) to put things like this in public
>>>> headers. I think by properly ordering things you can get away without
>>>> this (and almost without padding).
>>> Oh, you're right. I should use #pragma pack(1)  instead.
>>> Is this fine with you?
>> I'm afraid that wouldn't work with the compat mode header generation, as
>> the original use of #pragma pack(push, ...) was banned for (Sun)
>> compatibility reasons. Consequently, I believe the only way of doing this
>> properly is to avoid depending on compiler behavior by arranging things
>> appropriately (including padding members if needed) and avoiding #pragma
>> pack() altogether.
>>>>> +struct mcinfo_global {
>>>>> ...
>>>>> +    uint16_t mc_socketid;
>>>>> +    uint16_t mc_coreid;
>>>>> +    uint16_t mc_vcpu_id;
>>>> Unless mc_vcpu_id is intended for that purpose, this neglects
>>>> hyperthreading (I know, AMD doesn't use this, but the interface should
>>>> be vendor neutral).
>>>> If mc_vcpu_id is meant for that purpose, its naming is ambiguous.
>>>> If mc_vcpu_id is meant as a vcpuid, then ordering things os that
>>>> mc_domid and mc_vcpu_id are contiguous would seem more natural (making
>>>> eventual examination in raw memory less confusing).
>>> mc_coreid is the physical core that reported the machine check
>>> information. mc_socketid is the physical socket the physical core is in.
>>> This is useful to find all other affected physical cores, when an error
>>> in the L3-Cache is reported that is shared over all cores in the chip.
>>> mc_vcpu_id is the id of the active vcpu for the domain, that reported the
>>> machine check information. Inside Xen, this is current->vcpu_id.
>>> mc_vcpu_id is needed to deal properly with the upcoming NUMA support
>>> my collegue is working on.
>> Okay, but then you're really lacking a thread id here, for being filled by
>> respective Intel code (AMD code would set this to zero).
>>>>> +/* sizeof(struct mcinfo_global) + 6 * sizeof(struct mcinfo_bank) ==
>>>>> 200. + * This is enough space to store mc information of up to six
>>>>> banks. + */
>>>>> +#define MCINFO_MAXSIZE (204 - sizeof(uint32_t))
>>>> Why don't you use the sizeof()-s from the description? If this is really
>>>> needed for some reason, then having 200 in the description and 204 in
>>>> the macro is at least confusing...
>>> MCINFO_MAXSIZE is the size for the content of the mi_data array.
>>> MCINFO_MAXSIZE + sizeof(mi_nentries) == 204. That is where is number comes
>>> from.
>> I concluded that, but pointed out that seeing two different numbers made
>> me think of why this is so, whereas identical numbers would have avoided
>> that (and will likely keep others from asking later, too).
>> Also, you don't say what was the reason for you to use constants instead
>> of sizeof() here.
>>>>>     /* Frame containing list of mfns containing list of mfns
>>>>> containing p2m. */ xen_pfn_t     pfn_to_mfn_frame_list_list;
>>>>>     unsigned long nmi_reason;
>>>>> +    struct arch_mc_info mc_info; /* machine check information */
>>>>>     uint64_t pad[32];
>>>>> };
>>>> Are you sure it is appropriate to add a member here without reducing the
>>>> padding member?
>>> You want me to replace "uint64_t pad[32];" with "uint32_t pad[13];" ?
>> It would be my understanding that that's the right thing to do (assuming
>> you calculated the numbers correctly), but I'd rely on Keir to give a final
>> word on this.
>> Jan

Xen-devel mailing list