[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling



I'm not sure you should use shared_info at all. This interface should be
low-rate enough that you can add a sysctl (assuming this info is applicable
for dom0 only, which it looks to be). Actually, probably a platform_op
rather than a sysctl...

 -- Keir

On 22/8/07 16:47, "Christoph Egger" <Christoph.Egger@xxxxxxx> wrote:

> 
> I fixed all this in my code and will re-submit the new patch.
> 
> Christoph
> 
> 
> 
> On Wednesday 22 August 2007 12:01:26 Jan Beulich wrote:
>>>>> "Christoph Egger" <Christoph.Egger@xxxxxxx> 22.08.07 10:47 >>>
>>> 
>>> On Tuesday 21 August 2007 17:53:45 Jan Beulich wrote:
>>>>> +} __attribute__((packed));
>>>> 
>>>> I think it was a general agreement that it is not a good idea
>>>> (non-portable to non-GNU compilers) to put things like this in public
>>>> headers. I think by properly ordering things you can get away without
>>>> this (and almost without padding).
>>> 
>>> Oh, you're right. I should use #pragma pack(1)  instead.
>>> Is this fine with you?
>> 
>> I'm afraid that wouldn't work with the compat mode header generation, as
>> the original use of #pragma pack(push, ...) was banned for (Sun)
>> compatibility reasons. Consequently, I believe the only way of doing this
>> properly is to avoid depending on compiler behavior by arranging things
>> appropriately (including padding members if needed) and avoiding #pragma
>> pack() altogether.
>> 
>>>>> +struct mcinfo_global {
>>>>> ...
>>>>> +    uint16_t mc_socketid;
>>>>> +    uint16_t mc_coreid;
>>>>> +    uint16_t mc_vcpu_id;
>>>> 
>>>> Unless mc_vcpu_id is intended for that purpose, this neglects
>>>> hyperthreading (I know, AMD doesn't use this, but the interface should
>>>> be vendor neutral).
>>>> 
>>>> If mc_vcpu_id is meant for that purpose, its naming is ambiguous.
>>>> 
>>>> If mc_vcpu_id is meant as a vcpuid, then ordering things os that
>>>> mc_domid and mc_vcpu_id are contiguous would seem more natural (making
>>>> eventual examination in raw memory less confusing).
>>> 
>>> mc_coreid is the physical core that reported the machine check
>>> information. mc_socketid is the physical socket the physical core is in.
>>> This is useful to find all other affected physical cores, when an error
>>> in the L3-Cache is reported that is shared over all cores in the chip.
>>> 
>>> mc_vcpu_id is the id of the active vcpu for the domain, that reported the
>>> machine check information. Inside Xen, this is current->vcpu_id.
>>> mc_vcpu_id is needed to deal properly with the upcoming NUMA support
>>> my collegue is working on.
>> 
>> Okay, but then you're really lacking a thread id here, for being filled by
>> respective Intel code (AMD code would set this to zero).
>> 
>>>>> +/* sizeof(struct mcinfo_global) + 6 * sizeof(struct mcinfo_bank) ==
>>>>> 200. + * This is enough space to store mc information of up to six
>>>>> banks. + */
>>>>> +#define MCINFO_MAXSIZE (204 - sizeof(uint32_t))
>>>> 
>>>> Why don't you use the sizeof()-s from the description? If this is really
>>>> needed for some reason, then having 200 in the description and 204 in
>>>> the macro is at least confusing...
>>> 
>>> MCINFO_MAXSIZE is the size for the content of the mi_data array.
>>> MCINFO_MAXSIZE + sizeof(mi_nentries) == 204. That is where is number comes
>>> from.
>> 
>> I concluded that, but pointed out that seeing two different numbers made
>> me think of why this is so, whereas identical numbers would have avoided
>> that (and will likely keep others from asking later, too).
>> 
>> Also, you don't say what was the reason for you to use constants instead
>> of sizeof() here.
>> 
>>>>>     /* Frame containing list of mfns containing list of mfns
>>>>> containing p2m. */ xen_pfn_t     pfn_to_mfn_frame_list_list;
>>>>>     unsigned long nmi_reason;
>>>>> +    struct arch_mc_info mc_info; /* machine check information */
>>>>>     uint64_t pad[32];
>>>>> };
>>>> 
>>>> Are you sure it is appropriate to add a member here without reducing the
>>>> padding member?
>>> 
>>> You want me to replace "uint64_t pad[32];" with "uint32_t pad[13];" ?
>> 
>> It would be my understanding that that's the right thing to do (assuming
>> you calculated the numbers correctly), but I'd rely on Keir to give a final
>> word on this.
>> 
>> Jan
> 
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.