
RE: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?



David Stone wrote:
>>> root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given guest l1e flags:63 at 10000000 (the effective mm type:6), because the host mtrr type is:0
>>> (XEN) CPU 1: Machine Check Exception: 0000000000000005
>>> (XEN) Bank 0: b200004000000800
>>> (XEN) Bank 5: b200121020080400
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 1:
>>> (XEN) CPU context corrupt
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds..
>> 
>> That looks like the CPU toasted itself. Bits 0-15 == 0x0400 in a
>> machine-check status register means 'CPU internal timer error'.
>> Perhaps this #MC means something else in the context of VT-d though?
>> We probably need someone from Intel to help decode what happened
>> here. 
> 
> Hmm, thanks.  I'll concentrate on the #MC for now.
> 
> Regarding which, how are you resolving 0x0400 in the status register to
> 'CPU internal timer error'?  I'm looking at the "System Programming"
> Intel manual and it seems to indicate that an Error Code with bits
> 0000 01xx xxxx xxxx (like 0x0400) is an "Internal Unclassified" error.
> 
For the #MC, I read section 14.7.2 of the spec: bits 0-15 = 0x0800
(Bank 0 above) decode to BUSL0_SRC_ERR_M_NOTIMEOUT_ERR, so this one
seems to be related to a memory operation.
The MTRR conflict warning by itself should not result in an #MC...
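
To make the decoding above easier to repeat, here is a rough sketch of
how the two bank status values from the log can be split into fields.
The bit positions and mnemonics are as I read them from the
machine-check chapter of the manual; the names decode_mca_code() and
decode_status() are made up for illustration and are not Xen code, and
only the simple and bus/interconnect error-code forms are handled.

```python
# Sketch only: decode an IA32_MCi_STATUS value as printed in the Xen log.
# Function and table names are illustrative, not taken from Xen.

SIMPLE_CODES = {
    0x0000: "No error",
    0x0001: "Unclassified",
    0x0002: "Microcode ROM parity error",
    0x0003: "External error",
    0x0004: "FRC error",
    0x0005: "Internal parity error",
    0x0400: "Internal timer error",
}

# Sub-fields of the bus/interconnect form 0000 1PPT RRRR IILL.
PP = ["SRC", "RES", "OBS", "GENERIC"]
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
II = ["M", "RESERVED", "IO", "OTHER"]
LL = ["L0", "L1", "L2", "LG"]


def decode_mca_code(code):
    """Name the MCA error code held in bits 15:0 of the status value."""
    if code in SIMPLE_CODES:
        return SIMPLE_CODES[code]
    if (code & 0xFC00) == 0x0400:
        # 0000 01xx xxxx xxxx with at least one x set
        return "Internal unclassified"
    if (code & 0xF800) == 0x0800:
        pp = PP[(code >> 9) & 0x3]
        t = "TIMEOUT" if code & 0x100 else "NOTIMEOUT"
        rrrr = RRRR.get((code >> 4) & 0xF, "RESERVED")
        ii = II[(code >> 2) & 0x3]
        ll = LL[code & 0x3]
        return f"BUS{ll}_{pp}_{rrrr}_{ii}_{t}_ERR"
    return f"not decoded here (0x{code:04x})"


def decode_status(status):
    """Split a 64-bit bank status value into its flag bits and codes."""
    def bit(n):
        return bool((status >> n) & 1)
    code = status & 0xFFFF
    return {
        "VAL": bit(63), "OVER": bit(62), "UC": bit(61), "EN": bit(60),
        "MISCV": bit(59), "ADDRV": bit(58), "PCC": bit(57),
        "model_specific": (status >> 16) & 0xFFFF,
        "mca_code": code,
        "meaning": decode_mca_code(code),
    }


if __name__ == "__main__":
    for bank, value in ((0, 0xb200004000000800), (5, 0xb200121020080400)):
        print(bank, decode_status(value))
```

For the log above this reports BUSL0_SRC_ERR_M_NOTIMEOUT_ERR for Bank 0
and "Internal timer error" for Bank 5 (exactly 0x0400 is the timer
error; any other value of the form 0000 01xx xxxx xxxx is "Internal
unclassified"). Both banks have VAL, UC, EN and PCC set, which would
match the "CPU context corrupt" panic.
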
I am a little curious about the cache type of the spaddr. Usually the
conflict occurs when the guest wants a stronger cache type but the
spaddr has a weaker one.
Can you check the cache types of the spaddr/gpaddr manually, or provide
more information about the guest/host MTRR/PAT settings and the
corresponding PTEs?

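
As a starting point for that manual check, the sketch below shows how
the l1e flags from the warning map to a PAT entry and a memory type. It
assumes the guest still uses the power-on default PAT layout, which may
not be true for your guest, and it ignores the guest MTRRs, which also
feed into the effective type Xen prints. The helper names are only for
illustration.

```python
# Sketch only: map x86 PTE flag bits to a PAT entry and memory type.
# Assumes the power-on default PAT; a guest may have reprogrammed it.

# Memory type encodings; 7 (UC-) is valid in the PAT only, not in MTRRs.
MEM_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB", 7: "UC-"}

# Default PAT entries PA0..PA7: WB, WT, UC-, UC, WB, WT, UC-, UC.
DEFAULT_PAT = [6, 4, 7, 0, 6, 4, 7, 0]


def pat_index(l1e_flags):
    """Build the 3-bit PAT index from PAT (bit 7), PCD (bit 4), PWT (bit 3)."""
    pwt = (l1e_flags >> 3) & 1
    pcd = (l1e_flags >> 4) & 1
    pat = (l1e_flags >> 7) & 1
    return (pat << 2) | (pcd << 1) | pwt


def pte_type(l1e_flags, pat=DEFAULT_PAT):
    """Return the memory type number selected by the PTE flags."""
    return pat[pat_index(l1e_flags)]


if __name__ == "__main__":
    guest_flags = 0x63      # "guest l1e flags:63" from the warning
    host_mtrr_type = 0      # "host mtrr type is:0" from the warning
    print("guest PAT entry:", pat_index(guest_flags))
    print("guest type:", MEM_TYPES[pte_type(guest_flags)])
    print("host MTRR type:", MEM_TYPES[host_mtrr_type])
```

With the values from the warning, flags 0x63 have PAT, PCD and PWT all
clear, so they select PAT entry 0, which is WB (type 6) under the
default PAT, while host MTRR type 0 is UC. That is the "guest wants a
stronger type than the host provides" case described above.
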
> For machine-checks, is there the notion of protecting the hypervisor
> from problems encountered in the HVM guest?  I.e., if a #MC happens
> when a guest is executing (non-root mode), is the host equally
> screwed?  I'm guessing not, if the nature of a #MC is such that
> it is the processor itself that is screwed, not any particular level
> of hardware?
> 
> Finally, one thing I'm still not sure about is exactly what PCI
> devices (as identified by B:D:F) I should hide from Dom0 and pass
> through to the guest.  For my machine, the PCI topology as seen from
> Dom0 is:
> 
> #lspci
> 00:00.0 Host bridge [0600]: Intel Corporation DRAM Controller
> [8086:29b0] (rev 02)
> 00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port
> [8086:29b1] (rev 02)
> 00:1c.0 PCI bridge [0604]: Intel Corporation PCI Express Port 1
> [8086:2940] (rev 02)
> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge
> [8086:244e] (rev 92)
> 01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown
> device [1002:94c3]
> 
> 01:00.0 is the 16-lane PCI-Express graphics card I'm trying to pass
> through to my Windows DomU.  00:01.0 is the root complex to which it is
> attached (I'm pretty sure based on the below).  I think 00:1c.0 is a
> switch to a one-lane PCI Express slot on the motherboard.
> 
> So I'm hiding/passing through both the root complex (00:01.0) and the
> graphics card (01:00.0).  Interestingly, if I explicitly hide the root
> complex only, pciback seems to automagically grab the graphics card.
> 
> Dave
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



Best Regards,
Disheng, Su



 

