WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [PATCH, RFC, resend] Re: [Xen-devel] granting access to MSI-X table

To: "Yunhong Jiang" <yunhong.jiang@xxxxxxxxx>
Subject: RE: [PATCH, RFC, resend] Re: [Xen-devel] granting access to MSI-X table and pending bit array
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Thu, 26 Aug 2010 08:06:46 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Delivery-date: Thu, 26 Aug 2010 00:08:25 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <789F9655DD1B8F43B48D77C5D30659732A128E9C@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C6566B2020000780000FBD9@xxxxxxxxxxxxxxxxxx> <789F9655DD1B8F43B48D77C5D30659732A128E9C@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> On 26.08.10 at 08:24, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>>From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
>>[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Jan Beulich
>>An alternative would be to determine and insert the address ranges
>>earlier into mmio_ro_ranges, but that would require a hook in the
>>PCI config space writes, which is particularly problematic in case
>>MMCONFIG accesses are being used.
> 
> I noticed you stated in your previous mail that this should be done in 
> hypervisor, not tools. Is it because tools is not trusted by xen hypervisor? 
> If tools can be trusted, is it possible to achieve this in tools: Tools tell 
> xen hypervisor the MMIO range that is read-only to this guest after the guest 
> is created, but before the domain is unpaused.

Doing this from the tools would have the unfortunate side effect that
Dom0 (or the associated stubdom) would continue to need special
casing, which really it shouldn't for this purpose (and all the special
casing in the patch is really just because qemu wants to map writably
the ranges in question, and this can only be removed after the
hypervisor handling changed).

Another aspect is that of necessity of denying write access to these
ranges: The guest must be prevented writing to them only once at
least one MSI-X interrupt got enabled. Plus (while not the case with
the current Linux implementation) a (non-Linux or future Linux)
version may choose to (re-)assign device resources only when the
device gets enabled, which would be after the guest was already
launched.

>>A second alternative would be to require Dom0 to report all devices
>>(or at least all MSI-X capable ones) regardless of whether they would
>>be used by that domain, and do so after resources got determined/
>>assigned for them (i.e. a second notification later than the one
>>currently happening from the PCI bus scan would be needed).
> 
> Currently Xen knows about the PCI device's resource assignment already when 
> system boot, since Xen have PCI information. The only situations that Xen may 
> have no idea includes: a) Dom0 re-assign the device resource, may because of 
> resource balance; b) VF device for SR-IOV.
> 
> I think for situation a, IIRC, xen hypervisor can't handle it, because that 
> means all shadow need be rebuild, the MSI information need be updated etc. 

Shadows and p2m table need updating in this case, but MSI information
doesn't afaict (it gets proagated only when the first interrupt is being
set up).
>>@@ -653,7 +653,9 @@ _sh_propagate(struct vcpu *v,
>>     }
>>
>>     /* Read-only memory */
>>-    if ( p2m_is_readonly(p2mt) )
>>+    if ( p2m_is_readonly(p2mt) ||
>>+         (p2mt == p2m_mmio_direct &&
>>+          rangeset_contains_singleton(mmio_ro_ranges, mfn_x(target_mfn))) )
>>         sflags &= ~_PAGE_RW;
> 
> Would it have performance impact if too much mmio rangeset and we need 
> search it for each l1 entry update? Or you assume this range will not be 
> updated so frequently? 

Yes, performance wise this isn't nice.

> I'm not sure if we can add one more p2m type like p2m_mmio_ro? And expand 
> the p2m_is_readonly to cover this also? PAE xen may have trouble for it, but 
> at least it works for x86_64, and some wrap function with #ifdef X86_64 can 
> handle the difference.

That would be a possible alternative, but I'm afraid I wouldn't dare to
make such a change.

>>@@ -707,7 +810,7 @@ static int __pci_enable_msix(struct msi_
>>         return 0;
>>     }
>>
>>-    status = msix_capability_init(pdev, msi, desc);
>>+    status = msix_capability_init(pdev, msi, desc, nr_entries);
>>     return status;
>> }
> 
> As stated above, would it be possible to achive this in tools?
> Also if it is possible to place the mmio_ro_ranges to be per domain 
> structure?

As said above, doing it in the tools has down sides, but if done
there and if excluding/special-casing Dom0 (or the servicing
stubdom) it should be possible to make the ranges per-domain.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel