
Re: [Xen-devel] One question about the hypercall to translate gfn to mfn.



On 10/12/14 09:51, Tian, Kevin wrote:
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Wednesday, December 10, 2014 5:17 PM
>>
>>>>> On 10.12.14 at 09:47, <kevin.tian@xxxxxxxxx> wrote:
>>> two translation paths in assigned case:
>>>
>>> 1. [direct CPU access from VM]: the PCI aperture resource is
>>> partitioned, so every VM can access a portion of the PCI aperture
>>> directly.
>>>
>>> - CPU page table/EPT: CPU virtual address->PCI aperture
>>> - PCI aperture - BAR base = Graphics Memory Address (GMA)
>>> - GPU page table: GMA -> GPA (as programmed by guest)
>>> - IOMMU: GPA -> MPA
>>>
>>> 2. [GPU access through GPU command operands]: with GPU scheduling,
>>> every VM's command buffer will be fetched by the GPU in a
>>> time-shared manner.
>>>
>>> - GPU page table: GMA->GPA
>>> - IOMMU: GPA->MPA
>>>
>>> In our case, the IOMMU is set up with a 1:1 identity table for dom0.
>>> So when the GPU may access GPAs from different VMs, we can't count on
>>> the IOMMU, which can only serve one mapping for one device (unless
>>> we have SR-IOV).
>>>
>>> That's why we need a shadow GPU page table in dom0, and need a
>>> p2m query call to translate from GPA -> MPA:
>>>
>>> - shadow GPU page table: GMA->MPA
>>> - IOMMU: MPA->MPA (for dom0)
>>
>> I still can't see why the Dom0 translation has to remain 1:1, i.e.
>> why Xen couldn't return some "arbitrary" GPA for the query in
>> question here, setting up a suitable GPA->MPA translation. (I put
>> arbitrary in quotes because this of course must not conflict with
>> GPAs already or possibly in use by Dom0.) And I can only stress
>> again that you shouldn't leave out PVH (where the IOMMU already
>> isn't set up with all 1:1 mappings) from these considerations.
>>
> 
> It's interesting that you think the IOMMU can be used in such a situation.
> 
> What do you mean by "arbitrary" GPA here? And it's not just about
> conflicting with Dom0's GPAs; it's about conflicts among all VMs' GPAs
> when you host them through one IOMMU page table, and there's no way
> to prevent that, since the GPAs are picked by the VMs themselves.
> 
> I don't think we can support PVH here if the IOMMU is not a 1:1 mapping.
> 

I agree with Jan: there doesn't need to be a fixed 1:1 mapping between
IOMMU addresses and MFNs.

I think all that's required is an IOMMU mapping, for the GPU device
assigned to dom0 (or a driver domain), which allows guest memory to be
accessed by the GPU. This IOMMU address is what gets programmed into the
shadow GPU page table; I refer to this address as a Bus Frame Number
(BFN) in the PV-IOMMU design document.

- shadow GPU page table: GMA->BFN
- IOMMU: BFN->MPA
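
To make the chain concrete, here is a minimal toy sketch of the
two-stage lookup. Every name in it (the table arrays, resolve_gpu_dma)
is invented purely for illustration and is not an existing Xen or
driver interface:

    #include <stdint.h>

    #define PAGE_SHIFT  12
    #define TOY_ENTRIES 16

    typedef uint64_t bfn_t;   /* bus frame number: the address the GPU emits */
    typedef uint64_t mfn_t;   /* machine frame number */

    /* Toy stand-ins for the two tables (real ones are multi-level):
     * shadow_gpu_pt is built by the dom0/driver-domain GPU driver,
     * iommu_pt is owned and populated by Xen. */
    static bfn_t shadow_gpu_pt[TOY_ENTRIES];   /* GMA page -> BFN */
    static mfn_t iommu_pt[TOY_ENTRIES];        /* BFN -> MFN */

    /* A GPU access to a graphics memory address resolves in two stages. */
    static mfn_t resolve_gpu_dma(uint64_t gma)
    {
        bfn_t bfn = shadow_gpu_pt[(gma >> PAGE_SHIFT) % TOY_ENTRIES];
        return iommu_pt[bfn % TOY_ENTRIES];
    }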


IOMMUs can almost always address more than the host physical RAM, so we
can create IOMMU mappings above the top of host physical RAM in order to
map guest RAM without disturbing dom0's existing 1:1 mappings.
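
As a rough illustration of that allocation policy (the RAM size and the
allocator below are invented for the example; a real implementation
would query the host RAM boundary from Xen and track frees):

    #include <stdint.h>

    typedef uint64_t bfn_t;

    /* Hand out BFNs starting above the highest host MFN, so guest
     * mappings can never collide with dom0's 1:1 (BFN == MFN) range.
     * 32GiB of host RAM at 4KiB pages assumed purely for the example. */
    #define HOST_MAX_MFN 0x800000ULL

    static bfn_t next_guest_bfn = HOST_MAX_MFN + 1;

    static bfn_t alloc_guest_bfn(void)
    {
        return next_guest_bfn++;   /* no reuse/free handling in this sketch */
    }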

The PV-IOMMU design allows the guest to control the IOMMU address
space. In theory it could be extended to add permission checks for
mapping guest MFNs, and to provide a mapping interface which takes a
domid and a GMFN. That way the driver domain does not need to know the
actual MFNs being used.
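
A very rough sketch of what such an extended map operation could look
like follows; the struct name, fields and semantics are hypothetical
and are not taken from the actual PV-IOMMU design document:

    #include <stdint.h>

    typedef uint16_t domid_t;

    /* Hypothetical "map a foreign guest frame" operation: the driver
     * domain names the target guest and a GMFN, never a raw MFN.  Xen
     * would do the permission check, translate the GMFN through the
     * guest's P2M and install the BFN -> MFN entry in the driver
     * domain's IOMMU context. */
    struct iommuop_map_foreign_frame {
        domid_t  domid;    /* IN:  guest whose memory is to be mapped    */
        uint64_t gmfn;     /* IN:  guest frame number within that guest  */
        uint64_t bfn;      /* IN:  bus frame number to map it at         */
        uint32_t flags;    /* IN:  e.g. read/write access for the device */
        int32_t  status;   /* OUT: per-op error code                     */
    };

With an interface of this shape, the shadow GPU page table code in the
driver domain only ever deals in GMFNs and BFNs, which matches the
GMA->BFN / BFN->MPA split above.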

The guest itself (CPU) accesses the GPU via outbound MMIO mappings, so
we don't need to be concerned with address translation in that
direction.

I think getting Xen to allocate IOMMU mappings for a driver domain will
be problematic for PV-based driver domains, because the M2P for PV
domains is not kept strictly up to date with what the guest is using for
its P2M, and so it will be difficult or impossible to determine which
addresses are not in use.

Similarly, it may be difficult for HVM guests, because P2M mappings are
outbound (CPU to the rest of the host), and determining which addresses
are suitable for inbound access (rest of the host to memory) may be
difficult, i.e. should the outbound MMIO address space be reused for
inbound IOMMU mappings?

I hope I've not caused more confusion.

Malcolm

> Thanks
> Kevin
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

