
Re: [Xen-devel] [RFC] Xen PV IOMMU interface draft B



On 17/06/15 13:48, Yu, Zhang wrote:
> Hi Malcolm,
> 
>   Thank you very much for accommodating our XenGT requirements in your
> design. Following are some XenGT related questions. :)
> 
> On 6/13/2015 12:43 AM, Malcolm Crossley wrote:
>> Hi All,
<snip>
>>
>> IOMMUOP_map_foreign_page
>> ----------------
>> This subop uses `struct map_foreign_page` part of the `struct pv_iommu_op`.
>>
>> It is not valid to use domid representing the calling domain.
>>
>> The hypercall will only succeed if calling domain has sufficient
>> privilege over the specified domid
>>
>> If there is no IOMMU support then the MFN is returned in the BFN field
>> (that is the only valid bus address for the GFN + domid combination).
>>
>> If there is IOMMU support then the specified BFN is returned for the
>> GFN + domid combination
>>
>> The M2B mechanism is a MFN to (BFN,domid,ioserver) tuple.
>>
>> Each successful subop will add to the M2B if there was not an existing
>> identical M2B entry.
>>
>> Every new M2B entry will take a reference to the MFN backing the GFN.
>>
>> All the following conditions are required to be true for PV IOMMU map_foreign
>> subop to succeed:
>>
>> 1. IOMMU detected and supported by Xen
>> 2. The domain has IOMMU controlled hardware allocated to it
>> 3. The domain is a hardware_domain and the following Xen IOMMU options are
>>     NOT enabled: dom0-passthrough
> What if the IOMMU is enabled and runs in the default mode, which 1:1 maps
> all memory except that owned by Xen?

Good question. A PV IOMMU aware guest will know the 1:1 map exists and can
use IOMMUOP_unmap_page to remove any mappings which will conflict with its
planned BFN mappings.

For a PV IOMMU unaware guest I think IOMMUOP_lookup_foreign_page should be
used instead. This will allow the IOSERVER to register interest in the
Domid + GFN it's using and allow ballooning to be used.


FYI, the 1:1 map on PV guests will be set up without taking a reference to
the MFN, otherwise unaware PV guests would be unable to create page tables.

>>
>>
>> This subop's usage of the `struct pv_iommu_op` and
>> `struct map_foreign_page` fields is detailed below:
>>
>> --------------------------------------------------------------------
>> Field          Purpose
>> -----          -----------------------------------------------------
>> `domid`        [in] The domain ID for which the gfn field applies
>>
>> `ioserver`     [in] IOREQ server id associated with mapping
>>
>> `bfn`          [in] Bus address frame number for gfn address
>>
>> `gfn`          [in] Guest address frame number
>>
>> `flags`        [in] Details the status of the BFN mapping
>>
>> `status`       [out] status of this subop, 0 indicates success
>> --------------------------------------------------------------------
>>
>> Defined bits for flags field:
>>
>> Name                         Bit                Definition
>> ----                        -----      ----------------------------------
>> IOMMUOP_readable              0        BFN IOMMU mapping is readable
>> IOMMUOP_writeable             1        BFN IOMMU mapping is writeable
>> IOMMUOP_swap_mfn              2        BFN IOMMU mapping can be safely
>>                                         swapped to scratch page
>> Reserved for future use      3-9       Reserved flag bits should be 0
>> IOMMU_page_order            10-15      Returns maximum possible page order
>>                                         for all other IOMMUOP subops
>>
>> Defined values for map_foreign_page subop status field:
>>
>> Error code  Reason
>> ----------  ------------------------------------------------------------
>> 0            subop successfully returned
>> -EIO         IOMMU unit returned error when attempting to map BFN to GFN.
>> -EPERM       Calling domain does not have sufficient privilege over domid
>> -EPERM       GFN could not be mapped because the GFN belongs to Xen.
>> -EPERM       domid maps to DOMID_SELF
>> -EACCES      BFN address conflicts with RMRR regions for devices attached to
>>               DOMID_SELF
>> -ENODEV      Provided ioserver id is not valid
>> -ENXIO       Provided domid id is not valid
>> -ENXIO       Provided GFN address is not valid
>> -ENOSPC      Page order is too large for either BFN, GFN or IOMMU unit
>>
>> IOMMU_lookup_foreign_page
>> ----------------
>> This subop uses `struct lookup_foreign_page` part of the
>> `struct pv_iommu_op`.
>>
>> If the BFN is specified as an input parameter and there is no IOMMU
>> support for the calling domain then an error will be returned.
>>
>> It is the calling domain's responsibility to ensure there are no conflicts.
>>
>> The hypercall will only succeed if calling domain has sufficient
>> privilege over the specified domid
>>
>> If there is no IOMMU support then the MFN is returned in the BFN field
>> (that is the only valid bus address for the GFN + domid combination).
> Similarly, what if the IOMMU is enabled and runs in the default mode,
> which 1:1 maps all memory except that owned by Xen? Will an MFN be
> returned? Or should we use the query/map ops instead of the lookup op in
> this situation?

The lookup will return the BFN which is 1:1 mapped to the MFN.

Only the hardware domain will have precreated BFN mappings of other domains'
memory.

So the logic could look like this:

If the caller is dom0, the lookup uses the P2M to get the MFN and then uses
the M2B to look up the BFN. If that fails, check whether the BFN is mapped
1:1 to the MFN: if so, return the BFN, else return -ENOENT.
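
A rough sketch of that lookup order in C, in case it helps. The tables here
are toy arrays and every name (`toy_state`, `lookup_foreign_bfn`, and so on)
is hypothetical, not Xen's real P2M/M2B code. Returning -ENXIO for a GFN with
no P2M entry is my assumption, borrowed from the map_foreign_page status
table; only the hardware domain path is covered.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t frame_t;

/* Hypothetical toy tables; Xen's real P2M, M2B and IOMMU state differ. */
struct p2m_entry { uint16_t domid; frame_t gfn, mfn; };
struct m2b_entry { frame_t mfn, bfn; uint16_t domid; };

struct toy_state {
    const struct p2m_entry *p2m;  size_t nr_p2m;
    const struct m2b_entry *m2b;  size_t nr_m2b;
    const frame_t *identity;      size_t nr_identity; /* MFNs mapped BFN == MFN */
};

/* Lookup as seen from the hardware domain (dom0); returns 0 or -errno. */
int lookup_foreign_bfn(const struct toy_state *s, uint16_t domid,
                       frame_t gfn, frame_t *bfn)
{
    const struct p2m_entry *pe = NULL;
    size_t i;

    /* Step 1: the P2M translates (domid, GFN) to the backing MFN. */
    for (i = 0; i < s->nr_p2m; i++)
        if (s->p2m[i].domid == domid && s->p2m[i].gfn == gfn) {
            pe = &s->p2m[i];
            break;
        }
    if (pe == NULL)
        return -ENXIO; /* assumption: treated like an invalid GFN */

    /* Step 2: the M2B may already hold a BFN for this MFN and domid. */
    for (i = 0; i < s->nr_m2b; i++)
        if (s->m2b[i].mfn == pe->mfn && s->m2b[i].domid == domid) {
            *bfn = s->m2b[i].bfn;
            return 0;
        }

    /* Step 3: fall back to the 1:1 map, where BFN == MFN. */
    for (i = 0; i < s->nr_identity; i++)
        if (s->identity[i] == pe->mfn) {
            *bfn = pe->mfn;
            return 0;
        }

    return -ENOENT;
}
```

The M2B is checked before the 1:1 map so that a BFN an emulator has already
registered wins over the precreated identity mapping.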


>>
>> Each successful subop will add to the M2B if there was not an existing
>> identical M2B entry.
>>
>> Every new M2B entry will take a reference to the MFN backing the GFN.
>>
<snip>
>>
>> IOMMUOP_*_foreign_page interactions with guest domain ballooning
>> ================================================================
>>
>> Guest domains can balloon out a set of GFN mappings at any time and
>> render the BFN to GFN mapping invalid.
>>
>> When a BFN to GFN mapping becomes invalid, Xen will issue a buffered IO
>> request of type IOREQ_TYPE_INVALIDATE to the affected IOREQ servers with
>> the now invalid BFN address in the data field. If the buffered IO request
>> ring is full then a standard (synchronous) IO request of type
>> IOREQ_TYPE_INVALIDATE will be issued to the affected IOREQ server with the
>> just invalidated BFN address in the data field.
>>
>> The BFN mappings cannot simply be unmapped at the point of the balloon
>> hypercall, otherwise a malicious guest could deliberately balloon out a
>> GFN address in use by an emulator and trigger IOMMU faults for the domains
>> with BFN mappings.
>>
>> For hosts with no IOMMU support: The affected emulator(s) must specifically
>> issue an IOMMUOP_unmap_foreign_page subop for the now invalid BFN address
>> so that the references to the underlying MFN are removed and the MFN can
>> be freed back to the Xen memory allocator.
> I do not quite understand this. With no IOMMU support, these BFNs are
> supplied by hypervisor. So why not let hypervisor do this unmap and
> notify the calling domain?

We need the emulators to do the unmap so that they can ensure that hardware
is not actively using the BFN (same as the MFN in this case). Otherwise Xen
may allocate that MFN to another guest and that guest will have its memory
corrupted.

Another way to think about it is that a malicious guest could set up a long
running DMA to its RAM and then deliberately balloon out that RAM whilst the
DMA is running. The only way to secure that scenario is to not let the
ballooned out RAM be reused until the emulator confirms it's safe to do so.

The IOMMUOP_swap_mfn optimisation has been added to allow Xen to drop
references safely. Unfortunately it requires the IOMMU to be enabled.
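
To illustrate the reference counting argument, here is a minimal sketch in
C. It is a toy model with hypothetical names (`toy_page`, `m2b_add`,
`guest_balloon_out`, `emulator_unmap`), not Xen's page accounting, and it
does not model the IOMMUOP_swap_mfn path. The point is only that the MFN
goes back to the allocator when the last reference is dropped, and the
M2B entry holds one of those references until the emulator unmaps.

```c
#include <stdbool.h>

/* Toy page: one reference for the owning guest plus one per M2B entry. */
struct toy_page {
    unsigned int refs;
    bool freed; /* true once the MFN is back with the allocator */
};

static void put_ref(struct toy_page *pg)
{
    if (pg->refs > 0 && --pg->refs == 0)
        pg->freed = true;
}

/* A new M2B entry takes a reference to the MFN backing the GFN. */
void m2b_add(struct toy_page *pg)
{
    pg->refs++;
}

/*
 * The guest balloons the page out: only the guest's own reference is
 * dropped here. In the draft, Xen would then send IOREQ_TYPE_INVALIDATE
 * to the affected IOREQ servers.
 */
void guest_balloon_out(struct toy_page *pg)
{
    put_ref(pg);
}

/* The emulator has quiesced its DMA and issues the unmap subop. */
void emulator_unmap(struct toy_page *pg)
{
    put_ref(pg);
}
```

With one M2B entry in place, a balloon out alone leaves the page allocated;
it is only freed after the emulator's unmap, which is the ordering that
keeps in-flight DMA from landing in another guest's memory.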

>>
<snip>
>> Emulator usage of PV IOMMU interface
>> ====================================
>>
>> Emulators which require bus address mapping of guest RAM must first
>> determine if it's possible for the domain to control the bus addresses
>> themselves.
>>
>> An IOMMUOP_query_caps subop will return the IOMMU_QUERY_map_cap flag. If
>> this flag is set then the emulator may specify the BFN address it wishes
>> guest RAM to be mapped to via the IOMMUOP_map_foreign_page subop.  If the
>> flag is not set then the emulator must use BFN addresses supplied by Xen
>> via the IOMMUOP_lookup_foreign_page subop.
>>
>> Operating systems which use the IOMMUOP_map_page subop are expected to
>> provide a common interface for emulators
> 
> According to our previous internal discussions, my understanding about
> the usage is this:
> 1> PV IOMMU has an interface in dom0's kernel to do the query/map/lookup
> all at once, which also includes the BFN allocation algorithm.
> 2> When the XenGT emulator tries to construct a shadow PTE, we can just
> call your interface, which returns a BFN in either case.
> 
> However, the above description seems to say the XenGT device model needs
> to do the query/lookup/map by itself?

The above description is there to cover emulators which may run in their own
domain (stub domain).

> Besides, could you please give a more detailed information about this
> 'common interface'? :)

I will try to include more details in the next draft.

My current thinking is to reuse the "struct pv_iommu_op" array of ops and
just implement a common function for requesting a BFN mapping. The common
function will fill in the subop field for the caller.
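
Something along these lines, as a sketch only. The struct layout, the subop
values and the function name below are all placeholders I made up for
illustration (the real `struct pv_iommu_op` layout is not settled here);
only the field names and flag bits 0 and 1 come from the draft. I'm also
assuming a zero BFN can stand for "let Xen supply it" on the lookup path.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Placeholder subop numbers; the draft does not fix these values. */
enum toy_subop {
    TOY_SUBOP_MAP_FOREIGN_PAGE = 1,
    TOY_SUBOP_LOOKUP_FOREIGN_PAGE = 2,
};

#define TOY_FLAG_READABLE  (1u << 0) /* IOMMUOP_readable, bit 0 */
#define TOY_FLAG_WRITEABLE (1u << 1) /* IOMMUOP_writeable, bit 1 */

/* Placeholder layout carrying the fields listed in the draft's table. */
struct toy_pv_iommu_op {
    uint16_t subop;
    uint16_t domid;
    uint16_t ioserver;
    uint32_t flags;
    uint64_t bfn;
    uint64_t gfn;
    int32_t status;
};

/*
 * Common helper: the caller only says which guest frame it wants a bus
 * address for; the helper picks the subop from the query_caps result.
 */
void toy_prepare_bfn_request(struct toy_pv_iommu_op *op, bool map_cap,
                             uint16_t domid, uint16_t ioserver,
                             uint64_t gfn, uint64_t wanted_bfn,
                             uint32_t flags)
{
    memset(op, 0, sizeof(*op));
    op->domid = domid;
    op->ioserver = ioserver;
    op->gfn = gfn;
    op->flags = flags;

    if (map_cap) {
        /* IOMMU_QUERY_map_cap set: the domain chooses the BFN itself. */
        op->subop = TOY_SUBOP_MAP_FOREIGN_PAGE;
        op->bfn = wanted_bfn;
    } else {
        /* Otherwise Xen supplies the BFN via the lookup subop. */
        op->subop = TOY_SUBOP_LOOKUP_FOREIGN_PAGE;
        op->bfn = 0; /* assumption: filled in by Xen on return */
    }
}
```

The emulator would then pass the prepared op to the hypercall and read the
BFN and status back out, without needing to know which subop was used.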

Thanks for your feedback and please trim your replies as Jan suggested. It
makes it much easier to find and reply to your inline comments.

> 
> Thanks
> Yu
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel
>>