
Re: [Xen-devel] (v2) Design proposal for RMRR fix

On Thu, Jan 8, 2015 at 1:54 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
> Ideally, rather than detecting conflicts, hvmloader would just
> consume what libxc set up. Obviously that would require awareness
> in libxc of things it currently doesn't care about (like fitting PCI BARs
> into the MMIO hole, enlarging it as necessary). I admit that this may
> end up being difficult to implement.

Yes, the idea of moving all that logic into libxc just seems not very
nice; particularly as, if I remember correctly, the domain builder
code cannot access xenstore, since it would introduce a circular
dependency.  (I might be remembering that incorrectly.)

> Another alternative would be to
> have libxc only populate a limited part of RAM (for hvmloader to be
> loadable), and have hvmloader do the bulk of the populating.

Ah, that's an interesting idea.  It seems like it might make
development of domain-building features quite a bit more complicated
though.  Worth having a think about.

>> 3.3 Policies
>> ----
>> An intuitive thought is to fail immediately upon a conflict; however,
>> it is not flexible with regard to different requirements:
>> a) it's not appropriate to fail libxc domain builder just because of such
>> a conflict. We still want the guest to boot even w/o assigned device;
> I don't think that's right (and I believe this was discussed before):
> When device assignment fails, VM creation should fail too. It is the
> responsibility of the host admin in that case to remove some or all
> of the to be assigned devices from the guest config.

Yes; basically, if we get to domain build time and we haven't reserved
the RMRRs, then it's a bug in libxl (since it's apparently told Xen
one thing and libxc another thing).  Having libxc fail in that case is
a perfectly sensible thing to do.
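
To make that concrete, here is a rough sketch of the kind of check I
mean.  It is Python rather than anything resembling the real libxc code,
and every name in it (`check_rmrr_reserved`, the range tuples, the
addresses in the note below) is hypothetical:

```python
from typing import List, Tuple

# (start, end) in guest-physical address space, end exclusive.
Range = Tuple[int, int]


def overlaps(a: Range, b: Range) -> bool:
    """Return True if the two half-open ranges share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


def check_rmrr_reserved(ram_ranges: List[Range],
                        rmrr_ranges: List[Range]) -> None:
    """Fail the (hypothetical) domain build if guest RAM covers any RMRR.

    The toolstack is assumed to have carved the RMRRs out of the RAM
    layout already, so any overlap found here is a toolstack bug.
    """
    conflicts = [(ram, rmrr)
                 for ram in ram_ranges
                 for rmrr in rmrr_ranges
                 if overlaps(ram, rmrr)]
    if conflicts:
        details = ", ".join(
            f"RAM {ram[0]:#x}-{ram[1]:#x} vs RMRR {rmrr[0]:#x}-{rmrr[1]:#x}"
            for ram, rmrr in conflicts)
        raise RuntimeError(f"RMRR not reserved in guest layout: {details}")
```

With made-up addresses, `check_rmrr_reserved([(0x0, 0xA0000)],
[(0xAB000000, 0xAF800000)])` returns quietly, while a RAM range that
covers any part of the RMRR raises and the build is abandoned.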

>> We propose report-all as the simple solution (different from the last sent
>> version, which used report-sel), given the facts below:
>>   - 'warn' policy in user space makes report-all not harmful
>>   - 'report-all' still means only a few entries in reality:
>>     * RMRR reserved regions should be avoided or limited by platform
>> designers, per VT-d specification;
>>     * RMRR reserved regions are only a few on real platforms, per our
>> current observations;
> Few yes, but in the IGD example you gave the region is quite large,
> and it would be fairly odd to have all guests have a strange, large
> hole in their address spaces. Furthermore remember that these
> holes vary from machine to machine, so a migrateable guest would
> needlessly end up having a hole potentially not helping subsequent
> hotplug at all.

Yes, I think that by default VMs should have no RMRRs set up on domain
creation.  The only way to get RMRRs in your address space should be
to opt-in at domain creation time (either by statically assigning
devices, or by requesting your memory layout to mirror the host's).
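
Roughly, the selection I have in mind looks like the sketch below.
Again this is only illustrative Python: `rmrrs_for_guest`, the
`mirror_host` flag and the device-to-ranges dictionary are all names I
am making up here, not an existing interface.

```python
from typing import Dict, Iterable, List, Tuple

# (start, end) in host-physical address space, end exclusive.
Range = Tuple[int, int]


def rmrrs_for_guest(host_rmrrs: Dict[str, List[Range]],
                    assigned_devices: Iterable[str],
                    mirror_host: bool = False) -> List[Range]:
    """Pick the RMRR ranges to reserve in a guest at creation time.

    host_rmrrs maps a device identifier (e.g. a BDF string) to the RMRR
    ranges the host reports for it.  By default nothing is reserved; a
    guest only gets RMRRs for statically assigned devices, or all of
    them if it asked to mirror the host layout.
    """
    if mirror_host:
        selected = [r for ranges in host_rmrrs.values() for r in ranges]
    else:
        selected = [r
                    for dev in assigned_devices
                    for r in host_rmrrs.get(dev, [])]
    # De-duplicate and sort so the result does not depend on input order.
    return sorted(set(selected))
```

A guest with no assigned devices and no `mirror_host` request gets an
empty list, so a migrateable guest does not end up with a hole that only
made sense on the machine it was created on.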

>> In this way, there are two situations libxc domain builder may request
>> to query reserved region information w/ same interface:
>> a) if any statically-assigned devices, and/or
>> b) if a new parameter is specified, asking for hotplug preparation
>>       ('rdm_check' or 'prepare_hotplug'?)
>> the 1st invocation of this interface will save all reported reserved
>> regions under domain structure, and later invocation (e.g. from
>> hvmloader) gets saved content.
> Why would the reserved regions need attaching to the domain
> structure? The combination of (to be) assigned devices and
> global RMRR list always allow reproducing the intended set of
> regions without any extra storage.

So when you say "(to be) assigned devices", you mean any device which
is currently assigned, *or may be assigned at some point in the

Do you think the extra storage for "this VM might possibly be assigned
this device at some point" wouldn't really be that much bigger than
"this VM might possibly map this RMRR at some point in the future"?

It seems a lot cleaner to me to have the toolstack tell Xen what
ranges are reserved for RMRR per VM, and then have Xen check again
when assigning a device to make sure that the RMRRs have already been

>> 4. Plan
>> =====================================================================
>> We're seeking an incremental way to split the above tasks into 2 stages,
>> and in each stage we move forward a step w/o causing regressions. Doing
>> so can benefit people who want to use device assignment early, and
>> also help new developers ramp up, toward a final sane solution.
>> 4.1 Stage-1: hypervisor hardening
>> ----
>>   [Tasks]
>>       1) Set up RMRR identity mapping in p2m layer with conflict
>> detection
>>       2) add a boot option for fail/warn policy
>>       3) remove USB hack
>>       4) Detect and fail device assignment w/ shared reserved regions
>>   [Enhancements]
>>       * fix [Issue-1] and [Issue-3]
> According to what you wrote earlier, [Issue-3] is not intended to be
> fixed, but instead devices sharing the same RMRR(s) are to be
> declared unassignable.

Yes -- fix the security hole by forbidding the situation which causes
it to happen.
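
As a sketch of what "forbidding the situation" could amount to, the
following Python finds devices that share an RMRR so that assignment can
be refused for them.  The function name and the data layout are
hypothetical, and for simplicity it only treats identical ranges as
shared; a real check would presumably need to catch partial overlaps
as well.

```python
from typing import Dict, List, Set, Tuple

# (start, end) in host-physical address space, end exclusive.
Range = Tuple[int, int]


def devices_sharing_rmrr(host_rmrrs: Dict[str, List[Range]]) -> Set[str]:
    """Return the devices whose RMRR is also listed for another device.

    host_rmrrs maps a device identifier to its reported RMRR ranges.
    Any device in the returned set would be declared unassignable.
    """
    owners: Dict[Range, Set[str]] = {}
    for dev, ranges in host_rmrrs.items():
        for r in ranges:
            owners.setdefault(r, set()).add(dev)

    shared: Set[str] = set()
    for devs in owners.values():
        if len(devs) > 1:
            shared |= devs
    return shared
```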

