
Re: [Xen-devel] [PATCH RFC v2] Add SUPPORT.md



On 09/11/2017 06:53 PM, Andrew Cooper wrote:
> On 11/09/17 18:01, George Dunlap wrote:
>> +### x86/PV
>> +
>> +    Status: Supported
>> +
>> +Traditional Xen Project PV guest
> 
> What's a "Xen Project" PV guest?  Just Xen here.
> 
> Also, a perhaps a statement of "No hardware requirements" ?

OK.
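
So something like this, then (picking up both suggestions):

    ### x86/PV

        Status: Supported

    Traditional Xen PV guest.  No hardware requirements.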

> 
>> +### x86/RAM
>> +
>> +    Limit, x86: 16TiB
>> +    Limit, ARM32: 16GiB
>> +    Limit, ARM64: 5TiB
>> +
>> +[XXX: Andy to suggest what this should say for x86]
> 
> The limit for x86 is either 16TiB or 123TiB, depending on
> CONFIG_BIGMEM.  CONFIG_BIGMEM is exposed via menuconfig without
> XEN_CONFIG_EXPERT, so falls into at least some kind of support statement.
> 
> As for practical limits, I don't think its reasonable to claim anything
> which we can't test.  What are the specs in the MA colo?

At the moment the "Limit" tag specifically says that it's theoretical
and may not work.

We could add another tag, "Limit-tested", or something like that.

Or, we could simply set Limit-security equal to the highest amount that
has been tested (either by osstest or by downstreams).

For simplicity's sake I'd go with the second one.

Shall I write an e-mail asking directly what maximum values have been
tested by the Xen Project (via osstest), Citrix, SuSE, and Oracle?
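
Concretely, the x86/RAM section might then end up looking something
like this (the Limit-security line is a placeholder until we have the
tested numbers):

    ### x86/RAM

        Limit, x86 (default): 16TiB
        Limit, x86 (CONFIG_BIGMEM): 123TiB
        Limit, ARM32: 16GiB
        Limit, ARM64: 5TiB
        Limit-security, x86: [highest amount tested]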

>> +
>> +## Limits/Guest
>> +
>> +### Virtual CPUs
>> +
>> +    Limit, x86 PV: 512
> 
> Where did this number come from?  The actual limit as enforced in Xen is
> 8192, and it has been like that for a very long time (i.e. the 3.x days)

Looks like Lars copied this from
https://wiki.xenproject.org/wiki/Xen_Project_Release_Features.  Not sure
where it came from before that.

> [root@fusebot ~]# python
> Python 2.7.5 (default, Nov 20 2015, 02:00:19)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from xen.lowlevel.xc import xc as XC
>>>> xc = XC()
>>>> xc.domain_create()
> 1
>>>> xc.domain_max_vcpus(1, 8192)
> 0
>>>> xc.domain_create()
> 2
>>>> xc.domain_max_vcpus(2, 8193)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> xen.lowlevel.xc.Error: (22, 'Invalid argument')
> 
> Trying to shut such a domain down however does tickle a host watchdog
> timeout as the for_each_vcpu() loops in domain_kill() are very long.

For now I'll set 'Limit' to 8192, and 'Limit-security' to 512.
Depending on what I get for the "test limit" survey I may adjust it
afterwards.
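
So for now that entry would read something like this (subject to
adjustment after the survey):

    ### Virtual CPUs

        Limit, x86 PV: 8192
        Limit-security, x86 PV: 512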

>> +    Limit, x86 HVM: 128
>> +    Limit, ARM32: 8
>> +    Limit, ARM64: 128
>> +
>> +[XXX Andrew Cooper: Do want to add "Limit-Security" here for some of these?]
> 
> 32 for each.  64 vcpu HVM guests can exert enough p2m lock pressure to
> trigger a 5 second host watchdog timeout.

Is that "32 for x86 PV and x86 HVM", or "32 for x86 HVM and ARM64"?  Or
something else?

>> +### Virtual RAM
>> +
>> +    Limit, x86 PV: >1TB
>> +    Limit, x86 HVM: 1TB
>> +    Limit, ARM32: 16GiB
>> +    Limit, ARM64: 1TB
> 
> There is no specific upper bound on the size of PV or HVM guests that I
> am aware of.  1.5TB HVM domains definitely work, because that's what we
> test and support in XenServer.

Are there limits for 32-bit guests?  There's some complicated limit
having to do with the m2p, right?

>> +
>> +### x86 PV/Event Channels
>> +
>> +    Limit: 131072
> 
> Why do we call out event channel limits but not grant table limits? 
> Also, why is this x86?  The 2l and fifo ABIs are arch agnostic, as far
> as I am aware.

Sure, but I'm pretty sure that ARM guests don't (perhaps cannot?) use PV
event channels.
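
For reference, my understanding of where the 131072 figure comes from
(the constants below are from memory, so worth double-checking against
the public headers):

    # Rough arithmetic for the event channel limits as I understand them.
    bits_per_word = 64                    # xen_ulong_t on a 64-bit guest
    two_level_limit = bits_per_word ** 2  # 4096 channels with the 2-level ABI
    fifo_limit = 1 << 17                  # 131072 channels with the FIFO ABI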

> 
>> +## High Availability and Fault Tolerance
>> +
>> +### Live Migration, Save & Restore
>> +
>> +    Status, x86: Supported
> 
> With caveats.  From docs/features/migration.pandoc

This would extend the meaning of "caveats" from "when it's not security
supported" to "when it doesn't work", which is probably the best thing
to do at the moment.

> * x86 HVM with nested-virt (no relevant information included in the stream)
[snip]
> Also, features such as vNUMA and nested virt (which are two I know for
> certain) have all state discarded on the source side, because they were
> never suitably plumbed in.

OK, I'll list these, as well as PCI pass-through.

(Actually, vNUMA doesn't seem to be on the list!)

And we should probably add a safety-catch to prevent a VM started with
any of these from being live-migrated.

In fact, if possible, that should be a whitelist: Any configuration that
isn't specifically known to work with migration should cause a migration
command to be refused.
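
Something like the following, written as illustrative Python rather
than actual toolstack code (none of these names exist in xl/libxl;
they're only meant to show the shape of the check):

    # Hypothetical sketch of a migration whitelist; the feature names and
    # the helper below are made up, not real xl/libxl interfaces.
    MIGRATION_SAFE_FEATURES = {"pv", "hvm"}

    def check_migration_allowed(configured_features):
        """Refuse migration unless every configured feature is known-safe."""
        unknown = set(configured_features) - MIGRATION_SAFE_FEATURES
        if unknown:
            raise RuntimeError("migration refused, no migration support for: "
                               + ", ".join(sorted(unknown)))

    # A guest configured with, say, nested virt would then be rejected:
    #   check_migration_allowed({"hvm", "nestedhvm"})  -> RuntimeError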

What about the following features?

 * Guest serial console
 * Crash kernels
 * Transcendent Memory
 * Alternative p2m
 * vMCE
 * vPMU
 * Intel Platform QoS
 * Remus
 * COLO
 * PV protocols: Keyboard, PVUSB, PVSCSI, PVTPM, 9pfs, pvcalls?
 * FlASK?
 * CPU / memory hotplug?

> * x86 HVM guest physmap operations (not reflected in logdirty bitmap)
> * x86 PV P2M structure changes (not noticed, stale mappings used) for
>   guests not using the linear p2m layout

I'm afraid this isn't really appropriate for a user-facing document.
Users don't directly do physmap operations, nor p2m structure changes.
We need to tell them specifically which features they can or cannot use.

> * x86 HVM with PoD pages (attempts to map cause PoD allocations)

This shouldn't be any more dangerous than a guest-side sweep, should
it?  You may waste a lot of time reclaiming zero pages, but it seems
like it should only be a relatively minor performance issue, not a
correctness issue.

The main "problem" (in terms of "surprising behavior") would be that on
the remote side any PoD pages will actually be allocated zero pages.  So
if your guest was booted with memmax=4096 and memory=2048, but your
balloon driver had only ballooned down to 3000 for some reason (and then
stopped), the remote side would want 3000 MiB (not 2048, as one might
expect).
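
To put rough numbers on that (values in MiB; this is my understanding
of the accounting, not something lifted from the code):

    # Illustration of the PoD behaviour described above.
    maxmem  = 4096   # static maximum for the guest
    memory  = 2048   # boot target, which the PoD cache is sized for
    current = 3000   # where the balloon driver actually stopped
    # After migration every populated-on-demand page has been sent as a
    # real (zeroed) page, so the receiving host has to allocate:
    remote_allocation = current   # 3000, not the 2048 one might expect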

> * x86 PV ballooning (P2M marked dirty, target frame not marked)

Er, this should probably be fixed.  What exactly is the problem here?

 -George
