[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 14/14] xen/x86: setup PVHv2 Dom0 ACPI tables



On Mon, Dec 12, 2016 at 06:56:36AM -0700, Jan Beulich wrote:
> >>> On 30.11.16 at 17:49, <roger.pau@xxxxxxxxxx> wrote:
> > @@ -567,7 +573,7 @@ static __init void hvm_setup_e820(struct domain *d, 
> > unsigned long nr_pages)
> >      /*
> >       * Craft the e820 memory map for Dom0 based on the hardware e820 map.
> >       */
> > -    d->arch.e820 = xzalloc_array(struct e820entry, e820.nr_map);
> > +    d->arch.e820 = xzalloc_array(struct e820entry, E820MAX);
> 
> I have to admit that I consider this both wasteful and inflexible.
> While commonly you'll need far less entries, what if in fact you
> need more?

I've updated pvh_add_mem_range so that now it expands the memory map when
needed, in order to add new regions. This solves both the waste and the
inflexibility:

map = xzalloc_array(struct e820entry, d->arch.nr_e820 + 1);
if ( !map )
{
    printk(XENLOG_WARNING "E820: out of memory to add region\n");
    return -ENOMEM;
}

memcpy(map, d->arch.e820, i * sizeof(*d->arch.e820));
memcpy(map + i + 1, d->arch.e820 + i,
       (d->arch.nr_e820 - i) * sizeof(*d->arch.e820));
map[i].addr = s;
map[i].size = e - s;
map[i].type = type;
xfree(d->arch.e820);
d->arch.e820 = map;
d->arch.nr_e820++;

> > +static int __init hvm_setup_acpi_madt(struct domain *d, paddr_t *addr)
> > +{
> > +    struct acpi_table_madt *madt;
> > +    struct acpi_table_header *table;
> > +    struct acpi_madt_io_apic *io_apic;
> > +    struct acpi_madt_local_apic *local_apic;
> 
> I think I had asked about the absence of struct acpi_madt_local_x2apic
> here before, but now that I look again I also wonder how you get away
> without struct acpi_madt_nmi_source and struct acpi_madt_local_apic_nmi;

Since we are currently limited to 128 vCPUs (max APIC ID 255), I don't think
acpi_madt_local_x2apic is required in this scenario:

https://lists.xen.org/archives/html/xen-devel/2016-11/msg02815.html

I will work on adding those entries once that limit is lifted, would you be
fine with me adding a comment here regarding the no-need of
acpi_madt_local_x2apic until support for > 128vCPUs is added?

Regarding the local/IO APIC NMI structures, won't those NMI's be delivered to
Xen, or are those then re-injected to the guest?

> I can kind of see struct acpi_madt_local_apic_override not being

Since Xen is the one that sets the local APIC address in the MADT, and it
always matches the position of the emulated local APIC I don't see why we would
want to use acpi_madt_local_apic_override. I see that your worries are related
to what would happen if AML tries to modify the MADT, but wouldn't in that case
modify the native MADT, instead of the crafted one?

> needed here (and I'd really hope anyway that no BIOS actually makes
> use of it). The question mainly is what possible damage there may be
> if what Dom0 gets to see disagrees with what the firmware acts on
> (remember that AML code may alter MADT contents).
> 
> > +    acpi_status status;
> > +    unsigned long size;
> > +    unsigned int i;
> > +    int rc;
> > +
> > +    /* Count number of interrupt overrides in the MADT. */
> > +    acpi_table_parse_madt(ACPI_MADT_TYPE_INTERRUPT_OVERRIDE,
> > +                          acpi_count_intr_ovr, MAX_IRQ_SOURCES);
> 
> What's the reason for using MAX_IRQ_SOURCES here? There's no
> static array involved here limiting the supported count.

Right, this can be INT_MAX, or UINT_MAX if I change the type of
acpi_intr_overrrides.

> > +    madt->address = APIC_DEFAULT_PHYS_BASE;
> > +    /*
> > +     * NB: this is currently set to 4, which is the revision in the ACPI
> > +     * spec 6.1. Sadly ACPICA doesn't provide revision numbers for the
> > +     * tables described in the headers.
> > +     */
> > +    madt->header.revision = 4;
> 
> I can see you wanting to cap the revision here, but is it safe to
> possibly bump it from what firmware has?

Hm, we only use the headers and the interrupt overrides as-is from the original
table, the rest comes from the acpica structs. So I assume it's better to be
safe than sorry, and a min should be used here together with the original table
version.

> > +    /*
> > +     * Setup the IO APIC entry.
> > +     * FIXME: the current vIO-APIC code just supports one IO-APIC instance
> > +     * per domain. This must be fixed in order to provide the same amount 
> > of
> > +     * IO APICs as available on bare metal.
> > +     */
> > +    if ( nr_ioapics > 1 )
> > +        printk("WARNING: found %d IO APICs, Dom0 will only have access to 
> > 1 emulated IO APIC\n",
> > +               nr_ioapics);
> > +    io_apic = (struct acpi_madt_io_apic *)(madt + 1);
> > +    io_apic->header.type = ACPI_MADT_TYPE_IO_APIC;
> > +    io_apic->header.length = sizeof(*io_apic);
> > +    io_apic->id = 1;
> 
> Is it safe to use an ID other than what firmware did assign? Where
> is this 1 coming from? And how does this get sync-ed up with the
> respective struct hvm_hw_vioapic field?

Not sure why I've set this to 1, AFAICT the vioapic code sets this to 0 on
reset, so it should be 0 here (until we have proper support in order to
completely reproduce the native IO APIC topology).

[ ... fixed all the comments below ..]

Thanks for the detailed review!

Roger.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.