[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 18/30] xen/x86: setup PVHv2 Dom0 ACPI tables



>>> On 27.09.16 at 17:57, <roger.pau@xxxxxxxxxx> wrote:
> FWIW, I think that the current approach that I've used in order to craft the
> MADT is not the best one, IMHO it would be better to place the MADT at the
> end of the E820_ACPI region (expanding it's size one page), and modify the
> XSDT/RSDT in order to point to it, that way we avoid shadowing any other
> ACPI data that might be at the same page as the native MADT (and that needs
> to be modified by Dom0).

I agree with the idea of placing MADT elsewhere, but I don't think you
can rely on E820_ACPI to have room to grow into right after its end.

> @@ -50,6 +53,8 @@ static long __initdata dom0_max_nrpages = LONG_MAX;
>  #define HVM_VM86_TSS_SIZE   128
>  
>  static unsigned int __initdata hvm_mem_stats[MAX_ORDER + 1];
> +static unsigned int __initdata acpi_intr_overrrides = 0;
> +static struct acpi_madt_interrupt_override __initdata *intsrcovr = NULL;

Pointless initializers.

> +static void __init acpi_zap_table_signature(char *name)
> +{
> +    struct acpi_table_header *table;
> +    acpi_status status;
> +    union {
> +        char str[ACPI_NAME_SIZE];
> +        uint32_t bits;
> +    } signature;
> +    char tmp;
> +    int i;
> +
> +    status = acpi_get_table(name, 0, &table);
> +    if ( ACPI_SUCCESS(status) )
> +    {
> +        memcpy(&signature.str[0], &table->signature[0], ACPI_NAME_SIZE);
> +        for ( i = 0; i < ACPI_NAME_SIZE / 2; i++ )
> +        {
> +            tmp = signature.str[ACPI_NAME_SIZE - i - 1];
> +            signature.str[ACPI_NAME_SIZE - i - 1] = signature.str[i];
> +            signature.str[i] = tmp;
> +        }
> +        write_atomic((uint32_t*)&table->signature[0], signature.bits);
> +    }
> +}

Together with moving MADT elsewhere we should also move
XSDT/RSDT, at which point we can simply avoid copying the
pointers for tables we don't want Dom0 to see (and down the
road we also have the option of adding tables).

The downside to both approaches is that this once again is a
black-listing model, i.e. new structure types we may need to
also suppress will remain visible to Dom0 until a patched
hypervisor would become available.

> +static int __init hvm_setup_acpi(struct domain *d)
> +{
> +    struct vcpu *saved_current, *v = d->vcpu[0];
> +    unsigned long pfn, nr_pages;
> +    uint64_t size, start_addr, end_addr;
> +    uint64_t madt_addr[2] = { 0, 0 };
> +    struct acpi_table_header *table;
> +    struct acpi_table_madt *madt;
> +    struct acpi_madt_io_apic *io_apic;
> +    struct acpi_madt_local_apic *local_apic;
> +    acpi_status status;
> +    int rc, i;
> +
> +    printk("** Setup ACPI tables **\n");
> +
> +    /* ZAP the HPET, SLIT, SRAT, MPST and PMTT tables. */
> +    acpi_zap_table_signature(ACPI_SIG_HPET);
> +    acpi_zap_table_signature(ACPI_SIG_SLIT);
> +    acpi_zap_table_signature(ACPI_SIG_SRAT);
> +    acpi_zap_table_signature(ACPI_SIG_MPST);
> +    acpi_zap_table_signature(ACPI_SIG_PMTT);
> +
> +    /* Map ACPI tables 1:1 */
> +    for ( i = 0; i < d->arch.nr_e820; i++ )
> +    {
> +        if ( d->arch.e820[i].type != E820_ACPI &&
> +             d->arch.e820[i].type != E820_NVS )
> +            continue;
> +
> +        pfn = PFN_DOWN(d->arch.e820[i].addr);
> +        nr_pages = DIV_ROUND_UP(d->arch.e820[i].size, PAGE_SIZE);
> +
> +        rc = modify_mmio_11(d, pfn, nr_pages, true);
> +        if ( rc )
> +        {
> +            printk(
> +                "Failed to map ACPI region %#lx - %#lx into Dom0 memory 
> map\n",

Please keep the format string on the printk line.

> +                   pfn, pfn + nr_pages);
> +            return rc;
> +        }
> +    }
> +    /*
> +     * Since most memory maps provided by hardware are wrong, make sure each
> +     * top-level table is properly mapped into Dom0.
> +     */

Please be more specific here - wrong in which way? Not all ACPI tables
living in one of the two E820 type regions? But see also below.

In any event you need to be more careful here: Mapping ordinary RAM
1:1 into Dom0 can't end well. Also, once working with a cloned XSDT you
won't need to cover tables here which have got hidden from Dom0.

> +    for( i = 0; i < acpi_gbl_root_table_list.count; i++ )
> +    {
> +        pfn = PFN_DOWN(acpi_gbl_root_table_list.tables[i].address);
> +        nr_pages = DIV_ROUND_UP(acpi_gbl_root_table_list.tables[i].length,
> +                                PAGE_SIZE);
> +        rc = modify_mmio_11(d, pfn, nr_pages, true);
> +        if ( rc )
> +        {
> +            printk(
> +                "Failed to map ACPI region %#lx - %#lx into Dom0 memory 
> map\n",
> +                   pfn, pfn + nr_pages);
> +            return rc;
> +        }
> +    }
> +
> +    /*
> +     * Special treatment for memory < 1MB:
> +     *  - Copy the data in e820 regions marked as RAM (BDA, EBDA...).

Copy? What if some of it needs to get modified to interact with
firmware? I think you'd be better off mapping everything Xen
doesn't use into Dom0, and only mapping fresh RAM pages
over regions Xen does use (namely the trampoline).

> +     *  - Map any reserved regions as 1:1.

Why would reserved regions need such treatment only when below
1Mb?

> +    acpi_get_table_phys(ACPI_SIG_MADT, 0, &madt_addr[0], &size);
> +    if ( !madt_addr[0] )
> +    {
> +        printk("Unable to find ACPI MADT table\n");
> +        return -EINVAL;
> +    }
> +    if ( size > PAGE_SIZE )
> +    {
> +        printk("MADT table is bigger than PAGE_SIZE, aborting\n");
> +        return -EINVAL;
> +    }

This restriction for sure needs to go away.

> +    acpi_get_table_phys(ACPI_SIG_MADT, 2, &madt_addr[1], &size);

Why 2 (and not 1) when the first invocation used 0? But this is not
going to be needed anyway with the alternative model.

> +    io_apic = (struct acpi_madt_io_apic *)(madt + 1);
> +    io_apic->header.type = ACPI_MADT_TYPE_IO_APIC;
> +    io_apic->header.length = sizeof(*io_apic);
> +    io_apic->id = 1;
> +    io_apic->address = VIOAPIC_DEFAULT_BASE_ADDRESS;

I think you need to make as many IO-APICs available to Dom0 as
there are hardware ones.

> +    if ( dom0_max_vcpus() > num_online_cpus() )
> +    {
> +        printk("CPU overcommit is not supported for Dom0\n");

???

> +        xfree(madt);

I don't think such cleanup is needed here - boot won't succeed
anyway.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.