[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 07/11] vpci/bars: add handlers to map the BARs



>>> On 26.02.18 at 19:00, <roger.pau@xxxxxxxxxx> wrote:
> On Mon, Feb 26, 2018 at 04:20:17AM -0700, Jan Beulich wrote:
>> >>> On 23.01.18 at 16:07, <roger.pau@xxxxxxxxxx> wrote:
>> > +bool vpci_process_pending(struct vcpu *v)
>> > +{
>> > +    while ( v->vpci.mem )
>> > +    {
>> > +        struct map_data data = {
>> > +            .d = v->domain,
>> > +            .map = v->vpci.map,
>> > +        };
>> > +
>> > +        switch ( rangeset_consume_ranges(v->vpci.mem, map_range, &data) )
>> > +        {
>> > +        case -ERESTART:
>> > +            return true;
>> > +
>> > +        default:
>> > +            if ( v->vpci.map )
>> > +            {
>> > +                spin_lock(&v->vpci.pdev->vpci->lock);
>> > +                modify_decoding(v->vpci.pdev, v->vpci.map, v->vpci.rom);
>> > +                spin_unlock(&v->vpci.pdev->vpci->lock);
>> > +            }
>> > +            /* fallthrough. */
>> > +        case -ENOMEM:
>> 
>> You carefully handle this error here.
> 
> On second thought, I'm not sure handling ENOMEM separately makes
> sense. Unless you object I plan to remove this special casing.

Indeed I was never really happy with the special casing, but I
did recall this being done intentionally, so I also didn't complain.

>> > +static void maybe_defer_map(struct domain *d, const struct pci_dev *pdev,
>> > +                            struct rangeset *mem, bool map, bool rom)
>> > +{
>> > +    struct vcpu *curr = current;
>> > +
>> > +    if ( is_idle_vcpu(curr) )
>> > +    {
>> > +        struct map_data data = { .d = d, .map = true };
>> > +
>> > +        /*
>> > +         * Only used for domain construction in order to map the BARs
>> > +         * of devices with memory decoding enabled.
>> > +         */
>> > +        ASSERT(map && !rom);
>> > +        rangeset_consume_ranges(mem, map_range, &data);
>> 
>> What if this produces -ENOMEM? And despite having looked at
>> several revisions of this, I can't make the connection to why this
>> is in an is_idle_vcpu() conditional (neither the direct caller nor the
>> next level up make this obvious to me). There's clearly a need
>> for extending the comment.
> 
> I thought the comment above that mentions domain construction would be
> enough. I can try to expand this, maybe like:
> 
> "This function will only be called from the idle vCPU while building
> the domain, in which case it's not possible to defer the operation
> (like done in the else branch). Call rangeset_consume_ranges in order
> to establish the mappings right away."

And what again is the connection between is_idle_domain() and
domain construction? I think the comment belongs ahead of the
if(), and it needs to make that connection.

>> > +static void modify_bars(const struct pci_dev *pdev, bool map, bool rom)
>> > +{
>> > +    struct vpci_header *header = &pdev->vpci->header;
>> > +    struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>> > +    const struct pci_dev *tmp;
>> > +    unsigned int i;
>> > +    int rc;
>> > +
>> > +    if ( !map )
>> > +        modify_decoding(pdev, false, rom);
>> > +
>> > +    if ( !mem )
>> > +        return;
>> 
>> Similarly here - why is it okay (or what effect will it have) to return
>> here when we're out of memory, but the caller won't know?
> 
> The behaviour here depends on the change to the memory decoding bit:
> 
>  - Clearing: memory decoding on device will be disabled, BARs won't be
>    unmapped.
>  - Setting: no change to device memory decoding bit, BARs won't be
>    mapped.
> 
> Do you think this is suitable? IMO it's fine to disable the memory
> decoding bit on the device and leave the memory regions mapped.

As long as subsequent changes to the decoding bit can't leave
stale mappings. Plus there needs to be a comment to explain this.

>> > +    /*
>> > +     * Create a rangeset that represents the current device BARs memory 
>> > region
>> > +     * and compare it against all the currently active BAR memory 
>> > regions. If
>> > +     * an overlap is found, subtract it from the region to be
>> > +     * mapped/unmapped.
>> > +     *
>> > +     * NB: the rangeset uses inclusive frame numbers.
>> > +     */
>> > +
>> > +    /*
>> > +     * First fill the rangeset with all the BARs of this device or with 
>> > the ROM
>> > +     * BAR only, depending on whether the guest is toggling the memory 
>> > decode
>> > +     * bit of the command register, or the enable bit of the ROM BAR 
>> > register.
>> > +     */
>> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>> > +    {
>> > +        const struct vpci_bar *bar = &header->bars[i];
>> > +
>> > +        if ( !MAPPABLE_BAR(bar) ||
>> > +             (rom ? bar->type != VPCI_BAR_ROM
>> > +                  : (bar->type == VPCI_BAR_ROM && !header->rom_enabled)) )
>> > +            continue;
>> 
>> Why does rom_enabled matter for the !rom case rather than for
>> the rom one? I.e.
>> 
>>         if ( !MAPPABLE_BAR(bar) ||
>>              (rom ? bar->type != VPCI_BAR_ROM || !header->rom_enabled
>>                   : bar->type == VPCI_BAR_ROM) )
>>             continue;
>> 
>> ?
> 
> No, for the ROM case we only want to map/unmap the ROM, so that's the
> only thing added to the rangeset. For the non-ROM case Xen will also
> map/unmap the ROM if the enable bit is set.
> 
> Your proposed code would always map/unmap the ROM into the p2m when
> the memory decoding bit is changed even if it's not enabled.

I don't understand. Taking apart the conditional I've suggested,
and converting to human language:
- if the BAR is no mappable, continue
- if we want to deal with ROM (rom=true), if the BAR isn't ROM
  or isn't enabled, continue
- if we want to deal with non-ROM (rom=false), if the BAR is ROM,
  continue

To me this is in line with the 2nd paragraph of your reply. It's not
in line with the first, which makes me wonder whether "rom" is
misnamed and wants to be "rom_only". Still, the question would
remain of why rom_enabled doesn't matter when the variable is
true.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.