[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 07/11] vpci/bars: add handlers to map the BARs



>>> On 15.03.18 at 12:33, <roger.pau@xxxxxxxxxx> wrote:
> On Wed, Mar 14, 2018 at 10:13:16AM -0600, Jan Beulich wrote:
>> >>> On 14.03.18 at 15:04, <roger.pau@xxxxxxxxxx> wrote:
>> > +static void modify_decoding(const struct pci_dev *pdev, bool map, bool 
>> > rom_only)
>> > +{
>> > +    struct vpci_header *header = &pdev->vpci->header;
>> > +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
>> > +    unsigned int i;
>> > +
>> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>> > +    {
>> > +        if ( rom_only && header->bars[i].type == VPCI_BAR_ROM )
>> > +        {
>> > +            unsigned int rom_pos = (i == PCI_HEADER_NORMAL_NR_BARS)
>> > +                                   ? PCI_ROM_ADDRESS : PCI_ROM_ADDRESS1;
>> > +            uint32_t val = pci_conf_read32(pdev->seg, pdev->bus, slot, 
>> > func,
>> > +                                           rom_pos);
>> > +
>> > +            header->bars[i].enabled = header->rom_enabled = map;
>> > +
>> > +            val &= ~PCI_ROM_ADDRESS_ENABLE;
>> > +            val |= map ? PCI_ROM_ADDRESS_ENABLE : 0;
>> > +            pci_conf_write32(pdev->seg, pdev->bus, slot, func, rom_pos, 
>> > val);
>> > +            break;
>> > +        }
>> > +        if ( !rom_only && (header->bars[i].type != VPCI_BAR_ROM ||
>> > +                           header->rom_enabled) )
>> > +            header->bars[i].enabled = map;
>> 
>> While this second if() has benefited from the rename to "rom_only",
>> I'm now wondering about the validity of the first if(): Why would
>> this need doing only in the "ROM only" case, but not in the
>> "everything" one? Or is the parameter still suffering from its name
>> being misleading? This also needs to be viewed in context of the
>> call here from vpci_process_pending(), which passes (dropping
>> the conditional there) v->vpci.rom, which doesn't exactly mean
>> "ROM only".
> 
> Sorry, I should have changed v->vpci.rom to v->vpci.rom_only.
> 
>> If there's really no name for the parameter that can properly
>> convey its meaning, please attach a clarifying comment. (Having
>> reached the end of the patch I now seem to understand / recall
>> that this is for the case where the ROM BAR's enable bit changes.
>> That's what a comment could usefully say here.)
> 
> I will add:
> 
> /*
>  * The rom_only parameter is used to signal the map/unmap helpers that
>  * the ROM BAR's enable bit has changed with the memory decoding bit
>  * already enabled. If rom_only is not set then it's the memory
>  * decoding bit the one that changed.
>  */

Thanks, but please with the last "the one" dropped, or the last
sentence otherwise corrected.

>> > +static int maybe_defer_map(struct domain *d, struct pci_dev *pdev,
>> > +                           struct rangeset *mem, bool map, bool rom)
>> > +{
>> > +    struct vcpu *curr = current;
>> > +    int rc;
>> > +
>> > +    if ( is_idle_vcpu(curr) )
>> > +    {
>> > +        struct map_data data = { .d = d, .map = true };
>> > +
>> > +        /*
>> > +         * Dom0 building runs on the idle vCPU, in which case it's not 
>> > possible
>> > +         * to defer the operation (like done in the else branch). Call
>> > +         * rangeset_consume_ranges in order to establish the mappings 
>> > right
>> > +         * away.
>> > +         */
>> 
>> For one I think this comment belongs above the if(), as that's what
>> it explains, not the ASSERT() that follows. And then it clarifies only
>> half of what needs clarifying: Why can't we make it here on an idle
>> vCPU outside of Dom0 building (e.g. through a tasklet), or if we can,
>> why is the given behavior the intended one?
> 
> Since this seems to be causing confusion, what about using:
> 
> system_state != SYS_STATE_active
> 
> Instead of checking if running on the idle vpcu. Do you think that
> would make it clearer?

Yes, I think so. That would then make clear that if you moved the
conditional into the only caller and split the function, the Dom0 one
could even become __init.

>> > +        ASSERT(map && !rom);
>> 
>> I can see why you assume it's not an un-mapping request (albeit
>> I wonder whether you couldn't instead simply set .map above to
>> the incoming value), but why the !rom part?
> 
> This branch will only be used at Dom0 build time, when none of the
> BARs are mapped into the p2m, so asking for an unmap in this case
> would be wrong.

Yes, that's the part I've said I understand, yet was saying that the
code wouldn't become incorrect if you set .map to map. You don't
explain the !rom part at all, though.

>> > +     */
>> > +
>> > +    /*
>> > +     * First fill the rangeset with all the BARs of this device or with 
>> > the ROM
>> > +     * BAR only, depending on whether the guest is toggling the memory 
>> > decode
>> > +     * bit of the command register, or the enable bit of the ROM BAR 
>> > register.
>> > +     */
>> > +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>> > +    {
>> > +        const struct vpci_bar *bar = &header->bars[i];
>> > +
>> > +        if ( !MAPPABLE_BAR(bar) ||
>> > +             (rom_only ? bar->type != VPCI_BAR_ROM
>> > +                       : (bar->type == VPCI_BAR_ROM && 
>> > !header->rom_enabled)) )
>> > +            continue;
>> > +
>> > +        rc = rangeset_add_range(mem, PFN_DOWN(bar->addr),
>> > +                                PFN_UP(bar->addr + bar->size - 1));
>> > +        if ( rc )
>> > +        {
>> > +            printk(XENLOG_G_WARNING
>> > +                   "Failed to add [%" PRI_gfn ", %" PRI_gfn "]: %d\n",
>> > +                   PFN_DOWN(bar->addr), PFN_UP(bar->addr + bar->size - 1),
>> 
>> I thought we had agreed that the parenthesization of tuples
>> like this one should match meaning they want to convey. I'm
>> having a hard time to see how PFN_UP() could ever go together
>> with a closing square bracket.
> 
> There's a -1 in the PFN_UP, and it's exactly what we are adding to the
> rangeset. Ie: rangeset_add_range(..., e, e) is adding the range [e,
> e], not [e, e).

Oh, I didn't spot that - it's even worse then afaict, because you
then also have the insertion wrong. Just consider a BAR covering
up to a page and starting at a page boundary: You'd insert two
pages when you mean just one.

>> > +static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
>> > +                      uint32_t cmd, void *data)
>> > +{
>> > +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
>> > +    uint16_t current_cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, 
>> > func,
>> > +                                           reg);
>> > +
>> > +    /*
>> > +     * Let Dom0 play with all the bits directly except for the memory
>> > +     * decoding one.
>> > +     */
>> > +    if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY )
>> > +        /*
>> > +         * Ignore the error. No memory has been added or removed from the 
>> > p2m,
>> > +         * and the memory decoding has not been changed, so leave 
>> > everything
>> > +         * as-is, hoping the guest will realize and try again.
>> > +         */
>> > +        modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false);
>> 
>> So that comment appears to be correct, but I wonder if the reader
>> could get a little more assistance, as it's not exactly obvious why no
>> p2m changes would have occurred in case of failure: modify_bars()
>> produces all its errors before doing any mapping, and
>> maybe_defer_map() takes the "else" branch which doesn't cause
>> any (direct) errors. Same for the similar comment in rom_write().
> 
> Let me expand that a little bit then to give some more context:
> 
> /*
>  * Ignore the error. No memory has been added or removed from the p2m
>  * (because the actual p2m changes are deferred in maybe_defer_map)
>  * and the memory decoding bit has not been changed, so leave
>  * everything as-is, hoping the guest will realize and try again.
>  */

That's better, but with the suggested split of maybe_defer_map()
things would end up even less confusing (because the new text
you suggest still has an apparent [but not actual] conflict between
the "maybe" in the function name and what you say is happening).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.