
Re: [Xen-devel] [PATCH v5 5/8] sysctl: Add sysctl interface for querying PCI topology



>>> On 20.03.15 at 21:01, <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 03/20/2015 12:26 PM, Jan Beulich wrote:
>>>>> On 19.03.15 at 22:54, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> --- a/xen/common/sysctl.c
>>> +++ b/xen/common/sysctl.c
>>> @@ -399,6 +399,67 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>>>           break;
>>>   #endif
>>>   
>>> +#ifdef HAS_PCI
>>> +    case XEN_SYSCTL_pcitopoinfo:
>>> +    {
>>> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
>>> +
>>> +        if ( guest_handle_is_null(ti->devs) ||
>>> +             guest_handle_is_null(ti->nodes) ||
>>> +             (ti->first_dev > ti->num_devs) )
>>> +        {
>>> +            ret = -EINVAL;
>>> +            break;
>>> +        }
>>> +
>>> +        while ( ti->first_dev < ti->num_devs )
>>> +        {
>>> +            physdev_pci_device_t dev;
>>> +            uint32_t node;
>>> +            struct pci_dev *pdev;
>>> +
>>> +            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            spin_lock(&pcidevs_lock);
>>> +            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
>>> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
>>> +                node = XEN_INVALID_NODE_ID;
>> I really think the two cases folded here should be distinguishable
>> by the caller.
> 
> How about making the ti->devs array an IN/OUT argument and updating the
> entry with -1s (which I think is an invalid PCI device)? This would make
> the original device ID disappear, though, so callers would be expected
> to stash the array before making the call if they want to know which
> devices were not reported.

Sadly, all ones in physdev_pci_device_t could still be a valid device.
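
To illustrate the point, here is a minimal sketch that decodes an all-ones
entry. It assumes the seg/bus/devfn field widths of the public
physdev_pci_device structure (16, 8 and 8 bits) and the usual PCI devfn
encoding; struct sbdf, slot_of(), func_of() and format_sbdf() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror the field widths of struct physdev_pci_device. */
struct sbdf {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Usual PCI encoding: device number in bits 7:3, function in bits 2:0. */
static unsigned int slot_of(uint8_t devfn) { return (devfn >> 3) & 0x1f; }
static unsigned int func_of(uint8_t devfn) { return devfn & 0x7; }

/*
 * Format an address as ssss:bb:dd.f.
 * Returns the number of characters snprintf() would have written.
 */
static int format_sbdf(const struct sbdf *d, char *buf, size_t len)
{
    assert(d && buf);
    return snprintf(buf, len, "%04x:%02x:%02x.%u",
                    (unsigned int)d->seg, (unsigned int)d->bus,
                    slot_of(d->devfn), func_of(d->devfn));
}
```

With every field set to all ones, the result is segment 0xffff, bus 0xff,
device 31, function 7, which is still a well-formed PCI address. So -1 cannot
serve as an "invalid device" marker.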

> Alternatively, since node is a 32-bit value while nodeid_t is 8-bit, I can
> add another token that signifies an invalid device. The main problem
> with this approach is that logically we use the 'nodes' array for passing
> node IDs, not information about devices.

I realize that. I wonder whether passing in a bad device shouldn't
simply result in -ENODEV, perhaps with first_dev pointing at the
bad slot?
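
A rough sketch of how that might look, written as a standalone model rather
than as hypervisor code. Everything here is an assumption made for
illustration: lookup() stands in for pci_get_pdev(), NO_NODE and
INVALID_NODE_ID stand in for NUMA_NO_NODE and XEN_INVALID_NODE_ID, and plain
arrays replace the guest handles and locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE         0xffu   /* stand-in for NUMA_NO_NODE */
#define INVALID_NODE_ID (~0u)   /* stand-in for XEN_INVALID_NODE_ID */

struct dev_id { uint16_t seg; uint8_t bus, devfn; };
struct known_dev { struct dev_id id; uint8_t node; };

/* Hypothetical stand-in for pci_get_pdev(): linear search of a table. */
static const struct known_dev *lookup(const struct known_dev *tbl, size_t n,
                                      const struct dev_id *id)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].id.seg == id->seg && tbl[i].id.bus == id->bus &&
            tbl[i].id.devfn == id->devfn)
            return &tbl[i];
    return NULL;
}

/*
 * Fill nodes[] for devs[*first_dev .. num_devs-1].
 * Unknown device: return -ENODEV and leave *first_dev at the bad slot.
 * Known device without a node: report INVALID_NODE_ID and carry on.
 */
static int query_nodes(const struct known_dev *tbl, size_t ntbl,
                       const struct dev_id *devs, uint32_t *nodes,
                       uint32_t *first_dev, uint32_t num_devs)
{
    assert(tbl && devs && nodes && first_dev);
    while (*first_dev < num_devs) {
        const struct known_dev *kd = lookup(tbl, ntbl, &devs[*first_dev]);

        if (!kd)
            return -ENODEV;
        nodes[*first_dev] = (kd->node == NO_NODE) ? INVALID_NODE_ID : kd->node;
        ++*first_dev;
    }
    return 0;
}
```

The two cases then become distinguishable: a device with no node information
yields the invalid-node token in 'nodes', while a nonexistent device fails the
call and first_dev tells the caller which entry was at fault.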

>>> +            else
>>> +                node = pdev->node;
>>> +            spin_unlock(&pcidevs_lock);
>>> +
>>> +            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            ti->first_dev++;
>>> +
>>> +            if ( hypercall_preempt_check() )
>>> +                break;
>>> +        }
>>> +
>>> +        if ( !ret )
>>> +        {
>>> +            if ( __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            if ( ti->first_dev < ti->num_devs )
>>> +                ret = hypercall_create_continuation(__HYPERVISOR_sysctl,
>>> +                                                    "h", u_sysctl);
>> Considering this is a tools-only interface, enforcing a not too high
>> limit on num_devs would seem better than this not really clean
>> continuation mechanism. The (tool stack) caller(s) can be made to
>> iterate.
> 
> What's a reasonable limit per call? 100?

Commonly we use powers of two for these, even if not strictly
needed to be that way. Hence I'd suggest 64. But please be sure
not to make this implementation detail part of the ABI.
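
One possible shape for this, sketched as a standalone model. It assumes the
callee clips the request to a private limit and writes back how many entries
it handled, so the caller loops on reported progress and never needs to know
the limit. mock_query(), query_all() and the made-up node assignment are
hypothetical; only the value 64 comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Private to the callee on purpose; callers must not depend on it. */
#define INTERNAL_BATCH_LIMIT 64u

/*
 * Hypothetical stand-in for the hypercall: handles at most
 * INTERNAL_BATCH_LIMIT entries and writes the handled count back to *num.
 */
static int mock_query(const uint32_t *devs, uint32_t *nodes, uint32_t *num)
{
    assert(devs && nodes && num);
    if (*num > INTERNAL_BATCH_LIMIT)
        *num = INTERNAL_BATCH_LIMIT;
    for (uint32_t i = 0; i < *num; i++)
        nodes[i] = devs[i] % 4;     /* made-up node assignment */
    return 0;
}

/*
 * Caller-side loop: keep asking for the remainder until all entries are
 * done. Returns the number of calls made, or a negative value on error.
 */
static int query_all(const uint32_t *devs, uint32_t *nodes, uint32_t total)
{
    uint32_t done = 0;
    int calls = 0;

    while (done < total) {
        uint32_t n = total - done;
        int ret = mock_query(devs + done, nodes + done, &n);

        if (ret < 0)
            return ret;
        if (n == 0)
            return -1;              /* no progress: do not spin forever */
        done += n;
        calls++;
    }
    return calls;
}
```

Because the loop only relies on the count written back, the limit could later
be changed without affecting callers.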

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

