|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2.1 8/7] x86/IOMMU: Use altcall, and __initconst_cf_clobber
On 22/02/2022 11:02, Andrew Cooper wrote:
> On 22/02/2022 10:54, Andrew Cooper wrote:
>> On 22/02/2022 09:29, Jan Beulich wrote:
>>> On 21.02.2022 19:03, Andrew Cooper wrote:
>>>> Most IOMMU hooks are already altcall for performance reasons. Convert the
>>>> rest of them so we can harden all the hooks in Control Flow Integrity
>>>> configurations. This necessitates the use of iommu_{v,}call() in debug
>>>> builds
>>>> too.
>>>>
>>>> Move the root iommu_ops from __read_mostly to __ro_after_init now that the
>>>> latter exists. There is no need for a forward declaration of vtd_ops any
>>>> more, meaning that __initconst_cf_clobber can be used for VTD and AMD.
>>> The connection between the forward declaration and the annotation addition
>>> isn't really clear to me.
>>>
>>>> --- a/xen/arch/x86/include/asm/iommu.h
>>>> +++ b/xen/arch/x86/include/asm/iommu.h
>>>> @@ -72,7 +72,6 @@ struct arch_iommu
>>>>
>>>> extern struct iommu_ops iommu_ops;
>>>>
>>>> -#ifdef NDEBUG
>>>> # include <asm/alternative.h>
>>>> # define iommu_call(ops, fn, args...) ({ \
>>>> (void)(ops); \
>>>> @@ -83,7 +82,6 @@ extern struct iommu_ops iommu_ops;
>>>> (void)(ops); \
>>>> alternative_vcall(iommu_ops.fn, ## args); \
>>>> })
>>>> -#endif
>>>>
>>>> static inline const struct iommu_ops *iommu_get_ops(void)
>>>> {
>>>> @@ -106,7 +104,7 @@ int iommu_setup_hpet_msi(struct msi_desc *);
>>>> static inline int iommu_adjust_irq_affinities(void)
>>>> {
>>>> return iommu_ops.adjust_irq_affinities
>>>> - ? iommu_ops.adjust_irq_affinities()
>>>> + ? iommu_call(iommu_ops, adjust_irq_affinities)
>>> While this (and other instances below) is x86-only code, where - with
>>> the removal of the #ifdef above - we now know the first argument is
>>> always ignored, I think it would still better be of the correct type
>>> (&iommu_ops). Perhaps the "(void)(ops)" in the macro definitions would
>>> better become "ASSERT((ops) == &iommu_ops)", which would check both
>>> type (compile time) and value (runtime).
>> I'm happy to fold that change if you want. It ought to optimise out
>> completely for being
> Bah - sent too early. "for being tautological."
Sadly, it turns out it's not.
$ ../scripts/bloat-o-meter -c xen-syms-before xen-syms-after
add/remove: 0/0 grow/shrink: 13/0 up/down: 369/0 (369)
Function old new delta
pci_add_device 1352 1416 +64
pci_remove_device 716 761 +45
iommu_map 341 382 +41
iommu_do_pci_domctl 1666 1704 +38
iommu_unmap 276 310 +34
deassign_device 353 386 +33
iommu_free_pgtables 310 329 +19
iommu_iotlb_flush_all 181 199 +18
iommu_iotlb_flush 260 278 +18
iommu_hwdom_init 68 86 +18
iommu_domain_destroy 54 70 +16
iommu_lookup_page 53 67 +14
iommu_dump_page_tables 261 272 +11
Total: Before=2194756, After=2195125, chg +0.02%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Data old new delta
Total: Before=1699384, After=1699384, chg +0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
RO Data old new delta
Total: Before=0, After=0, chg +0.00%
is the delta in debug builds, while
$ ../scripts/bloat-o-meter -c xen-syms-before xen-syms-after
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-57 (-57)
Function old new delta
iommu_resume 34 16 -18
iommu_suspend 42 23 -19
iommu_crash_shutdown 66 46 -20
Total: Before=2112261, After=2112204, chg -0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Data old new delta
Total: Before=1709424, After=1709424, chg +0.00%
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
RO Data old new delta
Total: Before=0, After=0, chg +0.00%
is the delta in release builds. This is a little weird - it's because
the ASSERT(), in release builds, short circuits the evaluation of its
condition, meaning that the BUG_ON() inside iommu_get_ops() doesn't get
emitted.
Irritatingly, there's no way I can spot to do this check with a
BUILD_BUG_ON(), which would reduce the impact on the debug builds too.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |