[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH][XSA-126] xen: limit guest control of PCI command register



On Wed, Apr 01, 2015 at 10:20:06AM +0100, Stefano Stabellini wrote:
> CC'ing the author of the patch and xen-devel.
> FYI I think that Jan is going to be on vacation for a couple of weeks.
> 
> On Wed, 1 Apr 2015, Michael S. Tsirkin wrote:
> > On Tue, Mar 31, 2015 at 03:18:03PM +0100, Stefano Stabellini wrote:
> > > From: Jan Beulich <jbeulich@xxxxxxxx>
> > > 
> > > Otherwise the guest can abuse that control to cause e.g. PCIe
> > > Unsupported Request responses (by disabling memory and/or I/O decoding
> > > and subsequently causing [CPU side] accesses to the respective address
> > > ranges), which (depending on system configuration) may be fatal to the
> > > host.
> > > 
> > > This is CVE-2015-2756 / XSA-126.
> > > 
> > > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> > > Reviewed-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> > > Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> > 
> > The patch description seems somewhat incorrect to me.
> > UR should not be fatal to the system, and it's not platform
> > specific.
> 
> I think that people have been able to repro this, but I'll let Jan
> comment on it.
> 
> 
> > In particular, there could be more reasons for devices
> > to generate URs, for example, if they get a transaction
> > during FLR. I don't think we ever tried to prevent this.
> 
> That cannot be triggered by guest behaviour.

Emulated bus reset often triggers FLR, but I can't speak for
Xen here.  Many devices have vendor specific reset registers though.
I don't believe anyone has a comprehensive list of such devices.

Fundamendally IOMMU is hosts's only protection. IOMMU doesn't
protect against URs so they mustn't trigger system errors.


> 
> > Here's the description from pci express spec:
> > 
> > IMPLEMENTATION NOTE
> > Software UR Reporting Compatibility with 1.0a Devices
> > 
> >     With 1.0a device Functions, 96 if the Unsupported Request Reporting 
> > Enable bit is set, the Function
> >     when operating as a Completer will send an uncorrectable error Message 
> > (if enabled) when a UR
> >     error is detected. On platforms where an uncorrectable error Message is 
> > handled as a System Error,
> >     this will break PC-compatible Configuration Space probing, so 
> > software/firmware on such
> >     platforms may need to avoid setting the Unsupported Request Reporting 
> > Enable bit.
> >     With device Functions implementing Role-Based Error Reporting, setting 
> > the Unsupported Request
> >     Reporting Enable bit will not interfere with PC-compatible 
> > Configuration Space probing, assuming
> >     that the severity for UR is left at its default of non-fatal. However, 
> > setting the Unsupported Request
> >     Reporting Enable bit will enable the Function to report UR errors 
> > detected with posted Requests,
> >     helping avoid this case for potential silent data corruption.
> >     On platforms where robust error handling and PC-compatible 
> > Configuration Space probing is
> >     required, it is suggested that software or firmware have the 
> > Unsupported Request Reporting Enable
> >     bit Set for Role-Based Error Reporting Functions, but clear for 1.0a 
> > Functions. Software or
> >     firmware can distinguish the two classes of Functions by examining the 
> > Role-Based Error Reporting
> >     bit in the Device Capabilities register.
> > 
> > 
> > What I think you have is a very old 1.0a system, and host set Unsupported
> > Request Reporting Enable.
> > 
> > The right thing is for host kernel to do is what the note says, and disable 
> > UR
> > reporting for this system.
> > 
> > As a work around for broken kernels, this is OK I guess, though
> > guest and host being out of sync is always a risk.
> > But I think it's a good idea to add documentation explaining
> > this, and work on the correct fix in linux, before we
> > add workarounds.
> 
> There can be guest kernels other than Linux, we cannot fix them all, and
> we cannot allow PCI passthrough only with Linux guests.

Are you trying to fix guest crashes, or host crashes?
If guest crashes, how is this a CVE?

What I said above has nothing to do with guest bugs.
I think the bug is in whoever configured the host pci bridge.
That's host kernel, not guest kernel.


> 
> > > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> > > index f2893b2..d095c08 100644
> > > --- a/hw/xen/xen_pt.c
> > > +++ b/hw/xen/xen_pt.c
> > > @@ -388,7 +388,7 @@ static const MemoryRegionOps ops = {
> > >      .write = xen_pt_bar_write,
> > >  };
> > >  
> > > -static int xen_pt_register_regions(XenPCIPassthroughState *s)
> > > +static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t 
> > > *cmd)
> > >  {
> > >      int i = 0;
> > >      XenHostPCIDevice *d = &s->real_device;
> > > @@ -406,6 +406,7 @@ static int 
> > > xen_pt_register_regions(XenPCIPassthroughState *s)
> > >  
> > >          if (r->type & XEN_HOST_PCI_REGION_TYPE_IO) {
> > >              type = PCI_BASE_ADDRESS_SPACE_IO;
> > > +            *cmd |= PCI_COMMAND_IO;
> > >          } else {
> > >              type = PCI_BASE_ADDRESS_SPACE_MEMORY;
> > >              if (r->type & XEN_HOST_PCI_REGION_TYPE_PREFETCH) {
> > > @@ -414,6 +415,7 @@ static int 
> > > xen_pt_register_regions(XenPCIPassthroughState *s)
> > >              if (r->type & XEN_HOST_PCI_REGION_TYPE_MEM_64) {
> > >                  type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> > >              }
> > > +            *cmd |= PCI_COMMAND_MEMORY;
> > >          }
> > >  
> > >          memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
> > > @@ -638,6 +640,7 @@ static int xen_pt_initfn(PCIDevice *d)
> > >      XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, 
> > > d);
> > >      int rc = 0;
> > >      uint8_t machine_irq = 0;
> > > +    uint16_t cmd = 0;
> > >      int pirq = XEN_PT_UNASSIGNED_PIRQ;
> > >  
> > >      /* register real device */
> > > @@ -672,7 +675,7 @@ static int xen_pt_initfn(PCIDevice *d)
> > >      s->io_listener = xen_pt_io_listener;
> > >  
> > >      /* Handle real device's MMIO/PIO BARs */
> > > -    xen_pt_register_regions(s);
> > > +    xen_pt_register_regions(s, &cmd);
> > >  
> > >      /* reinitialize each config register to be emulated */
> > >      if (xen_pt_config_init(s)) {
> > > @@ -736,6 +739,11 @@ static int xen_pt_initfn(PCIDevice *d)
> > >      }
> > >  
> > >  out:
> > > +    if (cmd) {
> > > +        xen_host_pci_set_word(&s->real_device, PCI_COMMAND,
> > > +                              pci_get_word(d->config + PCI_COMMAND) | 
> > > cmd);
> > > +    }
> > > +
> > 
> > Is this writing to host device, forcing cmd and io bits to enabled simply
> > because they are present? If yes, I think that's wrong: you don't know 
> > whether
> > bios enabled them, if it didn't host BARs might conflict with other devices,
> > and this will crash host.  I don't see why you are touching host command
> > register at all.  The point in the description is to avoid touching host.
> 
> pciback clears the command register when seizing the device:
> 
> pcistub_seize() -> pcistub_init_device() -> xen_pcibk_reset_device()


I think you have a bug then.  You must keep memory/io enable bits that
bios set for you, otherwise you risk conflicts with other devices,
or conflicts between BARs. I guess you do this for BAR values,
same applies here.




> 
> > >      memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
> > >      memory_listener_register(&s->io_listener, &address_space_io);
> > >      XEN_PT_LOG(d,
> > > diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
> > > index d99c22e..95a51db 100644
> > > --- a/hw/xen/xen_pt_config_init.c
> > > +++ b/hw/xen/xen_pt_config_init.c
> > > @@ -286,23 +286,6 @@ static int 
> > > xen_pt_irqpin_reg_init(XenPCIPassthroughState *s,
> > >  }
> > >  
> > >  /* Command register */
> > > -static int xen_pt_cmd_reg_read(XenPCIPassthroughState *s, XenPTReg 
> > > *cfg_entry,
> > > -                               uint16_t *value, uint16_t valid_mask)
> > > -{
> > > -    XenPTRegInfo *reg = cfg_entry->reg;
> > > -    uint16_t valid_emu_mask = 0;
> > > -    uint16_t emu_mask = reg->emu_mask;
> > > -
> > > -    if (s->is_virtfn) {
> > > -        emu_mask |= PCI_COMMAND_MEMORY;
> > > -    }
> > > -
> > > -    /* emulate word register */
> > > -    valid_emu_mask = emu_mask & valid_mask;
> > > -    *value = XEN_PT_MERGE_VALUE(*value, cfg_entry->data, 
> > > ~valid_emu_mask);
> > > -
> > > -    return 0;
> > > -}
> > >  static int xen_pt_cmd_reg_write(XenPCIPassthroughState *s, XenPTReg 
> > > *cfg_entry,
> > >                                  uint16_t *val, uint16_t dev_value,
> > >                                  uint16_t valid_mask)
> > > @@ -310,18 +293,13 @@ static int 
> > > xen_pt_cmd_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry,
> > >      XenPTRegInfo *reg = cfg_entry->reg;
> > >      uint16_t writable_mask = 0;
> > >      uint16_t throughable_mask = 0;
> > > -    uint16_t emu_mask = reg->emu_mask;
> > > -
> > > -    if (s->is_virtfn) {
> > > -        emu_mask |= PCI_COMMAND_MEMORY;
> > > -    }
> > >  
> > >      /* modify emulate register */
> > >      writable_mask = ~reg->ro_mask & valid_mask;
> > >      cfg_entry->data = XEN_PT_MERGE_VALUE(*val, cfg_entry->data, 
> > > writable_mask);
> > >  
> > >      /* create value for writing to I/O device register */
> > > -    throughable_mask = ~emu_mask & valid_mask;
> > > +    throughable_mask = ~reg->emu_mask & valid_mask;
> > >  
> > >      if (*val & PCI_COMMAND_INTX_DISABLE) {
> > >          throughable_mask |= PCI_COMMAND_INTX_DISABLE;
> > > @@ -603,9 +581,9 @@ static XenPTRegInfo xen_pt_emu_reg_header0[] = {
> > >          .size       = 2,
> > >          .init_val   = 0x0000,
> > >          .ro_mask    = 0xF880,
> > > -        .emu_mask   = 0x0740,
> > > +        .emu_mask   = 0x0743,
> > >          .init       = xen_pt_common_reg_init,
> > > -        .u.w.read   = xen_pt_cmd_reg_read,
> > > +        .u.w.read   = xen_pt_word_reg_read,
> > >          .u.w.write  = xen_pt_cmd_reg_write,
> > >      },
> > >      /* Capabilities Pointer reg */
> > 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.