[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] gross qemu behavior



On Fri, 28 Mar 2014, Stefano Stabellini wrote:
> CC'ing Paolo, hoping that he has a better idea on how to solve this
> problem.
> 
> 
> On Fri, 28 Mar 2014, Jan Beulich wrote:
> > Hi,
> > 
> > so while doing all that EPT work I naturally also happened to look more
> > closely at the EPT table dumps, spotting an odd range of 16 pages
> > outside any supposedly populated address range. This range only
> > exists when guest memory doesn't extend past (by default) 0xf0000000
> > (the start of MMIO, i.e. normally the frame buffer). After spending quite
> > a bit of time I finally figured that this must be a left over of the Cirrus
> > VGA ROM, and I would have thought that this
> > 
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice 
> >      }
> >  
> >      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> > +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> > +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> > +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
> >  
> >      return 0;
> >  }
> > 
> > should fix it. It does appear to work as far generic qemu is concerned,
> > but once looking at the Xen backend I had to conclude that this just
> > can't work: For one, xen_add_to_physmap() and
> > xen_remove_from_physmap() are _documented_ (in a comment) to
> > only be capable of a single region (VRAM). And the latter - even worse -
> > is implemented with a call to xc_domain_add_to_physmap(), completely
> > contrary to its name.
> 
> xen_add_to_physmap and xen_remove_from_physmap are just to deal with the
> VRAM in their current implementation.
> 
> 
> > Instrumenting xen_region_{add,del}(), I can see that all regions get
> > properly reported to the Xen backend, just that it doesn't handle them
> > (this is with above patch in place):
> > 
> > xra(fee00000,100000)
> > xra(fec00000,1000)
> > xra(fed00000,400)
> > xra(80000000,10000)
> > xrd(80000000,10000)
> > xra(f0000000,800000)
> > xra(f1000000,400000)
> > xra(f2000000,1000000)
> > xra(f3010000,4000)
> > xra(f3014000,1000)
> > xra(f3015000,3000)
> > xra(f3018000,1000)
> > xra(f3000000,10000)
> > xrd(f3000000,10000)
> > xrd(f0000000,800000)
> > xra(f0000000,800000)
> > mapping vram to f0000000 - f0800000
> > 
> > Having wasted enough time getting to this point, I'd like to ask you
> > to advise a proper fix for this. We definitely shouldn't be leaving
> > stuff sitting at arbitrary positions in the physical address space of
> > the guest. And the fact that the range gets removed (from Xen's
> > perspective, but not from qemu's) when RAM extends beyond
> > 0xf0000000 (due to it being replaced with what is actually
> > intended to be there) makes me wonder what would happen if the
> > ROM got enabled by the guest.
> 
> This is a thorny issue, fixing this behavior is not going to be trivial:
> 
> - The hypervisor/libxc does not currently expose a
>   xc_domain_remove_from_physmap function.
> 
> - QEMU works by allocating memory regions at the end of the guest
>   physmap and then moving them at the right place.
> 
> - QEMU can destroy a memory region and in that case we could free the
>   memory and remove it from the physmap, however that is NOT what QEMU
>   does with the vga ROM. In that case it calls
>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>   mapped again, therefore we cannot free it. We need to move it
>   somewhere else, hence the problem.
> 
> 
> But fortunately we don't actually need to add the VGA ROM to the guest
> physmap for it to work, QEMU can trap and emulate. In fact even today we
> are not mapping it at the right place anyway, see xen_set_memory:
> 
>     if (add) {
>         if (!memory_region_is_rom(section->mr)) {
>             xen_add_to_physmap(state, start_addr, size,
>                                section->mr, section->offset_within_region);
>         } else {
> 
> 
> So the only solution I can see right now is:
> 
> - avoid allocating guest memory for the VGA ROM
> That means that at the beginning of xen_ram_alloc we need to realize
> that the memory region we are dealing with is the VGA ROM memory region
> and avoid calling xc_domain_populate_physmap_exact for it.
> 
> - call g_malloc instead
> Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> keep track of the allocation in a data structure internal to xen-all.c.
> 
> - make sure that qemu_get_ram_ptr can deal with the different allocation
> Now that the VGA ROM is QEMU memory, we need to make sure that
> qemu_get_ram_ptr returns the right pointer for it.
> 
> 
> This is all very fiddly and hackish, but I can't see a better way of
> solving the issue.


Given that I feel that the explanation is not very clear, I am appending
a proof of concept patch. It is obviously horrible, I am by no means
suggesting it should be applied. 


diff --git a/exec.c b/exec.c
index 91513c6..bdecc70 100644
--- a/exec.c
+++ b/exec.c
@@ -1453,6 +1453,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
    It should not be used for general purpose DMA.
    Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
  */
+extern uint8_t* vga_rom;
 void *qemu_get_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block = qemu_get_ram_block(addr);
@@ -1462,7 +1463,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
          * because we don't want to map the entire memory in QEMU.
          * In that case just map until the end of the page.
          */
-        if (block->offset == 0) {
+        if (!strcmp(block->mr->name,"cirrus_vga.rom")) {
+            return vga_rom;
+        } else if (block->offset == 0) {
             return xen_map_cache(addr, 0, 0);
         } else if (block->host == NULL) {
             block->host =
diff --git a/xen-all.c b/xen-all.c
index ba34739..6211946 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -101,6 +101,8 @@ typedef struct XenIOState {
     Notifier wakeup;
 } XenIOState;
 
+uint8_t* vga_rom;
+
 /* Xen specific function for piix pci */
 
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -217,6 +219,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, 
MemoryRegion *mr)
         return;
     }
 
+    if (!strcmp(mr->name,"cirrus_vga.rom")) {
+        vga_rom = g_malloc(size);
+        return;
+    }
+
     trace_xen_ram_alloc(ram_addr, size);
 
     nr_pfn = size >> TARGET_PAGE_BITS;

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.