
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges



On 2015/6/16 13:47, Tian, Kevin wrote:
From: Chen, Tiejun
Sent: Friday, June 12, 2015 3:54 PM

           bar_data |= (uint32_t)base;
           bar_data_upper = (uint32_t)(base >> 32);
+        for ( j = 0; j < memory_map.nr_map ; j++ )
+        {
+            if ( memory_map.map[j].type != E820_RAM )
+            {
+                reserved_end = memory_map.map[j].addr + memory_map.map[j].size;
+                if ( check_overlap(base, bar_sz,
+                                   memory_map.map[j].addr,
+                                   memory_map.map[j].size) )
+                {
+                    base = (reserved_end + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
+                    goto reallocate_mmio;

That is because our previous implementation simply skipped the
conflicting region:

"But you do nothing to make sure the MMIO regions all fit in the
available window (see the code ahead of this relocating RAM if
necessary)." and "...it simply skips assigning resources. Your changes
potentially growing the space needed to fit all MMIO BARs therefore also
needs to adjust the up front calculation, such that if necessary more
RAM can be relocated to make the hole large enough."

And then I replied as follows,

"You're right.

Consider that we always check pci_mem_start when populating more RAM
to obtain a large enough PCI memory hole:

      /* Relocate RAM that overlaps PCI space (in 64k-page chunks). */
      while ( (pci_mem_start >> PAGE_SHIFT) < hvm_info->low_mem_pgend )
      {
          struct xen_add_to_physmap xatp;
          unsigned int nr_pages = min_t(
              unsigned int,
              hvm_info->low_mem_pgend - (pci_mem_start >> PAGE_SHIFT),
              (1u << 16) - 1);
          if ( hvm_info->high_mem_pgend == 0 )
              hvm_info->high_mem_pgend = 1ull << (32 - PAGE_SHIFT);
          hvm_info->low_mem_pgend -= nr_pages;
          printf("Relocating 0x%x pages from "PRIllx" to "PRIllx\
                 " for lowmem MMIO hole\n",
                 nr_pages,
                 PRIllx_arg(((uint64_t)hvm_info->low_mem_pgend)<<PAGE_SHIFT),
                 PRIllx_arg(((uint64_t)hvm_info->high_mem_pgend)<<PAGE_SHIFT));
          xatp.domid = DOMID_SELF;
          xatp.space = XENMAPSPACE_gmfn_range;
          xatp.idx   = hvm_info->low_mem_pgend;
          xatp.gpfn  = hvm_info->high_mem_pgend;
          xatp.size  = nr_pages;
          if ( hypercall_memory_op(XENMEM_add_to_physmap, &xatp) != 0 )
              BUG();
          hvm_info->high_mem_pgend += nr_pages;
      }
"

I hope this helps you understand the background. I will update that
code comment like this:

      /*
       * We'll skip any space overlapping with reserved memory later,
       * so we need to decrease pci_mem_start to populate more RAM
       * to compensate for it.
       */


Jan's comment is correct. However I don't think adjusting pci_mem_start
is the right way here (even now I don't quite understand how it's adjusted
in your earlier code). There are other limitations on that value. We can
simply adjust mmio_total to include the conflicting reserved ranges, so
more BARs will be moved to high_mem_resource automatically by the code
below:

        using_64bar = bars[i].is_64bar && bar64_relocate
            && (mmio_total > (mem_resource.max - mem_resource.base));
        bar_data = pci_readl(devfn, bar_reg);

        if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
             PCI_BASE_ADDRESS_SPACE_MEMORY )
        {
            /* Mapping high memory if PCI device is 64 bits bar */
            if ( using_64bar ) {
                if ( high_mem_resource.base & (bar_sz - 1) )
                    high_mem_resource.base = high_mem_resource.base -
                        (high_mem_resource.base & (bar_sz - 1)) + bar_sz;
                if ( !pci_hi_mem_start )
                    pci_hi_mem_start = high_mem_resource.base;
                resource = &high_mem_resource;
                bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
            }
            else {
                resource = &mem_resource;
                bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
            }
            mmio_total -= bar_sz;
        }

Thanks
Kevin


Agreed.

I'm trying to move in this direction:

    /*
     * We'll skip all space overlapping with reserved memory later,
     * so we need to increase mmio_total to compensate for them.
     */
    for ( j = 0; j < memory_map.nr_map ; j++ )
    {
        uint64_t conflict_size = 0;
        if ( memory_map.map[j].type != E820_RAM )
        {
            reserved_start = memory_map.map[j].addr;
            reserved_size = memory_map.map[j].size;
            reserved_end = reserved_start + reserved_size;
            if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start,
                               reserved_start, reserved_size) )
            {
                /*
                 * Calculate how much of the MMIO range conflicts with
                 * reserved device memory.
                 */
                conflict_size += reserved_size;

                /*
                 * But we may need to subtract the portions lying beyond
                 * the PCI memory window, [pci_mem_start, pci_mem_end].
                 */
                if ( reserved_start < pci_mem_start )
                    conflict_size -= (pci_mem_start - reserved_start);
                if ( reserved_end > pci_mem_end )
                    conflict_size -= (reserved_end - pci_mem_end);
            }
        }

        if ( conflict_size )
        {
            conflict_size = max_t(uint64_t, conflict_size, max_bar_sz);
            conflict_size &= ~(conflict_size - 1);
            mmio_total += conflict_size;
        }
    }

Note max_bar_sz simply represents the largest BAR size among all the
PCI devices.

Thanks
Tiejun



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

