[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support

To: Christoph Hellwig <hch@xxxxxx>, Michael Kelley <mikelley@xxxxxxxxxxxxx>
From: Tianyu Lan <ltykernel@xxxxxxxxx>
Date: Thu, 2 Sep 2021 19:21:18 +0800
Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>, Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>, Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>, "wei.liu@xxxxxxxxxx" <wei.liu@xxxxxxxxxx>, Dexuan Cui <decui@xxxxxxxxxxxxx>, "catalin.marinas@xxxxxxx" <catalin.marinas@xxxxxxx>, "will@xxxxxxxxxx" <will@xxxxxxxxxx>, "tglx@xxxxxxxxxxxxx" <tglx@xxxxxxxxxxxxx>, "mingo@xxxxxxxxxx" <mingo@xxxxxxxxxx>, "bp@xxxxxxxxx" <bp@xxxxxxxxx>, "x86@xxxxxxxxxx" <x86@xxxxxxxxxx>, "hpa@xxxxxxxxx" <hpa@xxxxxxxxx>, "dave.hansen@xxxxxxxxxxxxxxx" <dave.hansen@xxxxxxxxxxxxxxx>, "luto@xxxxxxxxxx" <luto@xxxxxxxxxx>, "peterz@xxxxxxxxxxxxx" <peterz@xxxxxxxxxxxxx>, "konrad.wilk@xxxxxxxxxx" <konrad.wilk@xxxxxxxxxx>, "boris.ostrovsky@xxxxxxxxxx" <boris.ostrovsky@xxxxxxxxxx>, "jgross@xxxxxxxx" <jgross@xxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "joro@xxxxxxxxxx" <joro@xxxxxxxxxx>, "davem@xxxxxxxxxxxxx" <davem@xxxxxxxxxxxxx>, "kuba@xxxxxxxxxx" <kuba@xxxxxxxxxx>, "jejb@xxxxxxxxxxxxx" <jejb@xxxxxxxxxxxxx>, "martin.petersen@xxxxxxxxxx" <martin.petersen@xxxxxxxxxx>, "gregkh@xxxxxxxxxxxxxxxxxxx" <gregkh@xxxxxxxxxxxxxxxxxxx>, "arnd@xxxxxxxx" <arnd@xxxxxxxx>, "m.szyprowski@xxxxxxxxxxx" <m.szyprowski@xxxxxxxxxxx>, "robin.murphy@xxxxxxx" <robin.murphy@xxxxxxx>, "brijesh.singh@xxxxxxx" <brijesh.singh@xxxxxxx>, "thomas.lendacky@xxxxxxx" <thomas.lendacky@xxxxxxx>, Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>, "pgonda@xxxxxxxxxx" <pgonda@xxxxxxxxxx>, "martin.b.radev@xxxxxxxxx" <martin.b.radev@xxxxxxxxx>, "akpm@xxxxxxxxxxxxxxxxxxxx" <akpm@xxxxxxxxxxxxxxxxxxxx>, "kirill.shutemov@xxxxxxxxxxxxxxx" <kirill.shutemov@xxxxxxxxxxxxxxx>, "rppt@xxxxxxxxxx" <rppt@xxxxxxxxxx>, "hannes@xxxxxxxxxxx" <hannes@xxxxxxxxxxx>, "aneesh.kumar@xxxxxxxxxxxxx" <aneesh.kumar@xxxxxxxxxxxxx>, "krish.sadhukhan@xxxxxxxxxx" <krish.sadhukhan@xxxxxxxxxx>, "saravanand@xxxxxx" <saravanand@xxxxxx>, "linux-arm-kernel@xxxxxxxxxxxxxxxxxxx" <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "rientjes@xxxxxxxxxx" <rientjes@xxxxxxxxxx>, "ardb@xxxxxxxxxx" <ardb@xxxxxxxxxx>, "iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx" <iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx>, "linux-arch@xxxxxxxxxxxxxxx" <linux-arch@xxxxxxxxxxxxxxx>, "linux-hyperv@xxxxxxxxxxxxxxx" <linux-hyperv@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-scsi@xxxxxxxxxxxxxxx" <linux-scsi@xxxxxxxxxxxxxxx>, "netdev@xxxxxxxxxxxxxxx" <netdev@xxxxxxxxxxxxxxx>, vkuznets <vkuznets@xxxxxxxxxx>, "parri.andrea@xxxxxxxxx" <parri.andrea@xxxxxxxxx>, "dave.hansen@xxxxxxxxx" <dave.hansen@xxxxxxxxx>
Delivery-date: Thu, 02 Sep 2021 11:21:49 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>



On 9/2/2021 3:59 PM, Christoph Hellwig wrote:

On Tue, Aug 31, 2021 at 05:16:19PM +0000, Michael Kelley wrote:

As a quick overview, I think there are four places where the
shared_gpa_boundary must be applied to adjust the guest physical
address that is used.  Each requires mapping a corresponding
virtual address range.  Here are the four places:

1)  The so-called "monitor pages" that are a core communication
mechanism between the guest and Hyper-V.  These are two single
pages, and the mapping is handled by calling memremap() for
each of the two pages.  See Patch 7 of Tianyu's series.


Ah, interesting.

3)  The network driver send and receive buffers.  vmap_phys_range()
should work here.


Actually it won't.  The problem with these buffers is that they are
physically non-contiguous allocations.  We really have two sensible
options:

  1) use vmap_pfn as in the current series.  But in that case I think
     we should get rid of the other mapping created by vmalloc.  I
     though a bit about finding a way to apply the offset in vmalloc
     itself, but I think it would be too invasive to the normal fast
     path.  So the other sub-option would be to allocate the pages
     manually (maybe even using high order allocations to reduce TLB
     pressure) and then remap them

Agree. In such case, the map for memory below shared_gpa_boundary is notnecessary. allocate_pages() is limited by MAX_ORDER and needs to becalled repeatedly to get enough memory.

  2) do away with the contiguous kernel mapping entirely.  This means
     the simple memcpy calls become loops over kmap_local_pfn.  As
     I just found out for the send side that would be pretty easy,
     but the receive side would be more work.  We'd also need to check
     the performance implications.

kmap_local_pfn() requires pfn with backing struct page and this doesn'twork pfn above shared_gpa_boundary.

4) The swiotlb memory used for bounce buffers.  vmap_phys_range()
should work here as well.


Or memremap if it works for 1.


Now use vmap_pfn() and the hv map function is reused in the netvsc driver.

Case #2 above does unusual mapping.  The ring buffer consists of a ring
buffer header page, followed by one or more pages that are the actual
ring buffer.  The pages making up the actual ring buffer are mapped
twice in succession.  For example, if the ring buffer has 4 pages
(one header page and three ring buffer pages), the contiguous
virtual mapping must cover these seven pages:  0, 1, 2, 3, 1, 2, 3.
The duplicate contiguous mapping allows the code that is reading
or writing the actual ring buffer to not be concerned about wrap-around
because writing off the end of the ring buffer is automatically
wrapped-around by the mapping.  The amount of data read or
written in one batch never exceeds the size of the ring buffer, and
after a batch is read or written, the read or write indices are adjusted
to put them back into the range of the first mapping of the actual
ring buffer pages.  So there's method to the madness, and the
technique works pretty well.  But this kind of mapping is not
amenable to using vmap_phys_range().


Hmm.  Can you point me to where this is mapped?  Especially for the
classic non-isolated case where no vmap/vmalloc mapping is involved
at all?


This is done via vmap() in the hv_ringbuffer_init()

182/* Initialize the ring buffer. */
183int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info,

184 struct page *pages, u32 page_cnt, u32max_pkt_size)

185{
186        int i;
187        struct page **pages_wraparound;
188
189        BUILD_BUG_ON((sizeof(struct hv_ring_buffer) != PAGE_SIZE));
190
191        /*

192 * First page holds struct hv_ring_buffer, do wraparoundmapping for

193         * the rest.
194         */

195 pages_wraparound = kcalloc(page_cnt * 2 - 1, sizeof(structpage *),

196                                   GFP_KERNEL);
197        if (!pages_wraparound)
198                return -ENOMEM;
199
/* prepare to wrap page array */
200        pages_wraparound[0] = pages;
201        for (i = 0; i < 2 * (page_cnt - 1); i++)
202                pages_wraparound[i + 1] = &pages[i % (page_cnt - 1) + 1];
203
/* map */
204        ring_info->ring_buffer = (struct hv_ring_buffer *)

205 vmap(pages_wraparound, page_cnt * 2 - 1, VM_MAP,PAGE_KERNEL);

206
207        kfree(pages_wraparound);
208
209
210        if (!ring_info->ring_buffer)
211                return -ENOMEM;
212
213        ring_info->ring_buffer->read_index =
214                ring_info->ring_buffer->write_index = 0;

References:
- Re: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
  - From: Christoph Hellwig

Prev by Date: Re: [XEN PATCH v7 00/51] xen: Build system improvements, now with out-of-tree build!
Next by Date: Re: [PATCH v2 0/6] x86/PVH: Dom0 building adjustments
Previous by thread: Re: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
Next by thread: RE: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.