On 09/08/2010 03:23 AM, Ian Jackson wrote:
> Ian and I discussed this extensively on IRC, during which conversation
> I became convinced that mlock() must do what we want. Having read the
> code in the kernel I'm not so sure.
> The ordinary userspace access functions are all written to cope with
> pagefaults and retry the access. So userspace addresses are not in
> general valid in kernel mode even if you've called functions to try to
> test them. It's not clear what mlock prevents; does it prevent NUMA
> page migration ? If not then I think indeed the page could be made
> not present by one VCPU editing the page tables while another VCPU is
> entering the hypercall, so that the 2nd VCPU will get a spurious [...]
As IanC said, the only thing mlock() guarantees is that accessing the
page won't cause a major fault - ie, need to go to disk to satisfy it.
You can and will get minor faults on mlocked pages, as a result of the
pte being either non-present or RO. It can be non-present as a result
of page migration (not necessarily NUMA migration, just defragging
kernel memory to make it possible to allocate higher-order pages), and
RO when doing page-dirtiness tracking. And I think they can happen
concurrently on different vcpus, so you may end up with a hypercall
being able to start reading the memory, but then failing to write back the
results.
I think the only way to do this properly is to do ioctls out of kernel
memory rather than user process memory. Perhaps the easiest way to do
this is add an mmap operation to privcmd which allocates a set of kernel
pages and maps them into the process memory, which it can then use as
its hypercall buffer. The alternatives would be to copy the argument
memory into/out of kernel space around the call, or do some ad-hoc
pinning of pages around the call. But if we can arrange for all
argument memory to come from a particular buffer, then it's easier to
just make sure that buffer has the right properties.
> OTOH: there must be other things that work like Xen - what about user
> mode device drivers of various kinds ? Do X servers not mlock memory
> and expect to be able to tell the video card to DMA to it ? etc.
> I think if linux-kernel think that people haven't assumed that mlock()
> actually pins the page, they're mistaken - and it's likely to be not
> just us.
Not really - nothing much depends on keeping a page physically resident
and having a pte in a specific state. DMA just cares about physical
residency, and you can't do usermode DMA without some way of also
getting the physical address of the page, which would mean you've
already got some kind of kernel driver. And there would be no way to
make such DMA safe anyway (mlock wouldn't protect against a process
being killed, for example).
Trying to share memory via virtual addresses with an entity which is
entirely external to the kernel is just plain weird.
Xen-devel mailing list