WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH 2 of 6] REDO: mem_access & mem_access 2: mem even

To: Joe Epstein <jepstein98@xxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 2 of 6] REDO: mem_access & mem_access 2: mem event additions for access
From: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Date: Wed, 5 Jan 2011 14:15:19 +0000
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 05 Jan 2011 06:16:44 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTikGw=9=2jM8H5mFk9-v1DLzj_UUwSfhwWEG2wvD@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTikGw=9=2jM8H5mFk9-v1DLzj_UUwSfhwWEG2wvD@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
Hi, 

Thanks for the patches.  As Keir points out, we'll need them in a form
that applies cleanly (maybe send as attachments next time).  It's also
useful when reviewing patches if they're in 'diff -p' format.  In
mercurial, you can arrabnge that by adding these lines to your ~/.hgrc
file:

 [diff]
 showfunc = True

Further comments inline below.  Nothing too hard to address, I think.

At 22:07 +0000 on 04 Jan (1294178833), Joe Epstein wrote:
> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h       Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/include/public/domctl.h       Tue Jan 04 11:59:48 2011 -0800
> @@ -714,7 +714,7 @@
>  /*
>   * Page memory in and out.
>   */
> -#define XEN_DOMCTL_MEM_EVENT_OP_PAGING (1 << 0)
> +#define XEN_DOMCTL_MEM_EVENT_OP_PAGING            1
> 
>  /* Domain memory paging */
>  #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_NOMINATE   0
> @@ -722,6 +722,12 @@
>  #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_PREP       2
>  #define XEN_DOMCTL_MEM_EVENT_OP_PAGING_RESUME     3
> 
> +/*
> + * Access permissions
> + */
> +#define XEN_DOMCTL_MEM_EVENT_OP_ACCESS            2
> +#define XEN_DOMCTL_MEM_EVENT_OP_ACCESS_RESUME     0

These could do with a nice big block comment that describes what they
do. 

[...]

> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/asm-x86/mem_event.h
> --- a/xen/include/asm-x86/mem_event.h   Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/include/asm-x86/mem_event.h   Tue Jan 04 11:59:48 2011 -0800
> @@ -24,6 +24,8 @@
>  #ifndef __MEM_EVENT_H__
>  #define __MEM_EVENT_H__
> 
> +/* Returns true if a listener exists, else pauses VCPU */
> +int mem_event_check_listener(struct domain *d);

That's a pretty odd thing for a function to do.  I see below that the
only caller already knows there's no listener and ignores the return
value.   I think you can just get rid of this function entirely.

[...] 

> +
> +    /*
> +     * If this GFN is emulated MMIO or marked as read-only (which old
> MMIO is),
> +     * pass the fault to the mmio handler first.
> +     */

Why did you change this comment?  Pages marked as p2m_ram_ro are not
"old MMIO", they're ROM images &c, and go through the emulator so the
write can be correctly discarded. 

> +    if ( (p2mt == p2m_mmio_dm) || (p2mt == p2m_ram_ro) )
>      {
> -        /*
> -         * Page log dirty is always done with order 0. If this mfn resides in
> -         * a large page, we do not change other pages type within that large
> -         * page.
> -         */
> -        paging_mark_dirty(v->domain, mfn_x(mfn));
> -        p2m_change_type(p2m, gfn, p2m_ram_logdirty, p2m_ram_rw);
> +        if ( !handle_mmio() )
> +        {
> +            guest_fault = 1;
> +            goto check_access_handler;
> +        }
>          return 1;
>      }
> 
> -    /* Shouldn't happen: Maybe the guest was writing to a r/o grant mapping? 
> */
> -    if ( p2mt == p2m_grant_map_ro )
> +    /* Was it a write access: log-dirty, etc... */
> +    if ( access_w ) {
> +        /* PoD and log-dirty also take this path. */

Moving the log-dirty check inside the test for access_w reintroduces a
race condition in multi-vcpu systems (where the same log-dirty fault
happens on two CPUs and the first CPU has changed the type back to 
p2m_ram_rw by the time the second one looks up the type).  Please put
this case back unconditionally.

> +        if ( p2mt != p2m_ram_rw && p2m_is_ram(p2mt) )
> +        {
> +            /*
> +             * Page log dirty is always done with order 0. If this
> mfn resides in
> +             * a large page, we do not change other pages type within
> that large
> +             * page.
> +             */
> +            paging_mark_dirty(v->domain, mfn_x(mfn));
> +            p2m_change_type(p2m, gfn, p2m_ram_logdirty, p2m_ram_rw);
> +            return 1;
> +        }
> +
> +        /* Shouldn't happen: Maybe the guest was writing to a r/o
> grant mapping? */
> +        if ( p2mt == p2m_grant_map_ro )
> +        {
> +            gdprintk(XENLOG_WARNING,
> +                     "trying to write to read-only grant mapping\n");
> +            guest_fault = 1;
> +            goto check_access_handler;
> +        }
> +    } /* end access_w */
> +
> + check_access_handler:
> +    /* Even if it is the guest's "fault", check with the mem_event
> interface instead if
> +     * one is there */
> +    if ( p2m_mem_access_check(gpa, gla_valid, gla, access_r,
> access_w, access_x) )
> +        return 1;

p2m_mem_access_check() appears to _always_ return 1.  Was that the
intention? 

> +    /* If there is no handler, then fault if guest_fault = 1 */
> +    if ( guest_fault )
>      {
> -        gdprintk(XENLOG_WARNING,
> -                 "trying to write to read-only grant mapping\n");
>          hvm_inject_exception(TRAP_gp_fault, 0, 0);
>          return 1;
>      }
> 
> +    /* Nothing handled it: it must be an access error with no memory
> handler, so fail */
>      return 0;
>  }
> 
> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/hvm/svm/svm.c
> --- a/xen/arch/x86/hvm/svm/svm.c        Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/arch/x86/hvm/svm/svm.c        Tue Jan 04 11:59:48 2011 -0800
> @@ -979,7 +979,7 @@
>          __trace_var(TRC_HVM_NPF, 0, sizeof(_d), &_d);
>      }
> 
> -    if ( hvm_hap_nested_page_fault(gfn) )
> +    if ( hvm_hap_nested_page_fault(gpa, 0, ~0ull, 0, 0, 0) )

Surely this totally breaks the AMD NPT case -- you need to get the
access-type arguments right or else not rely on them in
hvm_hap_nested_page_fault().

[...]

> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/mm/mem_access.c
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/xen/arch/x86/mm/mem_access.c      Tue Jan 04 11:59:48 2011 -0800
> @@ -0,0 +1,59 @@
> +/******************************************************************************
> + * arch/x86/mm/mem_access.c
> + *
> + * Memory access support.
> + *
> + * Copyright (c) 2009 Citrix Systems, Inc.

Eh, probably not. :)

[...]

> diff -r a3cec4b94150 -r 06b0916eb91d xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c     Tue Jan 04 10:30:15 2011 -0800
> +++ b/xen/arch/x86/mm/p2m.c     Tue Jan 04 11:59:48 2011 -0800
> @@ -2853,6 +2853,98 @@
>  }
>  #endif /* __x86_64__ */
> 
> +int p2m_mem_access_check(unsigned long gpa, bool_t gla_valid,
> unsigned long gla,
> +                         bool_t access_r, bool_t access_w, bool_t access_x)
> +{
> +    struct vcpu *v = current;
> +    mem_event_request_t req;
> +    unsigned long gfn = gpa >> PAGE_SHIFT;
> +    struct domain *d = v->domain;
> +    struct p2m_domain* p2m = p2m_get_hostp2m(d);
> +    int res;
> +    mfn_t mfn;
> +    p2m_type_t p2mt;
> +    p2m_access_t p2ma;
> +
> +    /* First, handle rx2rw conversion automatically */
> +    mfn = p2m->get_entry(p2m, gfn, &p2mt, &p2ma, p2m_query);
> +
> +    if ( access_w && p2ma == p2m_access_rx2rw ) {
> +        p2m_lock(p2m);
> +        p2m->set_entry(p2m, gfn, mfn, 0, p2mt, p2m_access_rw);

Might be better to use p2m_change_type here; it's atomic so avoids
potential races. 

> +        p2m_unlock(p2m);
> +
> +        return 1;  /* handled */
> +    }
> +
> +    /* Otherwise, check if there is a memory event listener, and send
> the message along */
> +    res = mem_event_check_ring(d);
> +    if ( res < 0 )
> +    {
> +        /* No listener */
> +        if ( p2m->access_required )
> +        {
> +            printk(XENLOG_INFO
> +                   "Memory access permissions failure, no mem_event
> listener: pausing VCPU %d, dom %d\n",
> +                   v->vcpu_id, d->domain_id);
> +
> +            /* Will pause the VCPU */
> +            (void) mem_event_check_listener(d);

Why does this need a special case?  Could you just post the violation 
to the ring and pause as normal, and then if a listener ever arrives it
can handle it?

In fact, I don't see why access_required needs to be a separate flag at
all - it's implied by setting the access permissions on a page or
setting the default to anything other than rwx.  That would address
Keir's concern about adding a separate domcrf flag.

> +        }
> +        else
> +        {
> +            /* A listener is not required, so clear the access restrictions 
> */
> +            p2m_lock(p2m);
> +            p2m->set_entry(p2m, gfn, mfn, 0, p2mt, p2m_access_rwx);
> +            p2m_unlock(p2m);
> +        }
> +
> +        return 1;
> +    }
> +    else if ( res > 0 )
> +        return 1;  /* No space in buffer */

DYM "return 0" here?  I think this function needs a comment describing
what the return value means. 

[...]
> diff -r a3cec4b94150 -r 06b0916eb91d xen/include/asm-x86/mem_access.h
> --- /dev/null   Thu Jan 01 00:00:00 1970 +0000
> +++ b/xen/include/asm-x86/mem_access.h  Tue Jan 04 11:59:48 2011 -0800
> @@ -0,0 +1,35 @@
> +/******************************************************************************
> + * include/asm-x86/mem_paging.h
> + *
> + * Memory access support.
> + *
> + * Copyright (c) 2009 Citrix Systems, Inc.

Again, no. 

[...]

> diff -r a3cec4b94150 -r 06b0916eb91d tools/xenpaging/xenpaging.c
> --- a/tools/xenpaging/xenpaging.c       Tue Jan 04 10:30:15 2011 -0800
> +++ b/tools/xenpaging/xenpaging.c       Tue Jan 04 11:59:48 2011 -0800
> @@ -658,7 +658,7 @@
>              {
>                  DPRINTF("page already populated (domain = %d; vcpu = %d;"
>                          " p2mt = %x;"
> -                        " gfn = %"PRIx64"; paused = %"PRId64")\n",
> +                        " gfn = %"PRIx64"; paused = %d)\n",
>                          paging->mem_event.domain_id, req.vcpu_id,
>                          req.p2mt,
>                          req.gfn, req.flags & MEM_EVENT_FLAG_VCPU_PAUSED);
> 

this belongs in a separate patch; it's not part of the change described
at the top.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel