Re: [Xen-devel] [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access
>>> On 29.04.14 at 06:45, <aravindp@xxxxxxxxx> wrote:
> +void p2m_access_to_flags(u32 *flags, u32 gflags, p2m_access_t access)
> +{
> +
> +    /* Restrict with access permissions */
> +    switch (access)
> +    {
> +    case p2m_access_r:
> +        *flags &= ~(_PAGE_RW);
> +        *flags |= (_PAGE_NX_BIT|_PAGE_PRESENT);
> +        break;
> +    case p2m_access_rx:
> +    case p2m_access_rx2rw:
> +        *flags &= ~(_PAGE_NX_BIT|_PAGE_RW);
> +        *flags |= _PAGE_PRESENT;
> +        break;
> +    case p2m_access_rw:
> +        *flags |= (_PAGE_NX_BIT|_PAGE_RW|_PAGE_PRESENT);
> +        break;
> +    case p2m_access_rwx:
> +    default:
> +        *flags &= ~(_PAGE_NX_BIT);
> +        *flags |= (_PAGE_RW|_PAGE_PRESENT);
> +        break;
> +    }
Tim will have the final say here, but I'd suggest dropping all these
needless parentheses (I see only one case where they're really
needed).
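For illustration, this is roughly what it would look like (untested; the
compound mask in the rx case is the one place where the parentheses are
genuinely needed):

    switch ( access )
    {
    case p2m_access_r:
        *flags &= ~_PAGE_RW;
        *flags |= _PAGE_NX_BIT | _PAGE_PRESENT;
        break;
    case p2m_access_rx:
    case p2m_access_rx2rw:
        *flags &= ~(_PAGE_NX_BIT | _PAGE_RW);
        *flags |= _PAGE_PRESENT;
        break;
    case p2m_access_rw:
        *flags |= _PAGE_NX_BIT | _PAGE_RW | _PAGE_PRESENT;
        break;
    case p2m_access_rwx:
    default:
        *flags &= ~_PAGE_NX_BIT;
        *flags |= _PAGE_RW | _PAGE_PRESENT;
        break;
    }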
> +
> +    // Allow more restrictive guest flags to be propagated instead of access
> +    // permissions
Coding style (there are more of these, which I'm not going to
individually comment on).
> +    if ( !(gflags & _PAGE_RW) )
> +        *flags &= ~(_PAGE_RW);
> +
> +    if ( gflags & _PAGE_NX_BIT )
> +        *flags |= _PAGE_NX_BIT;
And now you get things even into inconsistent state wrt parentheses.
> +static int
> +p2m_mem_access_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
> +                         unsigned int page_order, p2m_type_t p2mt,
> +                         p2m_access_t p2ma)
> +{
> +    struct domain *d = p2m->domain;
> +    mfn_t *access_lookup_table = p2m->access_lookup_table;
> +    uint table_idx;
> +    uint page_idx;
> +    uint8_t *access_table_page;
> +
> +    ASSERT(shadow_mode_mem_access(d) && access_lookup_table != NULL);
Better to split this into two ASSERT()s, so that if one triggers you'd know
which of the two conditions wasn't met?
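I.e.:

    ASSERT(shadow_mode_mem_access(d));
    ASSERT(access_lookup_table != NULL);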
> +
> +    /* For PV domains we only support rw, rx, rx2rw, rwx access permissions */
> +    if ( unlikely(p2ma != p2m_access_r &&
> +                  p2ma != p2m_access_rw &&
> +                  p2ma != p2m_access_rx &&
> +                  p2ma != p2m_access_rwx &&
> +                  p2ma != p2m_access_rx2rw) )
> +        return -EINVAL;
Perhaps better done as a switch() statement.
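Untested, but along these lines:

    switch ( p2ma )
    {
    case p2m_access_r:
    case p2m_access_rw:
    case p2m_access_rx:
    case p2m_access_rwx:
    case p2m_access_rx2rw:
        break;
    default:
        return -EINVAL;
    }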
> +
> +    if ( page_get_owner(mfn_to_page(mfn)) != d )
> +        return -ENOENT;
> +
> +    gfn = get_gpfn_from_mfn(mfn_x(mfn));
You get "gfn" passed in and blindly overwrite it here? _If_ you need to
do this lookup, shouldn't you instead check it matches the passed in one?
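E.g. (sketch only - whether -EINVAL is the right error code is your call):

    if ( get_gpfn_from_mfn(mfn_x(mfn)) != gfn )
        return -EINVAL;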
> +
> +    /*
> +     * Values with the MSB set denote MFNs that aren't really part of the
> +     * domain's pseudo-physical memory map (e.g., the shared info frame).
> +     * Nothing to do here.
> +     */
> +    if ( unlikely(!VALID_M2P(gfn)) )
> +        return 0;
> +
> +    if ( gfn > (d->tot_pages - 1) )
> +        return -EINVAL;
Hardly - a PV domain can easily make its address space sparse, and
iirc pv-ops Linux even does so by default to avoid PFN/MFN collisions
on MMIO space. (And as a side note, this would better be "gfn >=
d->tot_pages".)
> +    paging_lock(d);
> +
> +    table_idx = MEM_ACCESS_TABLE_IDX(gfn);
> +    page_idx = MEM_ACCESS_PAGE_IDX(gfn);
> +    access_table_page =
> +        map_domain_page(mfn_x(access_lookup_table[table_idx]));
> +    access_table_page[page_idx] = p2ma;
> +    unmap_domain_page(access_table_page);
> +
> +    if ( sh_remove_all_mappings(d->vcpu[0], mfn) )
Is there anything guaranteeing d->vcpu and d->vcpu[0] being non-NULL?
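If not, an early bail would seem to be needed, e.g. (sketch; placed ahead
of the paging_lock() so the error path stays trivial):

    if ( d->vcpu == NULL || d->vcpu[0] == NULL )
        return -EINVAL;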
> +static mfn_t
> +p2m_mem_access_get_entry(struct p2m_domain *p2m, unsigned long gfn,
> +                         p2m_type_t *t, p2m_access_t *a, p2m_query_t q,
> +                         unsigned int *page_order)
> +{
> +    struct domain *d = p2m->domain;
> +    mfn_t *access_lookup_table = p2m->access_lookup_table;
> +    uint table_idx;
> +    uint page_idx;
> +    uint8_t *access_table_page;
> +    mfn_t mfn = _mfn(gfn); // For PV guests mfn == gfn
> +
> +    ASSERT(shadow_mode_mem_access(d) && access_lookup_table != NULL);
> +
> +    /* Not necessarily true, but for non-translated guests, we claim
> +     * it's the most generic kind of memory */
I think you copied this comment verbatim from elsewhere, but I don't
think it's correct as is.
> +    *t = p2m_ram_rw;
> +
> +    if ( page_get_owner(mfn_to_page(mfn)) != d )
> +        return _mfn(INVALID_MFN);
> +
> +    gfn = get_gpfn_from_mfn(mfn_x(mfn));
Same comment as earlier.
> +    if ( gfn > (d->tot_pages - 1) )
Ditto.
> +    table_idx = MEM_ACCESS_TABLE_IDX(gfn);
> +    page_idx = MEM_ACCESS_PAGE_IDX(gfn);
> +
> +    access_table_page =
> +        map_domain_page(mfn_x(access_lookup_table[table_idx]));
> +
> +    /* This is a hint to take the default permissions */
> +    if ( access_table_page[page_idx] == p2m_access_n )
> +        access_table_page[page_idx] = p2m->default_access;
We're in "get" here - why does that modify any global state?
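A side-effect-free variant would translate the hint only on the way out,
e.g. (sketch):

    p2m_access_t stored = access_table_page[page_idx];

    *a = (stored == p2m_access_n) ? p2m->default_access : stored;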
> +void p2m_mem_access_teardown(struct p2m_domain *p2m)
> +{
> +    struct domain *d = p2m->domain;
> +    mfn_t *access_lookup_table = p2m->access_lookup_table;
> +    uint32_t nr_access_table_pages;
> +    uint32_t ctr;
> +
> +    /* Reset the set_entry and get_entry function pointers */
> +    p2m_pt_init(p2m);
> +
> +    if ( !access_lookup_table )
> +        return;
> +
> +    nr_access_table_pages = get_domain_nr_access_table_pages(d);
> +
> +    for ( ctr = 0; ctr < nr_access_table_pages; ctr++ )
No new effectively unbounded loops please.
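If the loop has to stay, it at least needs to become preemptible. A common
pattern is checking hypercall_preempt_check() every so many iterations
(sketch only - the function would then need a return value, and the exact
continuation plumbing depends on the call chain):

    for ( ctr = 0; ctr < nr_access_table_pages; ctr++ )
    {
        if ( !(ctr & 0xff) && hypercall_preempt_check() )
            return -EAGAIN; /* caller arranges for a continuation */

        /* free access_lookup_table[ctr] as before */
    }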
> +int p2m_mem_access_init(struct p2m_domain *p2m)
> +{
> +    struct domain *d = p2m->domain;
> +    mfn_t *access_lookup_table;
> +    uint32_t nr_access_table_pages;
> +    uint32_t ctr;
> +
> +    nr_access_table_pages = get_domain_nr_access_table_pages(d);
> +    access_lookup_table = xzalloc_array(mfn_t, nr_access_table_pages);
This surely is going to be an order > 0 allocation, and you even risk
it being an order > MAX_ORDER one. The former is disallowed at
runtime by convention/agreement, the latter is an outright bug.
(And again just as a side note - you don't appear to need the zero
initialization here.)
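One way to stay within order-0 allocations would be to keep the table in
page-sized chunks, e.g. (sketch only, names made up; error unwinding
omitted, and the now much smaller top level array would still want
bounding for huge domains):

    #define LT_ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(mfn_t))

    unsigned int nr_chunks = DIV_ROUND_UP(nr_access_table_pages,
                                          LT_ENTRIES_PER_PAGE);
    mfn_t **chunks = xzalloc_array(mfn_t *, nr_chunks);

    if ( !chunks )
        return -ENOMEM;
    for ( ctr = 0; ctr < nr_chunks; ctr++ )
        if ( (chunks[ctr] = xmalloc_array(mfn_t, LT_ENTRIES_PER_PAGE)) == NULL )
            goto oom; /* free chunks[0..ctr-1] and the top level array */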
> +    if ( !access_lookup_table )
> +        return -ENOMEM;
> +
> +    p2m->access_lookup_table = access_lookup_table;
> +
> +    for ( ctr = 0; ctr < nr_access_table_pages; ctr++ )
Same comment regarding the effective unboundedness of the loop.
> @@ -1414,6 +1419,8 @@ long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
>      if ( pfn == ~0ul )
>      {
>          p2m->default_access = a;
> +        if ( is_pv_domain(d) )
> +            return p2m_mem_access_set_default(p2m);
Is it really correct to set p2m->default_access _before_ calling that
function? Perhaps it wouldn't be correct doing it the other way
around either - I suppose you need to hold the necessary lock across
both operations.
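I.e. conceptually (sketch - whether p2m_lock() is the appropriate lock
here depends on what the new function touches):

    p2m_lock(p2m);
    p2m->default_access = a;
    rc = p2m_mem_access_set_default(p2m);
    p2m_unlock(p2m);
    return rc;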
> @@ -585,9 +585,17 @@ int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc,
>  {
>
>      case XEN_DOMCTL_SHADOW_OP_ENABLE:
> +        /*
> +         * Shadow mem_access mode should only be enabled when mem_access is
> +         * enabled in XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE.
> +         */
> +        if ( sc->mode & XEN_DOMCTL_SHADOW_ENABLE_MEM_ACCESS )
> +            return -EINVAL;
> +
>          if ( !(sc->mode & XEN_DOMCTL_SHADOW_ENABLE_LOG_DIRTY) )
>              break;
>          /* Else fall through... */
> +
>      case XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY:
Stray blank line - the fall-through here makes it preferable to keep the
two case blocks un-separated.
> @@ -2443,7 +2443,8 @@ int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
>      if ( !(shadow_mode_external(v->domain)
>             && (page->count_info & PGC_count_mask) <= 3
>             && ((page->u.inuse.type_info & PGT_count_mask)
> -               == !!is_xen_heap_page(page))) )
> +               == !!is_xen_heap_page(page)))
> +         && !(shadow_mode_mem_access(v->domain)) )
You're breaking indentation, there are pointless parentheses again,
but most importantly - why?
> @@ -625,6 +627,20 @@ _sh_propagate(struct vcpu *v,
>          }
>      }
>
> +    /* Propagate access permissions */
> +    if ( unlikely((level == 1) &&
> +                  mem_event_check_ring(&d->mem_event->access) &&
> +                  !sh_mfn_is_a_page_table(target_mfn)) )
Just a general remark here - unlikely() used like this is pointless, it
ought to be used on the individual constituents of && or ||.
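I.e. rather something like (which constituents actually warrant the
annotation is of course a judgment call):

    if ( level == 1 &&
         unlikely(mem_event_check_ring(&d->mem_event->access)) &&
         !sh_mfn_is_a_page_table(target_mfn) )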
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +        p2m_access_t a;
> +        p2m_type_t t;
> +        mfn_t mfn;
> +        mfn = p2m->get_entry(p2m, mfn_x(target_mfn), &t, &a, 0, NULL);
Missing blank line between declarations and statements. Also this
statement could instead be an initializer.
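I.e.:

    mfn_t mfn = p2m->get_entry(p2m, mfn_x(target_mfn), &t, &a, 0, NULL);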
> @@ -2822,6 +2838,8 @@ static int sh_page_fault(struct vcpu *v,
>      int r;
>      fetch_type_t ft = 0;
>      p2m_type_t p2mt;
> +    p2m_access_t p2ma;
This variable ...
> @@ -3009,7 +3027,80 @@ static int sh_page_fault(struct vcpu *v,
>
>      /* What mfn is the guest trying to access? */
>      gfn = guest_l1e_get_gfn(gw.l1e);
> -    gmfn = get_gfn(d, gfn, &p2mt);
> +    if ( likely(!mem_event_check_ring(&d->mem_event->access)) )
> +        gmfn = get_gfn(d, gfn, &p2mt);
> +    /*
> +     * A mem_access listener is present, so we will first check if a violation
> +     * has occurred.
> +     */
> +    else
> +    {
> +        struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
... seems to be used only in this scope, and with you already adding
scope-restricted variables it should go here.
> +
> +        gmfn = get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma, 0, NULL);
> +        if ( mfn_valid(gmfn) && !sh_mfn_is_a_page_table(gmfn)
> +             && (regs->error_code & PFEC_page_present)
> +             && !(regs->error_code & PFEC_reserved_bit) )
> +        {
> +            int violation = 0;
> +            bool_t access_w = !!(regs->error_code & PFEC_write_access);
> +            bool_t access_x = !!(regs->error_code & PFEC_insn_fetch);
> +            bool_t access_r = access_x ? 0 : !(access_w);
Stray parentheses again. I'm not going to repeat this again - please
just check your code before submitting.
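For the last one that would simply be:

    bool_t access_r = access_x ? 0 : !access_w;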
> +            /*
> +             * Do not police writes to guest memory emanating from the Xen
> +             * kernel. Trying to do so will cause the same pagefault to occur
> +             * over and over again with an event being sent to the access
> +             * listener for each fault. If the access listener's vcpu is not
> +             * scheduled during this time, the violation is never resolved and
> +             * will eventually end with the host crashing.
> +             */
> +            if ( (violation && access_w) &&
> +                 (regs->eip >= XEN_VIRT_START && regs->eip <= XEN_VIRT_END) )
> +            {
> +                violation = 0;
> +                rc = p2m->set_entry(p2m, gfn_x(gfn), gmfn, PAGE_ORDER_4K,
> +                                    p2m_ram_rw, p2m_access_rw);
> +            }
This looks more like a hack than a proper solution - shouldn't the
listener be allowed to know of hypervisor side accesses?
> +/* Number of access table pages for a PV domain */
> +#define get_domain_nr_access_table_pages(d) \
> +    DIV_ROUND_UP(P2M_ACCESS_SIZE * (d->tot_pages - 1), PAGE_SIZE)
And once again. I wonder whether this code was tested on a suitably
big pv-ops PV guest.
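Besides that, (d) wants parenthesizing. If the table were instead sized by
the highest guest frame number, something along these lines might work
(sketch, untested):

    #define get_domain_nr_access_table_pages(d) \
        DIV_ROUND_UP(P2M_ACCESS_SIZE * (domain_get_maximum_gpfn(d) + 1), \
                     PAGE_SIZE)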
Jan