

[Xen-ia64-devel] [RFC] grant table API clean up and performance tuning m

Hi all xen/ia64 developers.
Here is a memo on grant table API clean up and performance tuning ideas.
Please comment/request.


     Xen/IA64 dom0 grant table API clean up and P2M/VP model tuning memo

                                               2006 VA Linux Systems Japan K.K.
                                     Isaku Yamahata <yamahata at valinux co jp>

* Introduction
This document is aimed at xen/IA64 developers and is meant to start a
discussion on grant table API clean up and performance tuning ideas.
It describes the detailed grant table implementation and behavior on
xen/IA64 and discusses possible performance tuning ideas.
The basic grant table concept is not explained here.
The grant table API clean up and performance tuning will be done based
on the discussion.

* Reference
"A Rough Introduction to Using Grant Tables" xen/docs/misc/grant-table.txt
"Xen/IA64 dom0 virtual physical model design memo"

* grant table mapping behavior
This section describes grant table mapping behavior.

granter: a domain which grants page-mapping to a foreign domain
mapper: a domain which maps a page which is granted by a granter
gnttab map address: an address which is used for grant table mapping
                    mfn on xen/x86
                    pfn on xen/IA64

        granter                         mapper
        -------                         ------
                                        a mapper allocates an area which is
                                        used for mapping.
                                        Xen/x86 allocates virtual address area.
                                        Xen/IA64 allocates pseudo physical
                                        address area.
                                        (before GNTTABOP_map_grant_ref)

        *grant mapping
        a granter grants a mapping
        by updating grant_entry_t
           xen/x86: mfn
           xen/IA64: pfn

                granter sends a grant reference to the mapper somehow,
                usually via an I/O ring.

                                        Here the mapping domain knows the grant
                                        reference.
                                        Here this domain implicitly knows
                                        mfn. So a mapper can determine a place
                                        to receive the page

                                                host_addr: receive place
                                                  on x86 kernel virtual address
                                                     xen/x86 maps a page to
                                                     this kernel virtual
                                                     address.
                                                  on IA64 pseudo pfn
                                                     xen/IA64 assigns a page
                                                     to this pfn.
                                        handle: mapping handle

                                        Now the mapping domain can access the
                                        page. It gets a kernel virtual address
                                        to access the page.

                                        access to the page


* grant table transfer behavior
This section describes grant table transfer behavior.

sender: a domain which transfers a page to a foreign domain
receiver: a domain which grants page-transfer from a foreign domain

        sender                          receiver(granter)
        ------                          -----------------
                                        a receiver allocates an area to which
                                        a page is transferred.
                                        xen/x86: virtual address area
                                        xen/IA64: pseudo physical area

                                        a receiver grants transfer by
                                        updating grant_entry_t
                                             x86: unused
                                                  pseudo physical address
                                             IA64:pseudo physical address
                                                  into which a page is
                                                  transferred.
                                                  (this should be used.)

        a sender unmaps a page which
        is to be transferred.
        xen/x86: allocates a new mfn
                from xen and maps it
                to kernel virtual address
                updates P2M table
                updates M2P table
        xen/IA64: nop
                  grant table transfer
                  updates automatically

        grant transfer hypercall
         mfn: page frame number corresponding to a page
              x86: mfn
              IA64: pfn

                                page transfer
                xen updates grant_entry_t.
                xen/x86: page ownership change
                xen/IA64: page ownership change
                          a page is disassociated from sender's pseudo
                          physical address.
                          (The current implementation associates the page
                           to receiver's pseudo physical address,
                           but it shouldn't.)

                                        a receiver is notified of page transfer
                                        somehow, usually via an I/O ring.

                                        a receiver gets a page reference.
                                            x86: mfn
                                            IA64: mfn(unused)

                                        Now a receiver knows mfn which it
                                        xen/x86: maps the page to kernel
                                                 virtual address space
                                                 updates P2M table
                                                 updates M2P table
                                        xen/IA64: nop (current implementation)
                                                  Ideally a receiver should
                                                  determine receiving pseudo
                                                  physical address based on
                                                  And then it associates mfn
                                                  to its pseudo physical
                                                  address.

* grant table API proposal
The current grant table API depends deeply on xen/x86.
Four kinds of addresses are related to the grant table:
user virtual address, kernel virtual address, pseudo physical address and
machine address.
The xen/x86 grant table uses user virtual addresses, kernel virtual addresses
and machine addresses.
xen/IA64, on the other hand, can use pseudo physical addresses and machine
addresses, because xen/IA64 fully virtualizes the TLB, which makes it
difficult for xen/IA64 to handle user/kernel virtual addresses.
The virtual address related API might be emulated by xenLinux/IA64 without
xen/IA64. However, kernel virtual addresses are an issue.
The right way is to re-define the grant table API, separating the
arch-independent part from the arch-dependent part (or to define an entirely
new clean replacement), and to rewrite the existing code, including common
xen code, xen/x86 code and xenLinux/x86 code.

grant table mapping
grant_entry_t::frame should be architecture specific.
  - xen/x86: mfn
  - xen/IA64: pfn
define an arch-specific macro like virt_to_gnttab_mapaddr().
define an arch-specific function which allocates a mapping area.

grant table transfer
define an arch-specific function which allocates a receiving area.
define an arch-specific function which associates a received page to a domain.
  - xen/x86 maps to kernel virtual address.
  - xen/IA64 associates a page to a pseudo physical address.
rewrite netback.c

* grant table read-only mapping
Currently this is not implemented on xen/IA64.
The Xen/IA64 software address translation page table entry has unused bits.
In fact only the present bit and the ppn field are used, so the other unused
bits can be used to record read-only.
Perhaps translate_domain_pte() needs modification.
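
As an illustration of the idea above, here is a minimal Python sketch of
recording read-only in an otherwise unused bit of a page table entry.
The bit positions (PTE_PRESENT, PTE_PPN_SHIFT, PTE_PPN_BITS, PTE_GNT_RDONLY)
and the helper names are hypothetical and chosen only for this example. They
are not taken from the xen/IA64 source, where the real change would be made
in C.

```python
# Toy model of a software page table entry. All bit positions are assumptions.
PTE_PRESENT = 1 << 0                 # assumed position of the present bit
PTE_PPN_SHIFT = 12                   # assumed start of the ppn field
PTE_PPN_BITS = 38                    # assumed width of the ppn field
PTE_PPN_MASK = ((1 << PTE_PPN_BITS) - 1) << PTE_PPN_SHIFT
PTE_GNT_RDONLY = 1 << 53             # hypothetical: one of the unused bits


def make_pte(mfn, readonly=False):
    """Build an entry for mfn, optionally marked as a read-only grant mapping."""
    pte = PTE_PRESENT | ((mfn << PTE_PPN_SHIFT) & PTE_PPN_MASK)
    if readonly:
        pte |= PTE_GNT_RDONLY
    return pte


def pte_mfn(pte):
    """Extract the machine frame number stored in the ppn field."""
    return (pte & PTE_PPN_MASK) >> PTE_PPN_SHIFT


def may_write(pte):
    """A write is allowed only if the entry is present and not read-only."""
    return bool(pte & PTE_PRESENT) and not (pte & PTE_GNT_RDONLY)
```

Because the flag lives outside the present bit and the ppn field, existing
lookups that only use those two fields would be unaffected by it.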

* grant table entry size
In Xen/IA64 with the P==M model,
the number of shared pages for grant tables is defined as follows.

In Xen/IA64 with the P2M/VP model,
these must be increased as follows.

This is because blkback and netback determine the number of entries they use
based on the I/O ring size, which is proportional to the page size.
On the other hand netback limits the maximum number of entries in another
way, not based on the I/O ring size or page size.
These parameters should be adjusted somehow. Perhaps a benchmark is needed.

* performance tuning idea
All tuning must be evaluated by benchmark.
So all tuning should be optional somehow (at compile time or run time).

- Physical to Machine conversion
  reduce/eliminate p2m conversion hypercall.
  - batched conversion
    Unfortunately address translation currently handles a single address
    at a time.
  - cache p2m conversion
  - direct map P2M table into dom0
    A mechanism similar to grant table can be used.
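
The "cache p2m conversion" idea can be sketched as follows. This is only a
toy model in Python: P2MCache and the hypercall callable passed to it are
hypothetical names for this example, and the real code would be C in
xenLinux/IA64. The point it shows is that a repeated conversion of the same
pfn costs one hypercall, and that an entry must be invalidated whenever the
page behind the pfn changes (for example after a grant table transfer).

```python
class P2MCache:
    """Caches pfn -> mfn results so repeated conversions skip the hypercall."""

    def __init__(self, hypercall):
        # hypercall: any callable taking a pfn and returning an mfn.
        # It stands in for the real p2m conversion hypercall (assumption).
        self._hypercall = hypercall
        self._cache = {}

    def lookup(self, pfn):
        """Return the mfn for pfn, calling into xen only on a cache miss."""
        if pfn not in self._cache:
            self._cache[pfn] = self._hypercall(pfn)
        return self._cache[pfn]

    def invalidate(self, pfn):
        """Drop a stale entry, e.g. after the page behind pfn was replaced."""
        self._cache.pop(pfn, None)
```

The cost of this approach is that every path which changes the P2M mapping
has to call invalidate(), which is why it should be measured by benchmark.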

- track virtual address of tlb insert (for grant table mapping)
  Currently the tlb/VHPT is flushed globally. This should be avoided;
  a finer grained tlb/VHPT flush is desirable.
  When a page is mapped to a foreign domain, xen checks tlb inserts which
  map the page. By doing so xen knows the virtual addresses at which
  the page is mapped. When the page is unmapped, xen can flush only the
  corresponding virtual addresses.
  An unused bit of the Xen/IA64 software address translation page table entry
  can be used.
  - pre-register for virtual address tracking.(for grant table transfer)
    grant table page transfer is used only for vnif currently.
    skbuff_ctor()/skbuff_dtor() can be modified to register/unregister
    its page for xen to track its virtual address.
    When registered xen may disassociate its machine page and associate
    a new machine page to track its virtual address.
    The old disassociated page freeing can be deferred.(described below)
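
The tracking idea above can be modelled in a few lines of Python. The class
and method names (TlbInsertTracker, grant_map, on_tlb_insert, grant_unmap)
are hypothetical and exist only in this sketch. It assumes xen sees every tlb
insert and can tell which machine page it refers to.

```python
from collections import defaultdict


class TlbInsertTracker:
    """Remembers at which virtual addresses a granted page was inserted."""

    def __init__(self):
        self._granted = set()              # mfns currently grant-mapped
        self._vaddrs = defaultdict(set)    # mfn -> virtual addresses seen

    def grant_map(self, mfn):
        """Start tracking a page when it is mapped to a foreign domain."""
        self._granted.add(mfn)

    def on_tlb_insert(self, vaddr, mfn):
        """Called for each tlb insert; only tracked pages are recorded."""
        if mfn in self._granted:
            self._vaddrs[mfn].add(vaddr)

    def grant_unmap(self, mfn):
        """Stop tracking and return the addresses which need a flush."""
        self._granted.discard(mfn)
        return sorted(self._vaddrs.pop(mfn, set()))
```

On unmap only the returned addresses need to be flushed, instead of the
whole tlb/VHPT. An empty list means the page was never touched through the
tlb and no flush is needed at all.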

- defer page freeing (for disassociating a page from pseudo physical address)
  When an underlying machine page is zapped from pseudo physical address space,
  Xen doesn't free the page immediately. Instead xen puts the page on
  a queue, deferring freeing of the page.
  When tlb/VHPT are flushed, queued pages can be freed.
  struct page_info::tlbflush_timestamp can be used for this purpose.
  If the queue becomes too long or memory allocation pressure is high,
  xen flushes tlb/VHPT and then frees queued pages.
  When a domain allocates a page, a page can be removed from the queue.
  - A receiver domain (grant table page transfer) or
    a mapping domain (grant table mapper) must flush its tlb cache.
    Otherwise a domain might see an old page.
    This is acceptable.
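
A toy Python model of the deferred freeing scheme follows. DeferredFreeQueue,
flush_clock and max_len are hypothetical names for this sketch. The flush
counter only imitates the role that struct page_info::tlbflush_timestamp
could play, and memory allocation pressure is not modelled.

```python
from collections import deque


class DeferredFreeQueue:
    """Queues zapped pages and frees them only after a tlb/VHPT flush."""

    def __init__(self, max_len):
        self.max_len = max_len     # queue length which forces a flush
        self.flush_clock = 0       # incremented on every tlb/VHPT flush
        self.queue = deque()       # (mfn, flush_clock at zap time), oldest first
        self.freed = []            # pages handed back to the allocator

    def zap(self, mfn):
        """Queue a page removed from pseudo physical address space."""
        self.queue.append((mfn, self.flush_clock))
        if len(self.queue) > self.max_len:
            self.flush()           # queue too long: flush now

    def flush(self):
        """Model a tlb/VHPT flush, then free what has become safe."""
        self.flush_clock += 1
        self.reap()

    def reap(self):
        """Free pages whose timestamp is older than the latest flush."""
        while self.queue and self.queue[0][1] < self.flush_clock:
            mfn, _ = self.queue.popleft()
            self.freed.append(mfn)
```

A page is freed only once a flush has happened after it was queued, so no
stale translation can still point at a page that has been reused.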

- background VHPT flush
  When a machine page is disassociated from a pseudo physical address,
  the tlb/VHPT flush need not be done at the same time, and the VHPT need
  not be flushed all at once. The VHPT flush can be done gradually in the
  background.
  For example, a soft-interrupt or timer can be used.
  When a machine page is disassociated, record its time and VHPT background
  flush index.
  If recorded (disassociated time, background VHPT index + VHPT index size) >
     current  (time, VHPT index),
  then the page can be freed.

- read only grant table mapping
  When a read-only grant table mapping is unmapped, the tlb is not flushed.
  A malicious domain might be able to read an unmapped page, but it
  can't modify the page.
  A granting domain must not use a granted page for important data.

- trust privileged domains
  Xen/IA64 trusts the privileged domain (dom0) to flush its tlb cache.

- reserve pseudo physical address space or virtual address space
  (for grant table mapping)
  With hypercall, a domain registers grant table mapping regions to xen.
  So xen can flush registered virtual address region.

- memory copy
  abandon grant table. resort to memory copy.

- 64MB of contiguous (P==M+delta)
  - move I/O area, efi area and acpi table to high pseudo physical address.

- XENMEM_populate_physmap, XENMEM_decrease_reservation
  when extent_order > 0, loop optimization is possible.
  This can eliminate some cpu cycles.
  (They don't work yet, though.)

- super page
  another way to shortcut the p2m 3-level table lookup.

- fast path
  fast path is disabled for now. It can be re-implemented later.

