[Xen-devel] Xen Developer Summit Storage Performance BoF notes

[ I forgot to take notes so these are from memory.  Please respond with
any corrections or omissions. ]

Felipe introduced the session, highlighting that changes in storage
(i.e., low latency SSDs and fast SANs) were exposing bottlenecks in the
current architecture, which was designed for slow disks.  Refer to his
presentation from Friday for more details.

Felipe noted that persistent grants were causing performance regressions
when the backend did not support them and on systems where copy cost >
map cost (e.g., when dom0 has few VCPUs).  Roger agreed that restoring
the zero-copy path in the frontends was a good idea. [He has now posted
patches for this.]

Felipe mentioned that persistent grants were most beneficial when using
a user space backend.  David pointed out that this is most likely caused
by a poor implementation of the gntdev device.

Matt mentioned contention on the m2p override lock as causing
performance problems and suggested making this a read/write lock.

David listed some of the key bottlenecks already identified and plans to
resolve them without any protocol changes.

1. Unmap TLB flushes can be eliminated if the mapping was never used.
Experiments by XenServer suggest that pages grant mapped by blkback are
never accessed, so all of these TLB flushes could be eliminated.

2. Grant table lock contention can be reduced by finer grained locking,
e.g., by having buckets of map tracking structures and hashing
domid+grant ref to a bucket.

3. The gntdev device does a double map/unmap (for the userspace and
kernel mappings) and does the kernel mapping a page at a time.  Userspace
mappings could be done at page fault time (in the expectation that
userspace doesn't touch them) and the kernel side should batch grant
table ops using a new GNTOP_unmap_and_duplicate hypercall sub-op for the
unmap.  Roger said he'd posted a patch for the sub-op, but received no

Anil mentioned PV filesystems but this wasn't discussed in any depth.

