
Re: [Xen-devel] [PATCH] enable port accesses with (almost) full register context



>IMO you're doing code building anyway, but just of one instruction. You get
>rid of the locking by doing it to a per-CPU buffer, and the stack is the
>obvious place, calling out to register save/restore code. I don't really
>care about the performance of the save/restore code -- it's obviously going
>to be trivial compared with the unavoidable trap-and-emulate cost. Also, do
>you need separate save/restore code for IN vs. OUT instructions?

Actually, in the code I currently have, I do. For OUTs I need to merge
the value being output into the user-specified rAX, on the assumption
that the output value and the register contents are not always identical
(i.e. if particular bits within a port need special treatment by Xen,
which I can easily imagine becoming necessary at some point).
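
To illustrate the merge described above, here is a minimal sketch in C.
The name merge_out_value and its parameters are hypothetical and not
taken from the patch; it only assumes that the low 1, 2 or 4 bytes of
rAX carry the value to be written and that the remaining bits must stay
as the guest left them.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): compute the rAX value to load
 * before executing OUT.
 *   guest_rax - the guest's full rAX as found in its register context
 *   out_val   - the value to actually write, possibly adjusted by Xen
 *   bytes     - access width in bytes (1, 2 or 4)
 * Only the low 'bytes' bytes are replaced; all other bits are preserved.
 */
static uint64_t merge_out_value(uint64_t guest_rax, uint32_t out_val,
                                unsigned int bytes)
{
    uint64_t mask = (bytes >= 4) ? 0xffffffffULL
                                 : ((1ULL << (bytes * 8)) - 1);

    return (guest_rax & ~mask) | ((uint64_t)out_val & mask);
}
```

If the output value never differs from the guest's register contents,
this step is a no-op and the save/restore code for IN and OUT could be
shared.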

>Something like:
>    call save_host_restore_guest
>    <IN or OUT>
>    call save_guest_restore_host
>    ret
>
>Would that be reasonable?

It would, provided the above assumption about the need to modify the
output value never becomes true. Additionally, for 64-bit, I'm concerned
about the potential need for indirect calls here (as well as in the
syscall trampolines): nothing keeps a user from making the Xen heap 2GB
or more in size. Indirect calls would slow things down further.
Depending on the nature of the allocations made from the Xen heap, it
may be possible to simply place an upper limit on the heap size, since
the heap is currently assumed to be adjacent to the Xen image. However,
taking memory holes at rather low addresses into account, a user may
even be required to bump the heap size significantly (what if only a few
MB of memory below 4GB existed?), since, after all, the heap size is the
size of the address space consumed, not the amount of memory used.
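
The 2GB limit comes from the encoding of a direct near call: opcode E8
carries a signed 32-bit displacement relative to the end of the 5-byte
instruction. A rough sketch of the reachability check a stub builder
would need follows; the function name is made up for illustration and is
not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (not from the patch): can a direct CALL rel32 placed
 * at 'call_addr' reach 'target'? The displacement is measured from the
 * address of the next instruction, i.e. call_addr + 5.
 */
static bool call_rel32_reaches(uint64_t call_addr, uint64_t target)
{
    int64_t disp = (int64_t)(target - (call_addr + 5));

    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

When the check fails, the stub would have to fall back to an indirect
call through a register or memory operand, which is the extra cost
mentioned above.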

>Alternatively, perhaps we could get rid of the distinction and emulate all
>port accesses in this way? I suspect that the cost of state save/restore and
>building the trampoline is dwarfed by the cost of the GPF and even the cost
>of the I/O port access itself (they don't tend to be super fast). Could you
>do a few quick measurements to determine this? If the extra cost is less
>than, say, 10%, I'd be inclined to take the hit to avoid interface changes.

Percentages of full-context relative to simply emulated I/O, without yet
having changed the assembly-file approach to the stub-building one (as
per the above issues):

PentiumIII (32-bit) with locking        67%
PentiumIII (32-bit) without locking     84%
Pentium4 (64-bit) with locking          86%
Pentium4 (64-bit) without locking       89%

Revised patch (domctl->sysctl, naming) attached.

Jan

Attachment: xen-x86-io-register-context-2.patch
Description: Text document

