[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 5/8] arm/vm_event: get/set registers



On 06/01/16 21:21, Tamas K Lengyel wrote:
> On Wed, Jun 1, 2016 at 5:24 AM, Julien Grall <julien.grall@xxxxxxx> wrote:
>> Hi,
>>
>>
>> On 01/06/16 09:41, Jan Beulich wrote:
>>>>>>
>>>>>> On 31.05.16 at 18:28, <tamas@xxxxxxxxxxxxx> wrote:
>>>>
>>>> On May 31, 2016 01:48, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>>>>
>>>>>
>>>>>>>> On 30.05.16 at 21:47, <tamas@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Mon, May 30, 2016 at 5:50 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>> On 30.05.16 at 00:37, <tamas@xxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> +struct vm_event_regs_arm32 {
>>>>>>>> +    uint32_t r0_usr;
>>>>>>>> +    uint32_t r1_usr;
>>>>>>>> +    uint32_t r2_usr;
>>>>>>>> +    uint32_t r3_usr;
>>>>>>>> +    uint32_t r4_usr;
>>>>>>>> +    uint32_t r5_usr;
>>>>>>>> +    uint32_t r6_usr;
>>>>>>>> +    uint32_t r7_usr;
>>>>>>>> +    uint32_t r8_usr;
>>>>>>>> +    uint32_t r9_usr;
>>>>>>>> +    uint32_t r10_usr;
>>>>>>>> +    uint32_t r12_usr;
>>>>>>>> +    uint32_t lr_usr;
>>>>>>>> +    uint32_t fp;
>>>>>>>> +    uint32_t pc;
>>>>>>>> +    uint32_t sp_usr;
>>>>>>>> +    uint32_t sp_svc;
>>>>>>>> +    uint32_t spsr_svc;
>>>>>>>> +};
>>>>>>>
>>>>>>>
>>>>>>> It would seem more natural for the "ordinary" registers to be
>>>>>>> arranged in the numerical sequence, i.e. fp, r12, sp, lr, pc.
>>>>>>
>>>>>>
>>>>>> Not sure I follow.
>>>>>
>>>>>
>>>>> For one it is quite natural for someone looking at a sequence of
>>>>> register values to assume / expect them to be in natural order.
>>>>> And then, having them in natural (numeric) order allows e.g.
>>>>> extracting the register identifying bits from an instruction to use
>>>>> them as an array index into (part of) this structure.
>>>>>
>>>>> (For some background: I've been bitten a number of times by
>>>>> people sorting x86 registers alphabetically instead or naturally,
>>>>> i.e. EAX, EBX, ECX, EDX instead of EAX, ECX, EDX, EBX).
>>>>
>>>>
>>>> I see, however I believe that would be a very careless use of this struct
>>>> from the user as the register sizes are not even necessarily match the
>>>> architecture. For example we only define the 64bit x86 registers, so
>>>> trying
>>>> to access it as an array of 32bit registers wouldn't work at all. Plus we
>>>> are not doing a full set of registers, and I rather not imply that every
>>>> element in the "natural sequence" is present. It may be, it may be not.
>>>
>>>
>>> Once an ABI is set in stone, and if that ABI allows for optimizations
>>> (by consumers) like the one mentioned, I don't think this would be
>>> careless use. The resulting code is very clearly much more efficient
>>> than e.g. a switch() statement with a case label for each and every
>>> register. Well, yes, I already hear the "memory is cheap and hence
>>> code size doesn't matter" argument, but as said elsewhere quite
>>> recently I don't buy this.
>>
>>
>> I agree with Jan here.
>>
>> ARM64 has 31 general purposes registers (x0-x30). The switch to find a
>> register based on the index will be quite big.
>>
>> If you order the register and provide all the general purposes registers (x0
>> - x30), you will be able to replace by a single line (for instance see
>> select_user_reg in arch/arm/traps.c).
> 
> The issue is that the x86 vm_event struct right now has 32*uint64_t
> size. So if we would want to transmit all ARM64 GPRs + cpsr, PC and
> TTBR0/1 we would end up growing this structure beyond what it is
> currently. I want to avoid that as it affects both ARM32 and x86
> introspection applications as well as now we can hold fewer events on
> the ring. I would say it is better to avoid that then potentially
> saving some on a switch in ARM64 introspection applications.
> 
>>
>> This will also likely speed up your introspection application.
> 
> Win some, lose some. I would prefer if these changes had no
> cross-architectural effect unless absolutely required. Razvan, what do
> you think?

I think that if we keep a single page sized event ring buffer it would
be best to only send what consumers need right now.

The only purpose of having that information in the request is to quickly
get things that are immediately necessary - otherwise the full context
can be obtained with xc_domain_hvm_getcontext_partial().

As long as there's a comment present that says that this is all the
information needed at this time, and the struct can grow if needs
change, it might be best not to add unnecessary data just in case
somebody will need it later.

Having said that, growing the struct by about 16 bytes or so shouldn't
pose huge problems (I'm not sure I've calculated the correct size based
on your previous message). Our application only uses sync-style events,
so the ring buffer can only be full when all VCPUs are blocked, so even
a one page ring buffer should be enough for most sensible applications
right now.

But, the way to go for the future, as I've said before, is IMHO to grow
the ring buffer and possibly add all the data that
xc_domain_hvm_getcontext_partial() now retrieves to the events.

To summarize, FWIW I think we should hold back on including all
available data to requests as long as we keep using a one page ring
buffer, but growing the structure by not much is acceptable.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.