[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 28/45] xen: arm: arm64 trap handling.



On Thu, 2013-02-14 at 14:39 +0000, Ian Campbell wrote:
> > > +hyp_sync:
> > > +        entry   hyp=1
> > > +        msr     daifclr, #2
> > > +        adr     lr, return_to_hypervisor
> > > +        mov     x0, sp
> > > +        b       do_trap_hypervisor
> > 
> > This pattern (call fnX with arg0 == sp and lr == fnY) is repeated
> > quite a few times.  Could we have another tidying macro for it?
> 
> I'm half considering doing away with the preload lr+b and just using bl
> instead and putting the tail stuff in a macro like the entry stuff.
> 
> But if we do stick with this way then sure.

I didn't go the exit macro route (still undecided about that) but I did
decide to drop the lr+b stuff. I have a feeling that the branch
predictor will do better with the simple bl version, which ISTR reading
incorporates a hint to the predictor that this is a subroutine call (but
I can't find that reference right now. TBH even that smells a bit too
much of premature optimisation given there's no actual hardware yet, but
the bl version is the straightforward/obvious thing to use so lets go
with that.

I did experiment with a macro:

        @@ -96,6 +96,14 @@ lr      .req    x30             // link register
                 .endm
         
         /*
        + * Call a function, passing sp as the first and only argument
        + */
        +        .macro  call_with_sp, fn
        +        mov     x0, sp
        +        bl      \fn
        +        .endm
        +
        +/*
          * Bad Abort numbers
          *-----------------
          */
        @@ -130,15 +138,14 @@ hyp_error_invalid:
         hyp_sync:
                 entry   hyp=1
                 msr     daifclr, #2
        -        adr     lr, return_to_hypervisor
        -        mov     x0, sp
        -        b       do_trap_hypervisor
        +        call_with_sp do_trap_hypervisor
        +        b       return_to_hypervisor
         
         hyp_irq:
        
        
But TBH I think that looks worse than the open coded:
         hyp_irq:
                 entry   hyp=1
        -        adr     lr, return_to_hypervisor
                 mov     x0, sp
        -        b       do_trap_irq
        +        bl      do_trap_irq
        +        b       return_to_hypervisor

> 
> > 
> > > +ENTRY(return_to_new_vcpu)
> > > +ENTRY(return_to_guest)
> > > +ENTRY(return_to_hypervisor)
> > > +        ldp     x21, x22, [sp, #UREGS_PC]       // load ELR, SPSR
> > > +
> > > +        pop     x0, x1
> > > +        pop     x2, x3
> > > +        pop     x4, x5
> > > +        pop     x6, x7
> > > +        pop     x8, x9
> > > +
> > > +        /* XXX handle return to guest tasks, soft irqs etc */
> > > +        
> > > +        msr     elr_el2, x21                    // set up the return data
> > > +        msr     spsr_el2, x22
> > 
> > Done here becasue it's roughly half-way between the load and the
> > overwrite below?  Should we be using x28/x29 for this to give ourselves
> > even more pipeline space?

x21/22 are used as scratch registers in this way in the rest of the file
too and I'd rather be consistent about that. As it stands there are 5
instructions either side of this usage, I think that will do, at least
in the absence of any useful ability to measure things...

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.