Dong, Eddie wrote:
> Alex Williamson wrote:
> > On Fri, 2008-02-15 at 00:43 +0800, Dong, Eddie wrote:
> >> I agree with your categories, but I think #C is the 1st challenge we
> >> need to address for now. #A could be a future task for performance
> >> later after pv_ops functionality is completed. I don't worry about
> >> those several cycles difference in the primitive ops right now,
> >> since we already spend 500-1000 cycles to enter the C code.
> > IMHO, #A and #C are both blockers for getting into upstream
> > Linux/ia64. Upstream isn't going to accept a performance hit for a
> > paravirt enabled kernel on bare metal, so I'm not sure we should
> > prioritize one over the other, especially since Isaku has already
> > made such good progress on #A.
> I guess we are talking from different angles, which hides the real
> issues. We have multiple alternatives:
> 1: pv_ops
> 2: pv_ops + binary patching to convert those indirect function calls
> to direct function calls, as on X86
> 3: pure binary patching
> For community,
> #1 needs a lot of effort, like Jeremy spent on the X86 side; it could
> last for 6-12 months,
> #2 is based on #1, the additional effort is very small, probably 2-4
> #3 is not pv_ops; it may need 2-3 months of effort.
> As I understand Yamahata-san's previous patch, it addresses part of
> the #3 effort, i.e. #A of #3.
> What I want to suggest is #2.
Hmm, by "pv_ops" you mean a set of functions which are grouped, right?
My current implementation does
	#define ia64_fc(addr) paravirt_fc(addr)
But do you want to make them indirect calls?
i.e. something like
	#define ia64_fc(addr) pv_ops->fc(addr)
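
To make the difference between the two forms concrete, here is a
minimal sketch in plain C. Only ia64_fc, paravirt_fc and pv_ops->fc come
from the text above; struct pv_cpu_ops, native_fc, xen_fc, pv_set_ops
and the call counters are hypothetical names for illustration. In
particular, paravirt_fc is stood in for by an ordinary function here,
which is an assumption and not how the actual patch implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table: the thread does not define its layout. */
struct pv_cpu_ops {
	void (*fc)(unsigned long addr);	/* flush the cache line at addr */
};

/* Counters make the calls observable without touching real hardware. */
static unsigned long native_fc_calls;
static unsigned long xen_fc_calls;

static void native_fc(unsigned long addr) { (void)addr; native_fc_calls++; }
static void xen_fc(unsigned long addr)    { (void)addr; xen_fc_calls++; }

static const struct pv_cpu_ops native_ops = { native_fc };
static const struct pv_cpu_ops xen_ops    = { xen_fc };

/* Defaults to the native implementation, as on bare metal. */
static const struct pv_cpu_ops *pv_ops = &native_ops;

/* Form 1: the target is fixed at build time (assumed stand-in). */
#define paravirt_fc(addr)	native_fc(addr)
#define ia64_fc_direct(addr)	paravirt_fc(addr)

/* Form 2: the target is looked up through the table on every call. */
#define ia64_fc_indirect(addr)	pv_ops->fc(addr)

/* Switch implementations at run time, e.g. early in boot. */
static void pv_set_ops(const struct pv_cpu_ops *ops)
{
	assert(ops != NULL && ops->fc != NULL);
	pv_ops = ops;
}
```

With form 2, calling pv_set_ops(&xen_ops) redirects every later
ia64_fc_indirect() call without touching the call sites, at the cost of
one pointer load and an indirect branch per call. Form 1 avoids that
cost but needs some other mechanism to change the target.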
> With pv_ops, all those instructions in A/B/C are already replaced by
> source-level pv_ops code, so no binary patching is needed. The only
> thing needed in #2 is to convert indirect function calls to direct
> function calls for some hot APIs, as X86 does for cli/sti. The
> majority of pv_ops are not patched.
> So basically the #2 and #3 approaches conflict to some degree, and we
> probably need to decide early which way to go.
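
The "patch only the hot APIs" idea in #2 can be modeled roughly as
below. This is a simulation under stated assumptions, not real binary
patching: portable C cannot rewrite its own call instructions, so a
patched call site is represented by a filled-in `direct` pointer. Every
name here (pv_op, call_site, pv_patch_hot_sites, and so on) is
hypothetical, and the choice of which ops count as hot is only an
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op identifiers; the first two model the hot path. */
enum pv_op { PV_OP_IRQ_DISABLE, PV_OP_IRQ_ENABLE, PV_OP_FC, PV_OP_COUNT };

typedef void (*pv_fn)(void);

static int irq_enabled = 1;
static unsigned long fc_count;

static void native_irq_disable(void) { irq_enabled = 0; }
static void native_irq_enable(void)  { irq_enabled = 1; }
static void native_fc(void)          { fc_count++; }

/* The pv_ops table, indexed by enum pv_op. */
static pv_fn pv_ops_table[PV_OP_COUNT] = {
	native_irq_disable, native_irq_enable, native_fc
};

/* Only ops marked hot are converted; the rest stay indirect. */
static const int pv_op_is_hot[PV_OP_COUNT] = { 1, 1, 0 };

struct call_site {
	enum pv_op op;
	pv_fn direct;	/* NULL means "still goes through the table" */
};

/* Resolve hot call sites once; returns how many were patched. */
static size_t pv_patch_hot_sites(struct call_site *sites, size_t n)
{
	size_t i, patched = 0;

	assert(sites != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (pv_op_is_hot[sites[i].op]) {
			sites[i].direct = pv_ops_table[sites[i].op];
			patched++;
		}
	}
	return patched;
}

/* Patched sites skip the table lookup; unpatched sites do not. */
static void call_site_invoke(const struct call_site *site)
{
	if (site->direct != NULL)
		site->direct();
	else
		pv_ops_table[site->op]();
}
```

The point of the model is that the source-level interface is the same
for every op, and the patch pass is an optional optimization applied
to a small subset.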
It's not difficult to turn #A of #3 into #A of #2.
(At least, the current implementation can be converted into #A of #2,
but that requires more work and degrades performance.)
However, I don't see any advantage of #A of #2 over #A of #3.
If it is necessary to call some other function for #A of #3,
it is possible to rewrite instructions into something like
	mov reg = 1f
	br <target25> (relocation is necessary)
So the remaining issues are how many instructions (or bundles) should
be reserved for each operation and what their calling convention is.
Although I currently put the native instructions in as the default
case, you can put the above sequence there if you desire.
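
The reservation question can be sketched as follows. An ia64 bundle is
128 bits, so each patchable operation reserves a whole number of
16-byte bundles, and a replacement sequence must fit in that space.
struct patch_site and patch_site_apply are hypothetical names, and the
filler bundle is supplied by the caller because no real nop encoding
is assumed here. A real patcher would also have to flush the
instruction cache after writing, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUNDLE_BYTES 16	/* one ia64 bundle is 128 bits */

/* Hypothetical descriptor for one patchable operation. */
struct patch_site {
	unsigned char *addr;		/* start of the reserved region */
	size_t reserved_bundles;	/* bundles set aside at build time */
};

/*
 * Copy a replacement sequence into the reserved region and pad the
 * rest with the caller-supplied filler bundle.
 * Returns 0 on success, -1 if the arguments are invalid or the
 * sequence does not fit; the region is untouched on failure.
 */
static int patch_site_apply(const struct patch_site *site,
			    const unsigned char *code, size_t code_bundles,
			    const unsigned char filler[BUNDLE_BYTES])
{
	size_t i;

	if (site == NULL || site->addr == NULL || code == NULL ||
	    filler == NULL)
		return -1;
	if (code_bundles > site->reserved_bundles)
		return -1;

	memcpy(site->addr, code, code_bundles * BUNDLE_BYTES);
	for (i = code_bundles; i < site->reserved_bundles; i++)
		memcpy(site->addr + i * BUNDLE_BYTES, filler, BUNDLE_BYTES);
	return 0;
}
```

Reserving too few bundles makes patch_site_apply fail for longer
sequences such as the mov/br pair, while reserving too many bloats the
native path, which is why the size has to be agreed on per operation.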
Given that #A of #2 is for the performance-critical path, not using
the usual stacked calling convention would be acceptable.
As you already proposed, the PAL static calling convention is a
candidate. However, I don't see any advantage in switching from the
current convention (using r8, r9...) for #A at this moment.
We need to discuss this with the linux-ia64 people to see whether it's
acceptable. If we find it necessary to change the convention,
it wouldn't be difficult to do so. But that should come after the
discussion with linux-ia64, not now.
> For the #1 effort, adopting pv_ops in the IVT code is one of the
> major efforts, i.e. item #C in the previous email.
Yes, I agree.
> >> The major challenge to #C is listed in my previous thread, it is
> >> an easy thing to address for now, especially if we need to change
> >> original IVT code a lot.
> > The question of how to handle the IVT needs to be decided on
> > Linux-ia64. There are a couple approaches we could take, but it
> > really comes down to what Tony and the other developers feel is
> > cleanest and most maintainable.
> 100% agree! I will start a session there soon.
Xen-ia64-devel mailing list