On Thursday 28 September 2006 10:07, Isaku Yamahata wrote:
> Hi Anthony.
>
> On Wed, Sep 27, 2006 at 09:56:11PM +0800, Xu, Anthony wrote:
> > Currently, memory allocated for domU and VTI-domain is only 16K contiguous.
> > That means all huge page TLB entries must be broken into 16K TLB
> > entries. This definitely impacts overall performance; for instance, in
> > Linux, region 7 uses a 16M page size. IA64 is meant for high-end
> > servers, and many services running on IA64 use huge pages; Oracle, for
> > example, uses 256M pages in region 4. If XEN/IA64 still uses 16K
> > contiguous physical pages, we can imagine that this impacts performance
> > dramatically. So domU, VTI-domain and dom0 need to support huge pages.
> >
> > The attached patch is an experiment to use 16M pages on VTI-domain. A very
> > tricky way is used to allocate 16M contiguous memory for VTI-domain, so
> > it's just for reference.
> > With this patch applied, I see a 2%~3% performance gain when running KB
> > on VTI-domain (UP). As you may know, the performance of KB on VTI-domain
> > is not bad :-), so this improvement is fairly significant.
> > As we know, KB doesn't use 256M pages; the performance gain comes from the
> > 16M pages in region 7. If we run applications that use 256M huge pages,
> > we may get more improvement.
>
> I agree with you that supporting tlb insert with large page size and
> hugetlbfs would be a big gain.
>
> > In my mind, we need to do the things below (there may be more) if we
> > want to support huge pages.
> > 1. Add an option "order" to the configuration file vtiexample.vti. If
> > order=0, XEN/IA64 allocates 16K contiguous memory for the domain; if
> > order=1, it allocates 32K, and so on. Thus the user can choose the page
> > size for the domain.
>
> A fallback path should be implemented in case
> large page allocation fails.
> Or do you propose introducing a new page allocator with very large chunks?
> With the order option, page fragmentation should be taken care of.
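
A minimal sketch of the order arithmetic and of the fallback path mentioned above, assuming a 16K base page. The names extent_bytes(), alloc_with_fallback(), toy_alloc() and MAX_ORDER_SKETCH are hypothetical and are not taken from the patch or from Xen; the real allocator would sit behind the callback.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SHIFT  14u  /* 16K base page, as in the thread */
#define MAX_ORDER_SKETCH 14u  /* hypothetical cap: 16K << 14 = 256M */

/* Bytes covered by one extent of the given order: 16K << order. */
static uint64_t extent_bytes(unsigned int order)
{
    assert(order <= MAX_ORDER_SKETCH);
    return (uint64_t)1 << (BASE_PAGE_SHIFT + order);
}

/* Hypothetical allocator callback: returns 0 on success, nonzero on failure. */
typedef int (*alloc_extent_fn)(unsigned int order, void *ctx);

/*
 * Try the requested order first; on failure retry with smaller orders,
 * down to order 0 (a single 16K page).
 * Returns the order that succeeded, or -1 if every attempt failed.
 */
static int alloc_with_fallback(unsigned int want_order,
                               alloc_extent_fn alloc, void *ctx)
{
    assert(want_order <= MAX_ORDER_SKETCH);
    for (int order = (int)want_order; order >= 0; order--) {
        if (alloc((unsigned int)order, ctx) == 0)
            return order;
    }
    return -1;
}

/* Toy allocator for illustration: succeeds only up to the order in *ctx. */
static int toy_alloc(unsigned int order, void *ctx)
{
    unsigned int largest_free = *(unsigned int *)ctx;
    return order <= largest_free ? 0 : 1;
}
```

With this arithmetic, order=10 corresponds to the 16M page size discussed for region 7. Stepping down one order at a time is only one possible policy; falling straight back to order 0 would be simpler and may fragment less.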
>
> > 2. This order option will be passed to the increase_reservation() function
> > as the extent_order argument; increase_reservation() will allocate
> > contiguous memory for the domain.
> >
> > 3. There may be some memory blocks that we also want
> > increase_reservation() to allocate for us, such as the shared page or
> > firmware memory for the VTI domain. So we may need to call
> > increase_reservation() several times to allocate memory with different
> > page sizes.
> >
> > 4. Per_LP_VHPT may need to be modified to support huge page.
>
> Do you mean hash collision?
>
> > 5. VBD/VNIF may need to be modified to use a copy mechanism instead of
> > page flipping.
> >
> > 6. The balloon driver may need to be modified to increase or decrease
> > domain memory in units of the page size, not 16K.
> >
> > Magnus, would you like to take this task?
> >
> > Comments are always welcome.
>
> These are some random thoughts of mine.
>
> * Presumably there are two goals
> - Support one large page size (e.g. 16MB) to map the kernel.
> - Support hugetlbfs, whose page size might be different from 16MB.
>
> I.e. support three page sizes: normal page size 16KB, kernel mapping
> page size 16MB and hugetlbfs page size 256MB.
> I think hugetlbfs support can be addressed in a specialized way.
>
> hugetlbfs
> * Some specialized path can be implemented to support hugetlbfs.
> - For domU
> Paravirtualize hugetlbfs for domU.
> Hook into alloc_fresh_huge_page() in Linux. Then xen/ia64 is aware of
> large pages.
> Probably a new flag in the p2m entry, or some other data structure, might
> be introduced.
> For xenLinux, the region number, RGN_HPAGE, can be checked before
> entering the hugetlbfs specialized path.
> - For domVTI
> Can the use of hugetlbfs be detected somehow?
> Probably some Linux-specific heuristic can be used.
> e.g. check the region, RGN_HPAGE.
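
A minimal sketch of the region check suggested above. It assumes that the region number is carried in the top three bits (63..61) of an ia64 virtual address and that the huge page region is region 4, matching the Oracle example earlier in the thread. The _SKETCH constants and both function names are hypothetical stand-ins, not the Linux or Xen definitions.

```c
#include <stdint.h>

#define RGN_SHIFT_SKETCH 61u  /* assumed: region number is in VA bits 63..61 */
#define RGN_HPAGE_SKETCH 4u   /* assumed: huge pages live in region 4 */

/* Extract the region number from a 64-bit virtual address. */
static unsigned int region_number(uint64_t vaddr)
{
    return (unsigned int)(vaddr >> RGN_SHIFT_SKETCH);
}

/* Nonzero if the address falls in the (assumed) huge page region. */
static int addr_in_hpage_region(uint64_t vaddr)
{
    return region_number(vaddr) == RGN_HPAGE_SKETCH;
}
```

Such a test is cheap enough to run before entering a hugetlbfs-specific path, but for domVTI it remains a Linux-specific heuristic: another guest OS is free to use region 4 differently.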
>
> kernel mapping with large page size.
> * Page fragmentation should be addressed.
> Both 16KB and 16MB pages should be able to co-exist in the same domain.
> - Allocating a large contiguous region might fail,
> so a fallback path should be implemented.
> - A domain should be able to have pages of both sizes (16KB and 16MB)
> for a smooth code merge.
> Probably a new bit in the p2m entry, something like _PAGE_HPAGE,
> would be introduced to distinguish large pages from normal pages.
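
A minimal sketch of how a per-entry flag like the proposed _PAGE_HPAGE could distinguish the two page sizes. The 64-bit entry type, the bit position (60) and all names here are purely illustrative assumptions; the actual p2m entry layout in xen/ia64 is not described in this thread.

```c
#include <stdint.h>

typedef uint64_t p2m_entry_t;             /* hypothetical 64-bit p2m entry */

/* Hypothetical flag bit; the position is chosen only for illustration. */
#define P2M_HPAGE_SKETCH ((p2m_entry_t)1 << 60)

/* Mark an entry as backed by a large (16MB) page. */
static p2m_entry_t p2m_mark_hpage(p2m_entry_t e)
{
    return e | P2M_HPAGE_SKETCH;
}

/* Nonzero if the entry is backed by a large page. */
static int p2m_is_hpage(p2m_entry_t e)
{
    return (e & P2M_HPAGE_SKETCH) != 0;
}

/* Page size in bytes to use when inserting a TLB entry for this p2m entry. */
static uint64_t p2m_page_bytes(p2m_entry_t e)
{
    return p2m_is_hpage(e) ? (16ULL << 20) : (16ULL << 10);
}
```

Keeping the size decision per entry is what lets 16KB and 16MB pages co-exist in one domain: entries allocated through the fallback path simply never get the flag.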
>
> * paravirtualized driver (VBD/VNIF)
> This is a real issue.
> For a first prototype it is reasonable not to support page flipping
> and to resort to grant table memory copy.
>
> There are two kinds of page flipping: page mapping and page transfer.
> I guess page mapping should be supported somehow, assuming only dom0
> (or a driver domain) maps.
> We should measure page flipping and memory copy before giving it a try.
> I have no figures for it.
> I'm not sure which performs better.
> (I'm biased. I know of a vnif analysis on xen/x86.
> It said memory copy was cheaper than page flipping on x86...)
> If dom0 does only DMA, an I/O request can be completed without a copy or
> tlb flush for VBD with the tlb tracking patch.
> Page transfer is difficult. I'm not sure that it's worthwhile to support
> page transfer, because I'm struggling to optimize it.
>
> Another approach is
> * Increase the xen page size.
> Probably simply increasing the page size wouldn't work well.
> In that case, increase only the domheap page size,
> or introduce a new zone like MEMZONE_HPAGE,
> or introduce a specialized page allocator for it.
Hi,
Thank you for your thoughts.
If we want to support linux page-size != Xen page-size, some similar issues
are encountered.
I therefore suppose both features should be worked on together.
Tristan.
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel