
RE: [Xen-devel] mm.c ??



> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> PUCCETTI Armand
> Sent: 16 February 2006 17:53
> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: [Xen-devel] mm.c ??
> 
> I am trying to understand XEN 3.0.1 internals.
> 
> Could someone please point me to some explanation/document of 
> what file arch/x86/mm.c is doing precisely? Especially the 
> functioning of function map_pages_to_xen?

Since I'm not 100% sure about this, I'll make my best attempt to explain
what it does, but if I'm wrong, don't blame me... ;-)

The function maps a range of virtual addresses to a set of MFNs (machine
frame numbers - essentially the page number of the physical address).

The function starts by finding the second level page table entry, using
virt_to_xen_l2e.

When mapping between virtual and physical memory, we can choose to use
big pages (aka super pages) or small pages. 

Small pages are 4KB each.

The size of a big page depends on the page-table structure being used,
which is a compile-time option. With 64-bit page-table entries (i.e. PAE
or x86_64) each entry takes up 8 bytes, which means that a 4KB page will
contain 512 entries (2^9), whilst 32-bit non-PAE uses 32-bit, 4-byte
entries, making 1024 entries in a 4KB page (2^10). Big pages essentially
skip the last level of the page table and use that part of the virtual
address as an offset. The lowest 12 bits of the address are ALWAYS used
as an offset into the page, and in big pages the next 9 or 10 bits are
also used, giving either 2MB (2^(9+12)) or 4MB (2^(10+12)) pages.
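
To make the arithmetic above concrete, here is a small sketch in C. The
function names are my own for illustration and are not taken from the
Xen source; only the numbers (4KB pages, 4- or 8-byte entries) come from
the explanation above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* low 12 bits are always the offset into a 4KB page */

/* Number of entries that fit in one 4KB page table. */
unsigned int entries_per_table(unsigned int entry_bytes)
{
    assert(entry_bytes == 4 || entry_bytes == 8);
    return (1u << PAGE_SHIFT) / entry_bytes;
}

/* Number of virtual-address bits used to index one page-table level. */
unsigned int index_bits(unsigned int entry_bytes)
{
    unsigned int n = entries_per_table(entry_bytes);
    unsigned int bits = 0;

    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* A big page folds the last-level index bits into the page offset. */
uint64_t big_page_size(unsigned int entry_bytes)
{
    return UINT64_C(1) << (index_bits(entry_bytes) + PAGE_SHIFT);
}
```

With 8-byte entries this gives 512 entries, 9 index bits and 2MB big
pages; with 4-byte entries it gives 1024 entries, 10 index bits and 4MB
big pages.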

The first big if-statement in the code determines if we can map a big
page or not. Note that the first mapping(s) may be small pages, then a
big page, followed by one or more small pages, if for instance we map an
area that isn't big-page aligned and isn't a whole number of big pages.
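
As a rough sketch of that decision (not the actual Xen code), the loop
below walks a range and counts how many small and big mappings it would
need. It assumes 2MB big pages, and it assumes that both the virtual
address and the mfn must be big-page aligned and that at least a full
big page remains. The names plan_mapping and struct map_plan are made up
for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define SUPERPAGE_SHIFT 21  /* assumption: 2MB big pages (PAE / x86_64) */
#define SUPERPAGE_PAGES (UINT64_C(1) << (SUPERPAGE_SHIFT - PAGE_SHIFT))

/* Hypothetical result type: how many mappings of each size are needed. */
struct map_plan {
    uint64_t small;
    uint64_t big;
};

struct map_plan plan_mapping(uint64_t virt, uint64_t mfn, uint64_t nr_pages)
{
    struct map_plan plan = { 0, 0 };

    assert((virt & ((UINT64_C(1) << PAGE_SHIFT) - 1)) == 0);

    while (nr_pages != 0) {
        /* Both the virtual page number and the mfn must be aligned. */
        int aligned =
            (((virt >> PAGE_SHIFT) | mfn) & (SUPERPAGE_PAGES - 1)) == 0;

        if (aligned && nr_pages >= SUPERPAGE_PAGES) {
            /* Big page: one entry covers SUPERPAGE_PAGES small pages. */
            plan.big++;
            virt     += SUPERPAGE_PAGES << PAGE_SHIFT;
            mfn      += SUPERPAGE_PAGES;
            nr_pages -= SUPERPAGE_PAGES;
        } else {
            /* Small page: advance by a single 4KB page. */
            plan.small++;
            virt     += UINT64_C(1) << PAGE_SHIFT;
            mfn      += 1;
            nr_pages -= 1;
        }
    }
    return plan;
}
```

For example, a range that starts one page below a 2MB boundary and is
515 pages long comes out as one small page, one big page, and then two
more small pages.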

We first save the old value of the page-table entry.
Then we make up the new page-table entry from the mfn; setting _PAGE_PSE
marks it as a BIG page.
Then we check if the old page-table entry was in use (_PAGE_PRESENT being
set - i.e. there is a mapping for this entry). If so, we flush the TLB to
make sure that the NEW mapping is read in, rather than some old page(s).
If the previous mapping wasn't a big page, we should also free the page
that held the lowest-level page-table entries it used. [I'm not sure if
the code trying to free the page is correct - I think it should use
ol2e, but I could well be wrong.]
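
A minimal sketch of those three steps, assuming 64-bit entries. The bit
positions of _PAGE_PRESENT (bit 0) and _PAGE_PSE (bit 7) are the
architectural ones; the helper names and the pte_t typedef are invented
here and are not Xen's own.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12
#define _PAGE_PRESENT UINT64_C(0x001)  /* bit 0: entry maps something */
#define _PAGE_PSE     UINT64_C(0x080)  /* bit 7: entry maps a big page */

typedef uint64_t pte_t;  /* hypothetical stand-in for an L2 entry */

/* Build a big-page entry from an mfn and the low flag bits. */
pte_t make_big_pte(uint64_t mfn, uint64_t flags)
{
    assert((flags & ~UINT64_C(0xfff)) == 0);
    return (mfn << PAGE_SHIFT) | flags | _PAGE_PSE;
}

/* The TLB only needs flushing if the old entry was a live mapping. */
int needs_tlb_flush(pte_t old)
{
    return (old & _PAGE_PRESENT) != 0;
}

/* If the old entry pointed at a lower-level table (present, not PSE),
 * that table is no longer referenced and should be freed. */
int old_table_should_be_freed(pte_t old)
{
    return (old & _PAGE_PRESENT) != 0 && (old & _PAGE_PSE) == 0;
}
```

Note that the last check looks at the OLD entry, which is the point of
my remark about ol2e above.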

ELSE - we're mapping small pages. 
This is essentially the same mechanism, but we walk across every 4KB
instead of 2/4MB of the virtual address. 

We first check if there is a valid entry in the second level page table.
If not, we create one. This happens only once per 2^9 or 2^10 pages
mapped, so it is not very common.

Then we check if the second level entry used to be a big page. If so, we
clean it up and copy the mapping down from the upper level into the
lower-level page-table entries - this is useful if a single page in a
big slab is being remapped to another physical page, since we still want
the rest of the big slab to remain where it was...
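
Here is a sketch of what "copying down" a big page could look like,
assuming 2MB big pages with 512 lower-level entries, and ignoring
everything except the low 12 flag bits (so no NX or PAT handling). The
function names are hypothetical and this is not the Xen implementation.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define L1_ENTRIES 512                /* assumption: 8-byte entries */
#define _PAGE_PSE  UINT64_C(0x080)    /* bit 7: entry maps a big page */
#define FLAG_MASK  UINT64_C(0xfff)    /* simplification: low flags only */

typedef uint64_t pte_t;  /* hypothetical stand-in for a page-table entry */

/* Fill a lower-level table so it maps the same memory as the big page. */
void shatter_big_page(pte_t big, pte_t l1[L1_ENTRIES])
{
    uint64_t base_mfn = big >> PAGE_SHIFT;
    uint64_t flags = (big & FLAG_MASK) & ~_PAGE_PSE;
    unsigned int i;

    assert(big & _PAGE_PSE);

    /* Entry i maps the i-th 4KB page of the old big page. */
    for (i = 0; i < L1_ENTRIES; i++)
        l1[i] = ((base_mfn + i) << PAGE_SHIFT) | flags;
}

/* Remap a single 4KB page; the other entries are left untouched. */
void remap_one(pte_t l1[L1_ENTRIES], unsigned int idx, uint64_t new_mfn)
{
    assert(idx < L1_ENTRIES);
    l1[idx] = (new_mfn << PAGE_SHIFT) | (l1[idx] & FLAG_MASK);
}
```

After shatter_big_page() the second level entry would be pointed at the
new table instead of at the big page, and remap_one() can then change
one page while the rest of the slab stays where it was.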

Again, a check is made to see whether each page was present before, and
the TLB is flushed each time it was.

Please feel free to ask questions, I'm sure there are things I haven't
explained exactly right, and I'm assuming anyone reading this has a
basic understanding of page-tables. If not, I suggest that you get a
couple of books on the subject [yes, a couple, because one book may not
explain some aspect very clearly, and having another author's
explanation of the same aspect will then help you understand the matter
better]. You can use the freely available AMD64 Architecture
Programmer's Manual Volume 2 as one reference:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

Books about the x86 architecture are available from, for example,
Amazon, and I can't really say which particular book is better than
another. I've read books from MindShare that are usually pretty good.

--
Mats

> 
> thanks in advance.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
> 
> 





 

