
GMM - The Global Memory Manager

As the name indicates, this component is responsible for global memory management. Currently only guest RAM is allocated from the GMM, but this may change to include shadow page tables and other bits later.

Guest RAM is managed as individual pages, but allocated from the host OS in chunks for reasons of portability / efficiency. To minimize the memory footprint all tracking structure must be as small as possible without unnecessary performance penalties.

The allocation chunks have a fixed size, defined at compile time by the GMM_CHUNK_SIZE #define.

Each chunk is given a unique ID. Each page also has a unique ID. The relationship between the two IDs is:

  GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
  idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
Where iPage is the index of the page within the chunk. This ID scheme permits efficient chunk and page lookup, but it relies on the chunk size being set at compile time. The chunks are organized in an AVL tree with their IDs as the keys.
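
As a minimal illustration of this ID scheme, the sketch below composes and decomposes page IDs. The concrete PAGE_SIZE and GMM_CHUNK_SIZE values and the helper names are assumptions made for the example, not the values or names used by the real code.

  /* Illustrative sketch of the chunk/page ID scheme described above. */
  #include <stdint.h>

  #define PAGE_SIZE        0x1000U             /* 4KB pages (assumed). */
  #define GMM_CHUNK_SIZE   (2U*1024U*1024U)    /* 2MB chunks (assumed). */
  #define GMM_CHUNK_SHIFT  9                   /* log2(GMM_CHUNK_SIZE / PAGE_SIZE) = log2(512). */

  /* Compose a global page ID from a chunk ID and the page index within the chunk. */
  static inline uint32_t gmmMakePageId(uint32_t idChunk, uint32_t iPage)
  {
      return (idChunk << GMM_CHUNK_SHIFT) | iPage;
  }

  /* Split a global page ID back into the chunk ID used as the AVL tree key... */
  static inline uint32_t gmmPageIdToChunkId(uint32_t idPage)
  {
      return idPage >> GMM_CHUNK_SHIFT;
  }

  /* ...and the index of the page within that chunk. */
  static inline uint32_t gmmPageIdToPageIndex(uint32_t idPage)
  {
      return idPage & ((1U << GMM_CHUNK_SHIFT) - 1);
  }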

The physical address of each page in an allocation chunk is maintained by the RTR0MEMOBJ and obtained using RTR0MemObjGetPagePhysAddr. There is no need to duplicate this information (it would cost 8 bytes per page if we did).
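
Looking up a page's host physical address could then be a matter of deriving the in-chunk index from the page ID and asking the chunk's memory object for it, roughly as sketched below. The chunk structure and helper are hypothetical stand-ins (reusing GMM_CHUNK_SHIFT from the sketch above); RTR0MemObjGetPagePhysAddr is the IPRT call named in the text and returns NIL_RTHCPHYS on failure.

  #include <iprt/types.h>
  #include <iprt/memobj.h>

  /* Hypothetical stand-in for the real chunk structure. */
  typedef struct MYGMMCHUNK
  {
      RTR0MEMOBJ  hMemObj;    /* The ring-0 memory object backing this chunk's pages. */
      /* ... per-page tracking, free lists, AVL tree node, etc. ... */
  } MYGMMCHUNK;

  /* Resolve the host physical address of a page, given its chunk and global page ID. */
  static RTHCPHYS myGmmPageIdToHCPhys(MYGMMCHUNK *pChunk, uint32_t idPage)
  {
      uint32_t iPage = idPage & ((1U << GMM_CHUNK_SHIFT) - 1);    /* index within the chunk */
      return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);   /* NIL_RTHCPHYS on failure */
  }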

So what do we need to track per page? Most importantly we need to know which state the page is in: private, shared or free.

For the page replacement operations (sharing, defragmenting and freeing) to be somewhat efficient, private pages need to be associated with a particular page in a particular VM.

Tracking the usage of shared pages is impractical and expensive, so we'll settle for a reference counting system instead.

Free pages will be chained on LIFOs.

On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit systems a 32-bit bitfield will have to suffice because of address space limitations. The GMMPAGE structure shows the details.
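
To make the idea concrete, a 64-bit per-page record along these lines might look roughly like the union below. The field names, widths and layout are assumptions for illustration only; the real GMMPAGE definition is authoritative.

  #include <stdint.h>

  /* Illustrative 64-bit per-page tracking record; not the actual GMMPAGE layout. */
  typedef union MYGMMPAGE
  {
      /* Private page: tied to one particular guest page of one particular VM. */
      struct
      {
          uint64_t    pfn      : 36;  /* Guest page frame number it backs. */
          uint64_t    hGVM     : 16;  /* Handle of the owning VM. */
          uint64_t    reserved : 10;
          uint64_t    u2State  : 2;   /* Page state (private). */
      } Private;

      /* Shared page: usage is only reference counted. */
      struct
      {
          uint64_t    cRefs    : 62;  /* Reference count. */
          uint64_t    u2State  : 2;   /* Page state (shared). */
      } Shared;

      /* Free page: chained on the chunk's LIFO of free pages. */
      struct
      {
          uint64_t    iNext    : 16;  /* Index of the next free page in the chunk, or an end marker. */
          uint64_t    reserved : 46;
          uint64_t    u2State  : 2;   /* Page state (free). */
      } Free;

      uint64_t        u;              /* The whole word, for quick state checks. */
  } MYGMMPAGE;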

Page Allocation Strategy

The strategy for allocating pages has to take fragmentation and shared pages into account, or we may end up with 2000 chunks with only a few pages in each. Shared pages cannot easily be reallocated because of the inaccurate usage accounting (see above). Private pages can be reallocated by a defragmentation thread in the same manner that sharing is done.

The first approach is to manage the free pages in two sets depending on whether they are mainly for the allocation of shared or private pages. In the initial implementation there will be almost no possibility for mixing shared and private pages in the same chunk (only if we're really stressed on memory), but when we implement forking of VMs and have to deal with lots of COW pages it'll start getting kind of interesting.

The sets are lists of chunks with approximately the same number of free pages. Say the chunk size is 1MB, meaning 256 pages, and a set consists of 16 lists; each list then covers a span of 16 pages. So, the first list will contain the chunks with 1-15 free pages, the second covers 16-31, and so on. The chunks will be moved between the lists as pages are freed up or allocated.
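
A minimal sketch of this bucketing, under the same assumed numbers (1MB chunks, 256 pages, 16 lists per set); the constants and names are hypothetical:

  /* Map a chunk's current free-page count to the list it belongs on. */
  #define MY_CHUNK_PAGES     256U    /* 1MB chunk / 4KB pages, as in the example above. */
  #define MY_FREE_SET_LISTS  16U     /* Lists per set; each covers a span of 16 pages. */

  static inline unsigned myGmmFreeListIndex(unsigned cFreePages)
  {
      unsigned iList = cFreePages / (MY_CHUNK_PAGES / MY_FREE_SET_LISTS);
      /* 1-15 free pages -> list 0, 16-31 -> list 1, ..., a fully free chunk clamps to the last list. */
      return iList < MY_FREE_SET_LISTS ? iList : MY_FREE_SET_LISTS - 1;
  }

Whenever an allocation or free crosses one of these boundaries, the chunk is unlinked from its current list and relinked on the one this index selects.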

Costs

The per page cost in kernel space is 32 bits plus whatever RTR0MEMOBJ entails. In addition there is the chunk cost of approximately (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^GMM_CHUNK_SHIFT bytes per page.

On Windows the per page RTR0MEMOBJ cost is 32 bits on 32-bit Windows and 64 bits on 64-bit Windows (a PFN_NUMBER in the MDL). So, at most 64 bits per page. The cost on Linux is identical, but there it is because of sizeof(struct page *).
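
As a rough worked example of the chunk-level formula above, using purely illustrative structure sizes (not the real sizeof() values):

  #include <stdio.h>

  int main(void)
  {
      unsigned const cPagesPerChunk = 256;   /* 1MB chunk / 4KB pages, as in the set example. */
      unsigned const cbMemObj       = 512;   /* assumed fixed part of RTR0MEMOBJ; the per-page PFN array is counted separately above. */
      unsigned const cbChunk        = 256;   /* assumed size of the GMM chunk structure. */

      /* (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^GMM_CHUNK_SHIFT */
      printf("chunk overhead: %.1f bytes/page\n", (double)(cbMemObj + cbChunk) / cPagesPerChunk);
      return 0;
  }

With these assumed numbers the chunk-level overhead amortizes to about 3 bytes per page, on top of the per-page tracking word and the PFN entry discussed above.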

Legacy Mode for Non-Tier-1 Platforms

In legacy mode the page source is locked user pages rather than RTR0MemObjAllocPhysNC, which means that a page can only be allocated by the VM that locked it. We will make no attempt at implementing page sharing on these systems; we will just do enough to make it all work.
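
For contrast, in the non-legacy case the backing for one chunk could be obtained roughly as below. The helper name is hypothetical; RTR0MemObjAllocPhysNC is the IPRT call named in the text.

  #include <iprt/types.h>
  #include <iprt/memobj.h>

  /* Allocate the host pages backing one chunk.  In legacy mode the memory object
   * would instead be created by locking down pages of the calling VM process,
   * so the resulting pages can only back that VM. */
  static int myGmmAllocChunkBacking(PRTR0MEMOBJ phMemObj, size_t cb /* = GMM_CHUNK_SIZE */)
  {
      /* Physically backed, not necessarily contiguous, no upper address restriction. */
      return RTR0MemObjAllocPhysNC(phMemObj, cb, NIL_RTHCPHYS);
  }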

Serializing

One simple fast mutex will be employed in the initial implementation, not two as mentioned in Serializing Access.

See also:
Serializing Access
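
A minimal sketch of that single-mutex approach using the IPRT fast mutex API; the GMM instance structure and function names here are hypothetical.

  #include <iprt/semaphore.h>
  #include <iprt/err.h>

  typedef struct MYGMM
  {
      RTSEMFASTMUTEX  hMtx;   /* The one lock serializing all GMM operations. */
      /* ... chunk AVL tree, free sets, ID bitmaps, etc. ... */
  } MYGMM;

  static int myGmmInit(MYGMM *pGMM)
  {
      return RTSemFastMutexCreate(&pGMM->hMtx);
  }

  static int myGmmSomeOperation(MYGMM *pGMM)
  {
      int rc = RTSemFastMutexRequest(pGMM->hMtx);    /* take the giant GMM lock */
      if (RT_SUCCESS(rc))
      {
          /* ... manipulate chunks, pages and sets here ... */
          RTSemFastMutexRelease(pGMM->hMtx);
      }
      return rc;
  }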

Memory Over-Commitment Management

The GVM will have to do the system-wide memory over-commitment management.

There are some challenges here; the main ones are configurability and security. Should we, for instance, permit anyone to request 100% memory commitment? Who should be allowed to make runtime adjustments to the configuration? And how do we prevent these settings from being lost when the last VM process exits? The solution is probably to have an optional root daemon that will keep VMMR0.r0 in memory and enable the security measures.

NUMA

NUMA considerations will be designed and implemented a bit later.

The preliminary guess is that we will have to try to allocate memory as close as possible to the CPUs the VM is executing on (the EMT and additional CPU threads), which means it is mostly about allocation and sharing policies. Both the scheduler and the allocator interface will have to supply some NUMA info, and we'll need a way to calculate access costs.

