Todo

Todo List

Documentation

Usage (interface, etc.)

User Memory Hotplug

Removable attribute (by Yasunori Goto)

Defining a removable area makes memory easier to remove. For example, if kernel memory is scattered across every node, removing any one node becomes very difficult. If kernel memory is collected in one place, hot-removing memory elsewhere becomes much easier than it would be otherwise.

Improvement of performance

Avoiding performance deterioration caused by memory hotplug code

Much of the memory hotplug code can degrade performance.
This overhead should be kept as small as possible.

Performance of memory hot-remove

If there is sticky memory that cannot be migrated (or swapped out), hot-remove will take a long time. At the very least, removing memory should take less time than rebooting a huge machine.

Network trouble

In the NFS case, if the server does not respond, the file system holds HIGHMEM pages and waits forever. There are probably many similar cases when network trouble occurs.

Interfaces for drivers

Some drivers, such as InfiniBand, use user memory as a DMA buffer. These drivers have to support page migration. This interface would call the driver-specific work needed for that.

Mem_section

Specifying mem_section size by boot option

Reconfiguring and recompiling the kernel is not convenient for users.

Page migration

Handling VM_NONLINEAR pages

The sys_remap_file_pages() system call creates VM_NONLINEAR pages. The migration code cannot find the PTEs of these pages, because object-based rmap (objrmap) does not handle them.

Implementation of page migration for each filesystem

Each filesystem uses page->private for its own purposes. So, when page migration occurs, each filesystem has to move its own private data.
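One way to picture this requirement is a per-filesystem callback that the migration code invokes for each page. The following is a minimal userspace sketch, not kernel code: struct page_stub, struct fs_migrate_ops, and both functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the private field matters here. */
struct page_stub {
    void *private_data;
};

/* Hypothetical per-filesystem hook for moving private data between pages. */
struct fs_migrate_ops {
    int (*migrate_private)(struct page_stub *from, struct page_stub *to);
};

/* Simplest possible hook: hand the private pointer over to the new page. */
static int simple_migrate_private(struct page_stub *from, struct page_stub *to)
{
    to->private_data = from->private_data;
    from->private_data = NULL;
    return 0;
}

/* Generic migration step: delegate to the filesystem, or fail if it has no hook. */
static int migrate_page_private(const struct fs_migrate_ops *ops,
                                struct page_stub *from, struct page_stub *to)
{
    assert(from != NULL && to != NULL);
    if (ops == NULL || ops->migrate_private == NULL)
        return -1; /* this filesystem has not implemented migration */
    return ops->migrate_private(from, to);
}
```

A filesystem whose private data is more than a single pointer would need a more elaborate hook. The point of the sketch is only that the generic code cannot do this work by itself.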

Sysfs and API

Correlation between ACPI memory object and mem_section

A mem_section is just a logical definition. To remove memory physically, ACPI information is also necessary.

Hotplug agent for memory hotplug

Hotplug of all the memory on a node

Arrangement of pgdat and mem_map

For reasons of performance and size, pgdat and mem_map should be placed on the node they describe. They have to be allocated before initialization and can be freed only after the pages they describe have been freed. If a mem_section includes the pgdat area, that mem_section must not be removed until every other mem_section on the same node has been removed.
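The ordering rule for the section that holds pgdat can be written as a small check. This is a minimal userspace sketch under assumed data structures: struct section_info and can_remove_section() are hypothetical and do not correspond to existing kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical section descriptor used only for this illustration. */
struct section_info {
    int node;         /* node the section belongs to */
    bool present;     /* section is still online */
    bool holds_pgdat; /* section contains the node's pgdat/mem_map */
};

/* Return true if section idx may be removed now. */
static bool can_remove_section(const struct section_info *secs, size_t nr, size_t idx)
{
    assert(idx < nr);
    if (!secs[idx].present)
        return false; /* already removed */
    if (!secs[idx].holds_pgdat)
        return true;  /* ordinary sections carry no ordering constraint */

    /* The pgdat section must outlive every other section on its node. */
    for (size_t i = 0; i < nr; i++) {
        if (i != idx && secs[i].present && secs[i].node == secs[idx].node)
            return false;
    }
    return true;
}
```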

Disabling page allocation from mem_sections on the node being removed, and graceful removal

Many pages will have to be migrated at node hotplug time, and this should take as little time as possible. In the case of a scheduled hotplug for capacity on demand, if allocation from the mem_sections to be removed is disabled beforehand, the hot-remove will be graceful.
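The idea of disabling allocation ahead of time can be sketched as a flag that the allocator checks before using a section. This is a userspace illustration only: struct section_alloc_state, its fields, and pick_section_for_alloc() are assumptions, not an existing interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-section allocator state. */
struct section_alloc_state {
    bool alloc_disabled;      /* set ahead of a scheduled removal */
    unsigned long free_pages; /* pages currently free in the section */
};

/* Return the index of the first section that may serve an allocation, or -1. */
static int pick_section_for_alloc(const struct section_alloc_state *secs, size_t nr)
{
    assert(secs != NULL || nr == 0);
    for (size_t i = 0; i < nr; i++) {
        if (secs[i].alloc_disabled)
            continue; /* no new pages here, so fewer pages to migrate later */
        if (secs[i].free_pages > 0)
            return (int)i;
    }
    return -1;
}
```

Because no new pages land in a disabled section, the number of pages left to migrate at removal time can only shrink as existing pages are freed.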

Selection of the migration destination

The destination of a migration should be chosen carefully, because memory access speed should be preserved as far as possible. If only one mem_section on a node is removed, its contents should be moved elsewhere on the same node. However, if all of the mem_sections on a node have to be removed, their contents have to be moved to another node.

Architecture/CPU-dependent code
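The policy described here can be sketched as follows. This is a minimal userspace illustration: struct node_state and pick_destination_node() are hypothetical, and the fallback simply takes the first other node with remaining sections, whereas a real implementation would presumably prefer the nearest node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node summary; not a kernel structure. */
struct node_state {
    int nr_sections;          /* sections present on the node */
    int nr_sections_removing; /* sections marked for removal */
};

/* Return the node that should receive pages migrated off src_node,
 * or -1 if no node has a section that is staying. */
static int pick_destination_node(const struct node_state *nodes, size_t nr_nodes,
                                 size_t src_node)
{
    assert(src_node < nr_nodes);

    /* Prefer the same node while at least one of its sections is staying. */
    if (nodes[src_node].nr_sections > nodes[src_node].nr_sections_removing)
        return (int)src_node;

    /* The whole node is going away: fall back to another node. */
    for (size_t i = 0; i < nr_nodes; i++) {
        if (i != src_node && nodes[i].nr_sections > nodes[i].nr_sections_removing)
            return (int)i;
    }
    return -1;
}
```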

IA64 (by Fujitsu & HP)

mem_section size

Appropriate size of mem_section for IA64 should be researched.

To handle memory holes easily, the mem_section size should be smaller than the hole size. However, if it is too small, the mem_section array becomes too big when handling TB-class memory. And if the size is smaller than GRANULE_SIZE or MAX_ORDER, memory hotplug becomes more difficult.

TLB fault handler with mem_section for region 6/7

In the current implementation, a virtual address in region 6/7 is calculated from registers alone (no memory access). A kernel with mem_section, however, must read memory in the TLB handler. If the mem_section size is too small, the section array will be big and TLB faults will occur on it, which degrades performance.
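The array-size side of this trade-off is simple arithmetic. The helper below is a userspace sketch for illustration; nr_sections() is a hypothetical name and the sizes used in the note after it are examples, not proposed values.

```c
#include <assert.h>
#include <stdint.h>

/* Number of mem_section entries needed to cover total_bytes of physical
 * address space when each section spans (1 << section_shift) bytes. */
static uint64_t nr_sections(uint64_t total_bytes, unsigned int section_shift)
{
    assert(section_shift < 64);
    uint64_t section_size = (uint64_t)1 << section_shift;

    /* Round up so that a partially filled last section is still counted. */
    return (total_bytes + section_size - 1) >> section_shift;
}
```

For example, covering 1 TB with 16 MB sections (shift 24) takes 65536 entries, while 1 GB sections (shift 30) take only 1024.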
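The difference between the two schemes can be illustrated in userspace C. This sketch assumes that region 7 is the identity-mapped kernel region starting at 0xE000000000000000 and that a section-based kernel would look up a per-section base in a table. The table layout and the function names are assumptions made for illustration, not the actual handler.

```c
#include <assert.h>
#include <stdint.h>

#define REGION_SHIFT 61                     /* top three bits select the region */
#define REGION7_BASE 0xE000000000000000ULL  /* assumed base of region 7 */

static unsigned int region_of(uint64_t vaddr)
{
    return (unsigned int)(vaddr >> REGION_SHIFT);
}

/* Current scheme: pure arithmetic on the address, no memory access. */
static uint64_t region7_to_phys(uint64_t vaddr)
{
    assert(region_of(vaddr) == 7);
    return vaddr - REGION7_BASE;
}

/* Section-based scheme (hypothetical): one table read per translation.
 * section_base[i] holds the physical base assigned to section i. */
static uint64_t section_to_phys(const uint64_t *section_base, uint64_t nr_sections,
                                unsigned int section_shift, uint64_t vaddr)
{
    assert(region_of(vaddr) == 7);
    uint64_t offset = vaddr - REGION7_BASE;
    uint64_t idx = offset >> section_shift;

    assert(idx < nr_sections);
    /* This load is the extra memory access that can itself miss in the TLB. */
    return section_base[idx] + (offset & (((uint64_t)1 << section_shift) - 1));
}
```

The smaller the section size, the larger section_base becomes, and the more likely it is that the table read in the second function misses.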

ppc (by IBM)

x86_64 (EM64T) (by Intel)

etc ?

Kernel memory Hotplug (Future Work)

Some hardware can change the destination of DMA to support memory migration. When memory is migrated, the driver has to update this destination too.
PTE, PMD, PGD, and page table pages must themselves be moved (if necessary).
vmalloc areas and kernel text need rmap for migration.
And so on (there are many more).

Related development

Memory defragmentation (by Marcelo Tosatti)

A migration cache is now being implemented.
(Hugetlb page migration will use this implementation.)