Todo List


User Memory Hotplug

 To remove memory easily, it helps to define a removable area. For example, if kernel memory is scattered across every node, removing any node becomes very difficult. If kernel memory is gathered in one place, hot-removing the other areas is easier than it would be with no such arrangement.
 Much of the memory hotplug code can degrade performance. This overhead should be kept as small as possible.
 If there is sticky memory that cannot be migrated (or swapped out), hot remove will take a long time. At the least, removing memory should take less time than rebooting a huge machine.
 In the NFS case, if the server does not respond, the file system holds on to HIGHMEM pages and waits forever. There are probably many similar cases when network trouble occurs.
 Some drivers, such as Infiniband, use user memory as a DMA buffer. These drivers have to support page migration. An interface is needed so that the migration code can call the driver-specific work.


Configuring and compiling the kernel is not convenient for users.

Page migration

 The sys_remap_file_pages() system call creates VM_NONLINEAR mappings. The mmigrate code cannot find the PTEs of pages in these mappings, because the object-based rmap (objrmap) does not handle them.
 Each filesystem uses page->private for its own purposes. So, when a page is migrated, each filesystem has to move its private data.
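
 The page->private requirement above could be met with a per-filesystem hook that the migration code calls for each page. The following is only a minimal user-space sketch of that idea. struct fake_page, move_private_fn, default_move_private() and migrate_page_private() are all hypothetical names made up for illustration; they are not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the private field matters here. */
struct fake_page {
    unsigned long private;
};

/* Hypothetical per-filesystem hook that moves private data to the new page. */
typedef int (*move_private_fn)(struct fake_page *newpage,
                               struct fake_page *oldpage);

/* Simplest possible hook: hand the private value over unchanged. */
int default_move_private(struct fake_page *newpage, struct fake_page *oldpage)
{
    newpage->private = oldpage->private;
    oldpage->private = 0;
    return 0;
}

/*
 * Called by the (hypothetical) migration code for each page.
 * Returns 0 on success, -1 if the page carries private data but the
 * filesystem offers no hook, in which case the page cannot be migrated.
 */
int migrate_page_private(struct fake_page *newpage, struct fake_page *oldpage,
                         move_private_fn fs_hook)
{
    assert(newpage != NULL && oldpage != NULL);
    if (oldpage->private == 0)
        return 0;               /* nothing to move */
    if (fs_hook == NULL)
        return -1;              /* private data, but nobody knows how to move it */
    return fs_hook(newpage, oldpage);
}
```

 A filesystem whose private data is a pointer to its own structures would need a hook that also fixes up back-references to the old page; the default hook above only covers the case where the value can be copied as is.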

Sysfs and API

 A mem_section is just a logical definition. To remove memory physically, ACPI information is also necessary.

Hotplug of all the memory on a node

 The pgdat and mem_map should be placed on each node for reasons of performance and size. They have to be allocated before initialization and freed only after the pages they describe have been freed. If a mem_section contains the pgdat area, it must not be removed before the other mem_sections on the same node are removed.
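
 The ordering rule above (the mem_section holding the pgdat goes last on its node) can be expressed as a simple check. This is a minimal sketch only: struct section_info and section_removable() are hypothetical names for illustration and do not reflect the kernel's struct mem_section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of one memory section. */
struct section_info {
    int  node;         /* node the section belongs to */
    bool present;      /* true while the section has not been removed */
    bool holds_pgdat;  /* true if the section contains the node's pgdat area */
};

/*
 * Return true if section idx may be removed now.
 * A section holding the pgdat is removable only when no other
 * section on the same node is still present.
 */
bool section_removable(const struct section_info *secs, size_t n, size_t idx)
{
    assert(idx < n);
    if (!secs[idx].present)
        return false;            /* already removed */
    if (!secs[idx].holds_pgdat)
        return true;             /* ordinary section: no ordering constraint */
    for (size_t i = 0; i < n; i++) {
        if (i != idx && secs[i].present && secs[i].node == secs[idx].node)
            return false;        /* another section on this node still exists */
    }
    return true;
}
```
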
 Many pages will have to be migrated at node hotplug time, and this should take as little time as possible. In the scheduled hotplug case for capacity on demand, hot remove will go gracefully if allocation from the mem_sections to be removed is disabled beforehand.
 The migration destination should be chosen appropriately, so that memory access speed is preserved as far as possible. If one mem_section on a node is removed, its contents should be moved within the same node. However, if all mem_sections on the node have to be removed, their contents have to be moved to another node.
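
 The destination rule above can be sketched as follows. The text only says the contents go to "another node" when the whole node is removed; choosing the nearest node by a distance table is an assumption added here to keep access speed as high as possible. pick_dest_node(), NO_NODE and the row-major distance matrix are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <limits.h>

#define NO_NODE (-1)   /* hypothetical marker: no destination available */

/*
 * Choose the node that should receive migrated pages.
 *   src                - node that owns the mem_section being removed
 *   whole_node_removed - nonzero if every mem_section on src is going away
 *   distance           - row-major nr_nodes x nr_nodes table, smaller is closer
 *                        (an assumed input, similar in spirit to a NUMA
 *                        distance table)
 */
int pick_dest_node(int src, int whole_node_removed,
                   const int *distance, int nr_nodes)
{
    assert(nr_nodes > 0 && src >= 0 && src < nr_nodes);

    /* Only part of the node is removed: stay on the same node. */
    if (!whole_node_removed)
        return src;

    /* The whole node is removed: pick the closest other node. */
    int best = NO_NODE;
    int best_dist = INT_MAX;
    for (int n = 0; n < nr_nodes; n++) {
        if (n == src)
            continue;
        int d = distance[src * nr_nodes + n];
        if (d < best_dist) {
            best_dist = d;
            best = n;
        }
    }
    return best;
}
```

 A real policy would also have to check that the chosen node has enough free memory and is not itself scheduled for removal; the sketch leaves that out.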

Architecture/CPU depending code

IA64 (by Fujitsu & HP)

 The appropriate mem_section size for IA64 should be researched.
 To handle memory holes easily, the mem_section size should be smaller than the hole size. However, if it is too small, the mem_section array becomes large when handling TB-class memory. And if it is smaller than GRANULE_SIZE or the MAX_ORDER block size, memory hotplug becomes more difficult.
 In the current implementation, the virtual address for region 6/7 is calculated from a register alone (no memory access). A kernel with mem_section, however, must read memory in the TLB handler. If the mem_section size is too small, the section array will be big and TLB faults will occur, which degrades performance.
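
 The size trade-off above is simple arithmetic, sketched below. The section sizes and the 8-byte entry size used in the usage note are illustrative assumptions, not values taken from the IA64 code; nr_sections() and section_array_bytes() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of mem_section entries needed to cover mem_bytes of memory
 * when each section spans 2^section_shift bytes (rounded up).
 */
uint64_t nr_sections(uint64_t mem_bytes, unsigned section_shift)
{
    assert(section_shift < 64);
    uint64_t mask = (UINT64_C(1) << section_shift) - 1;
    return (mem_bytes >> section_shift) + ((mem_bytes & mask) != 0);
}

/* Size of the section array in bytes, for an assumed per-entry size. */
uint64_t section_array_bytes(uint64_t mem_bytes, unsigned section_shift,
                             uint64_t entry_bytes)
{
    return nr_sections(mem_bytes, section_shift) * entry_bytes;
}
```

 For example, covering 1 TB with 16 MB sections (shift 24) needs 65536 entries, which is 512 KB of array at an assumed 8 bytes per entry, while 1 GB sections (shift 30) need only 1024 entries. The larger array is the one more likely to miss in the TLB when it is read from the TLB handler.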

ppc (by IBM)

x86_64 (EM64T) (by Intel)

etc ?

Kernel memory Hotplug (Future Work)

Related development

A migration cache is now being implemented.
(Hugetlbpage migration will use this implementation.)