8. Memory Allocation
- What is the goal of memory allocation?
  - allocate memory space for processes
  - static (compile-time) allocation
    - Why is static allocation not always a good choice?
  - dynamic allocation
    - stack allocation
    - heap allocation
- stack allocation
  - allocation happens on contiguous blocks of memory
  - temporary memory allocation:
    - memory is allocated and de-allocated automatically as soon as the corresponding method finishes executing
    - data stored there can only be accessed by the owner thread
  - When will we use stack allocation?
    - when allocation and free are partially predictable
    - examples: recursion, procedure call frames
  - pros:
    - keeps all the free space contiguous
  - cons:
    - not appropriate for all data structures
- heap allocation
  - memory is allocated during execution by instructions the programmer writes
  - called a heap because it is a pile of memory space available to programmers to allocate and de-allocate
  - data stored in the heap is accessible/visible to all threads
  - memory leaks can occur
    - the programmer allocates memory on the heap and forgets to delete it
    - this reduces performance by reducing the amount of available memory
  - no automatic de-allocation
    - need a garbage collector to remove old, unused objects
  - not as thread-safe as stack memory (why?)
  - resizing is possible
  - the main issue in heap memory is fragmentation
- fragmentation
  - internal fragmentation
    - space wasted when you round an allocation up
  - external fragmentation
    - you end up with small chunks of free memory that are too small to be useful
    - memory is full of little holes of free space
    - no single contiguous space can satisfy the request
    - compaction is expensive
  - when does external fragmentation occur?
    - when the free space consists of variable-sized units
- segregated list
  - a particular application has one or a few popular-sized requests
  - keep a separate list to manage objects of that size
  - all other requests are forwarded to a more general memory allocator
  - since each list is dedicated to one request size, fragmentation is much less of a concern
  - no complex search of a list -> fast allocation and free
- buddy allocator
  - fast, simple allocation for blocks that are 2^n bytes
  - allocation restriction
    - block size: 2^n
  - allocation strategy (see the first sketch after this outline)
    - round the allocation request up to the nearest 2^n
    - search the free list for the appropriate size
    - recursively divide larger blocks until a block of the correct size is reached
  - memory free
    - recursively coalesce a block with its buddy if the buddy is free
  - memory fragmentation
    - still leads to a few partially used reserved pages
    - does it have internal fragmentation? why? when?
- malloc() issues
  - how to implement malloc() or new?
    - call sbrk() to request more contiguous memory from the OS
    - keep a separate free list for each popular size (see the second sketch after this outline)
      - allocation is fast
      - when will this be inefficient?
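To make the rounding step of the buddy strategy concrete, here is a small, self-contained C sketch (the request sizes are arbitrary examples, not from the slides) that rounds each request up to the nearest power of two and reports the resulting internal fragmentation:
------------------------------------------------------------------
#include <stdio.h>
#include <stddef.h>

/* Round a request up to the nearest power of two, as a buddy
 * allocator does before searching its per-size free lists. */
static size_t round_up_pow2(size_t n)
{
    size_t block = 1;
    while (block < n)
        block <<= 1;
    return block;
}

int main(void)
{
    /* Arbitrary example request sizes, in bytes. */
    size_t requests[] = { 24, 100, 1000, 4097 };

    for (size_t i = 0; i < sizeof(requests) / sizeof(requests[0]); i++) {
        size_t block = round_up_pow2(requests[i]);
        /* The difference between block and request is internal
         * fragmentation: space reserved but never used. */
        printf("request %5zu -> block %5zu, internal fragmentation %5zu\n",
               requests[i], block, block - requests[i]);
    }
    return 0;
}
------------------------------------------------------------------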
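The "separate free list for each popular size" idea can be sketched in user space. The sketch below is illustrative only: it keeps a single free list for one assumed popular size (64 bytes), refills the list from the OS with sbrk(), and never returns memory to the OS; the names my_alloc()/my_free() and the batch size are made up for the example.
------------------------------------------------------------------
#include <stdio.h>
#include <unistd.h>   /* sbrk() */

#define OBJ_SIZE   64          /* one "popular" request size (assumed) */
#define BATCH      32          /* objects fetched from the OS at a time */

/* Each free object stores a pointer to the next free object in its
 * first bytes, so the free list itself costs no extra memory. */
struct free_node {
    struct free_node *next;
};

static struct free_node *free_list = NULL;

/* Refill the free list with BATCH objects obtained from sbrk(). */
static int refill(void)
{
    char *chunk = sbrk(OBJ_SIZE * BATCH);
    if (chunk == (void *)-1)
        return -1;                       /* out of memory */
    for (int i = 0; i < BATCH; i++) {
        struct free_node *n = (struct free_node *)(chunk + i * OBJ_SIZE);
        n->next = free_list;
        free_list = n;
    }
    return 0;
}

/* Allocation: pop the head of the free list -- no searching. */
static void *my_alloc(void)
{
    struct free_node *n;
    if (free_list == NULL && refill() != 0)
        return NULL;
    n = free_list;
    free_list = n->next;
    return n;
}

/* Free: push the object back on the list; the memory is reused
 * later but never handed back to the OS in this sketch. */
static void my_free(void *p)
{
    struct free_node *n = p;
    n->next = free_list;
    free_list = n;
}

int main(void)
{
    void *a = my_alloc();
    void *b = my_alloc();
    my_free(a);
    void *c = my_alloc();     /* reuses the block just freed */
    printf("a=%p b=%p c=%p (c should equal a)\n", a, b, c);
    my_free(b);
    my_free(c);
    return 0;
}
------------------------------------------------------------------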
- reclaim free memory
  - when can dynamically-allocated memory be freed?
    - when the program explicitly calls free()
    - it can't be recycled until all sharers are finished with it
  - two possible problems
    - dangling pointers
      - storage is recycled while it is still being used
      - when does a dangling pointer happen?
        - a pointer becomes dangling when it points to an object that has been deleted (de-allocated)
        - for example:
------------------------------------------------------------------
#include <stdlib.h>

int main(void)
{
    int *ptr = (int *)malloc(sizeof(int));
    *ptr = 560;
    free(ptr);    /* the memory is de-allocated here */
    /* ptr still holds the old address, so it is now dangling */
    ptr = NULL;   /* after this, ptr no longer points to the deleted memory */
    return 0;
}
------------------------------------------------------------------
        - after calling free(ptr), 'ptr' becomes dangling because it still points to the de-allocated memory
        - if we assign NULL to 'ptr', then 'ptr' no longer points to the deleted memory
          - assigning NULL to a pointer means the pointer does not point to any memory location
        - another example:
------------------------------------------------------------------
#include <stdio.h>

int *fun(void)
{
    int y = 10;
    return &y;    /* returns the address of a local variable */
}

int main(void)
{
    int *p = fun();
    printf("%d\n", *p);
    return 0;
}
------------------------------------------------------------------
        - What is the output of the above code? (the behavior is undefined)
        - the pointer p is a dangling pointer, why?
          - when control comes back to main(), the variable y is no longer available: its stack frame has already been popped
    - memory leak
      - forgetting to free storage even though it can never be used again (see the short example after this outline)
- Linux kernel allocator (see slide 27)
  - page allocator
    - allocates contiguous areas of physical pages (4 KB, 8 KB, ...)
    - buddy allocator strategy
      - only allocations of power-of-two numbers of pages
    - the allocated area is contiguous in the kernel virtual address space
      - and maps to physically contiguous pages
  - slab allocator
    - the Linux kernel has temporary kernel objects such as mm_struct, inode, ...
      - these temporary objects come in very small and very large sizes
      - they are allocated and freed often
    - the allocation of small memory blocks
      - two caches of small memory buffers
      - kmalloc() allocates objects from these small cache buffers
    - the caching of commonly used objects
    - what is a slab? (see slide 31)
      - a chunk of one or more contiguous pages, divided into equal-sized objects
      - objects are allocated from the slabs associated with a cache
    - cache chain
      - a variable number of caches linked on a doubly linked circular list
      - each cache manages the objects of one type/size
    - when a cache is created, a slab is allocated and divided into free objects
    - if a slab is full of used objects, the next object comes from a new, empty slab
  - vmalloc() allocator
    - used to obtain memory zones that are contiguous in the virtual address space
    - these areas cannot be used for DMA, since DMA usually requires physically contiguous buffers
  - vmalloc vs kmalloc (see the kernel-module sketch after this outline)
    - kmalloc uses the slab allocator or the buddy system to ask for physically contiguous memory
    - vmalloc uses alloc_page to obtain order-0 pages, then maps them into a contiguous virtual address range; the underlying physical addresses are non-contiguous
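As a rough illustration of how kernel code uses the allocators above, here is a minimal kernel-module sketch. It assumes a Linux kernel build environment; the object type my_obj and the cache name "my_obj_cache" are invented for the example, while kmalloc(), vmalloc(), and the kmem_cache_* calls are the standard interfaces named in the notes.
------------------------------------------------------------------
#include <linux/module.h>
#include <linux/slab.h>      /* kmalloc, kfree, kmem_cache_*      */
#include <linux/vmalloc.h>   /* vmalloc, vfree                    */

struct my_obj {               /* hypothetical small kernel object */
    int id;
    char name[32];
};

static struct kmem_cache *my_cache;

static int __init alloc_demo_init(void)
{
    char *buf;
    char *big;
    struct my_obj *obj = NULL;

    /* kmalloc: small, physically contiguous memory. */
    buf = kmalloc(128, GFP_KERNEL);

    /* vmalloc: virtually contiguous, possibly physically
     * scattered; not suitable for DMA buffers. */
    big = vmalloc(1024 * 1024);

    /* slab cache: a dedicated cache for one object size, like the
     * kernel's own caches for mm_struct, inode, ... */
    my_cache = kmem_cache_create("my_obj_cache",
                                 sizeof(struct my_obj), 0, 0, NULL);
    if (my_cache)
        obj = kmem_cache_alloc(my_cache, GFP_KERNEL);

    if (obj)
        kmem_cache_free(my_cache, obj);
    if (my_cache)
        kmem_cache_destroy(my_cache);
    vfree(big);               /* vfree() and kfree() accept NULL */
    kfree(buf);
    return 0;
}

static void __exit alloc_demo_exit(void) { }

module_init(alloc_demo_init);
module_exit(alloc_demo_exit);
MODULE_LICENSE("GPL");
------------------------------------------------------------------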
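And a short user-space illustration of the memory-leak case mentioned under "reclaim free memory" (the loop bound and allocation size are arbitrary): the program keeps allocating blocks and losing the only pointer to them, so the memory can never be freed.
------------------------------------------------------------------
#include <stdlib.h>

int main(void)
{
    for (int i = 0; i < 1000; i++) {
        int *p = malloc(1024 * sizeof(int));   /* ~4 KB per iteration */
        if (p)
            p[0] = i;
        /* no free(p): when p goes out of scope the address is lost,
         * so the block can never be freed -- a memory leak that
         * gradually reduces the memory available to the process */
    }
    return 0;
}
------------------------------------------------------------------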
- I/O hardware
  - bus, port, controller
  - I/O access can use polling or interrupts
  - devices usually provide registers for data and for control
  - memory-mapped I/O
    - data and command registers are mapped into the processor's address space
    - the I/O device registers are accessed with normal load/store instructions
  - isolated I/O (see slide 10)
    - separate address spaces for memory and for I/O devices
    - special instructions access the I/O space
    - e.g., on x86 (8086-family) machines:
      - regular instructions like MOV reference the 32-bit memory address space (RAM)
      - the special instructions IN and OUT access a separate 64 KB I/O address space
  - memory-mapped vs isolated I/O (see the sketch at the end of this outline)
    - memory-mapped -> the same instructions that access memory can also access I/O devices
    - isolated -> special instructions are used to access devices -> less flexible for programming
  - programmed I/O
    - the CPU makes a request and waits for the device to be ready
    - for a large data transfer, the CPU repeatedly writes words to main memory
    - a lot of CPU time is needed
    - if the device is slow, the CPU might have to wait for a long time
  - polling
    - continually checking whether a device is ready
    - not an efficient use of the CPU
    - the CPU cannot do much else while waiting
  - interrupt-driven I/O
    - the device interrupts the processor when the data is ready
  - direct memory access (DMA)
    - copies data directly between devices and RAM, bypassing the CPU
    - the OS issues commands to the DMA controller
      - a pointer to the command is written into the command register
      - when done, the device interrupts the CPU to signal completion
    - the DMA controller is a simple processor
      - the CPU asks the DMA controller to transfer data between a device and main memory
      - after that, the CPU can continue with other tasks
      - the DMA controller issues requests to the right I/O device, waits, and manages the transfer between the device and main memory
      - once the transfer is complete, the DMA controller interrupts the CPU
    - constraints of DMA
      - DMA deals with physical addresses
      - it needs contiguous physical memory space
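To tie isolated I/O, programmed I/O, and polling together, here is a sketch using the user-space IN/OUT wrappers Linux provides on x86 (inb()/outb() from <sys/io.h>). It assumes an x86 Linux machine, root privileges for ioperm(), and a legacy i8042 keyboard controller at ports 0x60/0x64; it busy-waits on the status register until the data register has a byte to read, which is exactly the CPU-wasting wait described above.
------------------------------------------------------------------
#include <stdio.h>
#include <sys/io.h>    /* ioperm(), inb(), outb() -- x86 Linux only */

#define KBD_DATA    0x60   /* i8042 keyboard controller data port   */
#define KBD_STATUS  0x64   /* i8042 keyboard controller status port */

int main(void)
{
    /* Ask the kernel for permission to touch ports 0x60..0x64.
     * Requires root (or CAP_SYS_RAWIO). */
    if (ioperm(KBD_DATA, 5, 1) != 0) {
        perror("ioperm");
        return 1;
    }

    printf("polling the keyboard controller; press a key...\n");

    /* Programmed I/O with polling: spin on the status register
     * (isolated I/O, read with the IN instruction) until bit 0
     * reports that the output buffer holds data. The CPU does
     * nothing useful while it waits. */
    while ((inb(KBD_STATUS) & 0x01) == 0)
        ;   /* busy-wait */

    unsigned char scancode = inb(KBD_DATA);
    printf("got scancode 0x%02x\n", scancode);

    ioperm(KBD_DATA, 5, 0);   /* drop port access again */
    return 0;
}
------------------------------------------------------------------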