4. Process
- What is a process?
  - A program in execution
- What is inside a process?
  - code, data, stack, heap -> user space
  - state in the OS, kernel stack -> kernel space
- The first process
  - Unix: /sbin/init
    - created by the Linux kernel during booting
    - ancestor of all other user processes (they are forked from it, directly or indirectly)
  - Who creates the first process in Linux?
    - start_kernel() initializes the kernel subsystems (including the scheduler, via sched_init()), then calls rest_init(), which spawns the kernel thread that becomes the first user space process, init (PID 1)
- Process tree
  - Linux: pstree
  - processes in the system are arranged in the form of a tree
  - e.g. init->init.d->dhclient
- What are the differences between a program and a process?
  - Program
    - a program can give rise to several processes
    - ELF header + program-header table + .text + .data + .bss
    - stored on the hard drive
  - Process
    - a process is a unique, isolated entity
    - dynamic instance of the code + heap + stack + process state
    - resides in main memory
- Process address space
  - Each process has a different address space
  - Virtual address mapping
    - Each process has its own page table
  - What are the advantages of a virtual address map?
    - Isolation (private address space)
    - Relocatable
      - data and code within the process are relocatable
    - Size
      - Processes can be much larger than physical memory
- Process control block (PCB)
  - Holds important process information for process management
  - What is inside the PCB?
    - size of process memory, page directory pointer, kernel stack pointer ...
    - context pointer
      - points to the registers saved for context switches
      - %edi, %esi, %ebx, %ebp, %eip
      - located in the kernel stack space
    - process identifier (PID)
      - sequentially incremented number
      - wraps around and continues to increment when the maximum is reached
    - process state
      - embryo: the new process is being created
      - runnable: ready to run
      - running: currently executing
      - sleeping: blocked (e.g. waiting for I/O)
    - trapframe
      - keeps the process state during trap handling
      - used to resume the process after the trap
  - Process table
    - An array of PCBs
    - Stores the PCBs of all current processes, not only the one that is running
    - Includes process id, priority, state, resource usage
  - Why does a process need a PCB?
    - Allows the process to resume execution after a while
    - Keeps track of the resources used
    - Tracks the process state
- Process's stack
  - User space stack
    - used when executing user code
  - Kernel space stack
    - used when executing kernel code (e.g. a system call)
  - Why does the OS create separate user and kernel space stacks?
    - Security -> the kernel can execute even if the user stack is corrupted
- fork()
  - clones the calling process
  - in the parent process, fork() returns the child's pid
  - in the child process, fork() returns 0
  - All pages are initially shared between parent and child; with copy-on-write (as in Linux), a page is copied only when one of them writes to it
- Process termination
  - exit (status)
    - OS frees the process's resources
  - kill (pid, signal)
    - The signal can be sent by another process
    - The signal can force the process to terminate
  - Zombie process
    - The PCB still exists in the OS even though the program is no longer executing (until the parent collects the exit status with wait())
  - Orphan process
    - A process whose parent terminates before it does
- Threads
  - A process can have multiple threads
  - Each thread executes independently
  - Threads share the same address space (code, heap)
  - Each thread can run a different part of the program
  - Each thread has a separate stack for independent function calls
  - Thread creation
    - pthread_create()
    - Threads T1 and T2 share parts of the address space
    - Global variables can be used for communication
  - The context of a thread (PC, registers) is saved in a thread control block (TCB)
  - Why do we need threads in OS?
    - Let a single process utilize multiple CPU cores effectively (parallelism)
    - Concurrency
      - Running multiple threads even on a single CPU core by interleaving their execution
      - Concurrency keeps the CPU in effective use even without parallelism (e.g. overlapping I/O with other activities)
- Context switch
  - Why do we need the context switch?
  - What kind of contexts are saved?
    - 5 registers: %edi, %esi, %ebx, %ebp, %eip
    - Contexts are stored at the bottom (lowest used address) of the process's kernel stack
  - How to switch between processes?
    - A cooperative approach: the process gives up the CPU through system calls or traps (e.g. yield, wait)
    - A non-cooperative approach: the OS takes control
      - A timer interrupt
      - The OS informs the hardware which code to run when the timer interrupt occurs
  - Steps
    - save the current process state
      - save the register values of the currently-executing process onto its kernel stack
    - load the state of the next process
      - restore a few register values for the soon-to-be-executing process from its kernel stack
    - continue execution of the next process
  - Context switch overhead
    - direct factors
      - timer interrupt latency
      - saving/restoring contexts
      - finding the next process to execute
    - indirect factors
      - TLB needs to be reloaded
      - loss of cache locality (more cache misses)
      - processor pipeline flush
  - Context switch quantum
    - A short quantum
    - A long quantum
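
The notes list a short and a long quantum without spelling out the trade-off. The sketch below is one way to see it, under stated assumptions: it is a toy round-robin simulation, not how any real scheduler is implemented. The function name `round_robin` and the parameters `bursts`, `quantum` and `switch_cost` are hypothetical names chosen for illustration; all processes are assumed to arrive at time 0 and every context switch is assumed to cost a fixed amount of time.

```python
from collections import deque


def round_robin(bursts, quantum, switch_cost):
    """Toy round-robin simulation (all processes arrive at time 0).

    bursts      -- CPU time each process needs (hypothetical workload)
    quantum     -- maximum time a process runs before being preempted
    switch_cost -- assumed fixed cost of one context switch

    Returns (context_switches, total_time, avg_first_response).
    """
    queue = deque(enumerate(bursts))  # entries are (pid, remaining time)
    clock = 0
    switches = 0
    first_run = {}  # pid -> time at which it first got the CPU

    while queue:
        pid, remaining = queue.popleft()
        first_run.setdefault(pid, clock)

        # Run for one quantum, or less if the process finishes earlier.
        run = min(quantum, remaining)
        clock += run
        remaining -= run

        if remaining > 0:
            queue.append((pid, remaining))  # preempted: back of the queue

        # Pay the switch cost only when a different process runs next.
        if queue and queue[0][0] != pid:
            clock += switch_cost
            switches += 1

    avg_response = sum(first_run.values()) / len(first_run)
    return switches, clock, avg_response


for q in (1, 10):
    switches, total, response = round_robin([10, 10, 10], q, 1)
    print(f"quantum={q}: switches={switches}, total={total}, "
          f"avg first response={response}")
```

With three 10-unit processes and a switch cost of 1, the short quantum (1) gives every process the CPU quickly (average first response 2.0) but pays for 29 context switches, while the long quantum (10) needs only 2 switches but makes later processes wait much longer (average first response 11.0).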