How many times I’ve heard “well, a container is like a super light-weight virtual machine“. And yes, true, I admit as well, that I was one of them.
But I wasn’t happy about this answer, so I did some researches and I think now I have a better understanding and I feel the pain of my friends where I was simplistically (and wrongly) saying that – public apologies 😛 🙂
So… let’s start…
Concept 1: Virtual memory.
Virtual memory is the collective memory used by processes (RAM, disk swap, etc).
Of this virtual memory, we have generally a separation beween 2 types:
- kernel space: reserverd for the kernel and generally drivers
- user space: for the applications, incluse libraries
This separation serves to provide memory protection and hardware protection from malicious or errant software behavior.
NOTE1: User space is not namespace.
NOTE2: FUSE is not really related with this topic, but could confuse someone. So, just to clarify: FUSE – (Filesystem in Userspace) is a software interface for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a “bridge” to the actual kernel interfaces.
Modern kernels have cgroups and namespace capabilities.
- Cgroups can restrict what you can USE -> CPU, memory, storage, network, devices, etc. Also allows to ‘freeze’.
- Namespace can restrict what you SEE -> PID, mnt, UID/GID, etc…
Containers runtimes (like LXC, Docker, etc…) are using cgroups and namespaces to create separate isolated user-space entities called ‘containers‘.
Containers have basically no overhead because they are using the same system calls to the host kernel => No need of emuation or virtual machine.
They use the same kernel of the host (this is a key difference with virtualisation). So, currently, you cannot run Windows containers on a Linux host. But you can still run different versions of Linux, as they all share the same kernel.
Virtualisation: fully isolated OS, running its own kernel.
- Full virtualised: (eg. VMWare, Virtuabox, ESXi…). The OS in the VM is not aware to be a VM. Hypervisor emulates the hardware platform for the guest OS and then translates the hardware accesses requests to the physical hardware. Hypervisor provides the drivers to the guest OS.
=> higher overhead because hardware virtualisation BUT best isolation and security
- Para virtualised: (XEN, KVM) the OS in the VM knows to be virtualised. Drivers are sending instructions directly to the hardware of the host, via the Hypervisor. Hardware is not virtualised BUT the OS runs in isolation.
=> better performance and ability to use recent hardware drivers directly BUT guest OS needs to be modified to use paravirtualised devices
NOTE: Emulation is not platform virtualisation (e.g. QEMU)
With emulation you can emulate different architectures (e.g. ARM/RISC…) on a host that has a differnt instruction set (eg. i386). Performances are cleary not ideal.