Linux
Linux runs most of the world's servers, the cloud, smartphones (Android), embedded devices, and supercomputers. That is the convenient shorthand; strictly, "Linux" names only the kernel. What you actually touch as a user is a stack: the Linux kernel + a GNU userland + distribution-specific glue. Richard Stallman's insistence on calling it "GNU/Linux" reflects exactly that layering.
In practice, "Linux" is universally understood to mean "the entire ecosystem of Unix-like OSes running the Linux kernel." This article first untangles that three-layer structure, then covers the foundational design (kernel and userland are separated by the system-call boundary, enforced by CPU privilege levels), followed by the unifying "everything is a file" abstraction, the process / permission / shell layers that shape the daily user experience, and finally how the container revolution falls naturally out of namespaces + cgroups. The goal throughout is the granularity of "understand Linux from one diagram."
1. What Linux is: both "just the kernel" and "the whole OS"
First, untangle the layered meaning of the word "Linux."
| Layer | Contents | Examples |
|---|---|---|
| Kernel | Process management / memory / filesystem / network / device drivers. The only "Linux" that Linus Torvalds and the community govern. | linux-6.x (kernel.org) |
| Userland (GNU + neighbors) | Libraries (glibc / musl), shells (bash / zsh), coreutils (ls cp cat), GNU toolchain (gcc / binutils) | GNU project + util-linux + systemd |
| Distribution | Kernel + userland + package manager + distro-specific config / init scripts / release policy | Debian, Ubuntu, RHEL, Fedora, Arch, Alpine, Android |
So "I installed Ubuntu" = "I installed a distribution that bundles a Linux kernel + GNU toolchain + Debian packaging + Canonical's own Snap / Netplan / configuration." Android also uses the Linux kernel but pairs it with Bionic libc + Java/ART instead of the GNU userland, so whether to call it "Linux" gets murky (the kernel is Linux; the userland is not GNU).
The philosophical "Free Software" vs "Open Source" debate sits in the background, but practically both rest on source published under licenses like GPL/MIT/Apache. The Linux kernel itself is GPLv2, which is the legal hook obligating Android device makers to publish their kernel modifications.
2. Architecture: kernel and userland are split by the system-call boundary
The defining design of Linux (and Unix-likes generally) is that kernel space and user space are physically separated by CPU privilege levels. User processes don't touch hardware directly — they always go through system calls, the well-defined door into the kernel.
The takeaways:
- The kernel is never "part of the app." Apps reach the kernel only through the ~400 system calls, such as `read()`, `write()`, `open()`, `fork()`, `socket()`. The reason `strace` can follow execution is exactly that this boundary is sharp.
- The GNU userland is "a separate project from the Linux kernel." `ls` and `cat` are from GNU coreutils, developed independently of the Linux kernel. When Alpine Linux slims the userland down to musl + BusyBox, the same kernel produces an OS that feels entirely different.
- systemd is "the modern glue between kernel and userland." Process management / cgroups / logging / network configuration / DNS / startup ordering: capabilities that used to be separate are unified. Behind `systemctl start nginx` is a user-space PID 1 daemon at work.
- Dynamic driver loading: a kernel module listed in `lsmod` is loaded on demand by `modprobe`. The reason Linux can start using a newly attached device without a reboot is this design.
```shell
# Trace syscalls live (what an app actually asks the kernel for)
strace -f -e trace=openat,read,write ls /tmp

# Kernel version and build info (compiler, build date)
uname -a
cat /proc/version

# Loaded kernel modules
lsmod

# System-wide syscall stats (requires perf)
sudo perf stat -e 'syscalls:sys_enter_*' -a sleep 5
```
3. Everything is a file: Unix's most influential abstraction
"Everything is a file" is the most influential design choice in the Unix philosophy. Because regular files, directories, devices, pipes, sockets, and symlinks all share the same read() / write() / open() / close() API, programs can be composed without caring what's on the other end. grep foo /var/log/syslog, grep foo < /dev/ttyS0 (a serial port), and cat hello | grep foo all reach the same read() call — that's the actual substance of Unix's simplicity and power.
The most important concepts:
- `/dev` is the door to hardware and virtual devices. `echo "hello" > /dev/tty1` writes to virtual console 1; `dd if=/dev/zero of=test.bin bs=1M count=100` builds a 100 MiB zero-filled file. Both are driven by the same syscall as a write to a regular file.
- `/proc` is a virtual FS the kernel synthesizes on the fly. `/proc/PID/maps` shows a process's memory map; `/proc/PID/fd/` lists its open file descriptors. Nothing is on disk: the kernel composes the text the moment you `cat` it.
- `/sys` exposes the kernel's internal object hierarchy (devices, buses, classes) as a directory tree. Operations like `echo 1400 > /sys/class/net/eth0/mtu` to change a NIC's MTU become possible. (The neighboring `/sys/class/net/eth0/address` file is read-only; changing the MAC goes through `ip link set dev eth0 address 02:11:22:33:44:55`.)
- A symbolic link (its `ls -l` mode string starts with `l`) is a pointer to another path, similar to a Windows shortcut but transparently resolved by the kernel. Used heavily for things like `/lib → /usr/lib` consolidation, or `current → releases/56` swaps in deployments.
```shell
# Check file types
ls -l /dev/null /etc/hosts /lib /tmp/.X11-unix/X0
stat -c '%F %n' /dev/sda /proc/meminfo /sys/class/net/eth0

# Pull process info from /proc
ls /proc/$$/fd        # FDs my shell has open
cat /proc/$$/maps     # memory map
cat /proc/$$/status   # state (incl. VmRSS)

# Change runtime settings via sysctl / procfs
sudo sysctl -w net.ipv4.ip_forward=1
echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward   # the same thing
```
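The "same read() on everything" idea can be checked from the shell. The following is a minimal sketch, not a benchmark: `count_bytes` is a hypothetical helper introduced here, and the 16-byte limit is arbitrary. The helper never learns what kind of file is on the other end of its stdin.

```shell
# count_bytes is a hypothetical helper (not from the article): it reads at most
# 16 bytes from whatever stdin is connected to and prints how many arrived.
count_bytes() { head -c 16 | wc -c | tr -d ' '; }

tmp=$(mktemp)
printf 'hello from a regular file\n' > "$tmp"

from_file=$(count_bytes < "$tmp")                                   # regular file on disk
from_device=$(count_bytes < /dev/zero)                              # character device
from_pipe=$(printf 'hello from an anonymous pipe\n' | count_bytes)  # pipe

rm -f "$tmp"

# Each source delivered the same 16 bytes' worth through the same interface.
echo "file=$from_file device=$from_device pipe=$from_pipe"
```

The helper is written once and reused unchanged; only the redirection decides whether it reads a file, a device, or a pipe.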
4. Processes and signals: fork/exec and PID 1
Linux's process model inherits the distinctive Unix design: "fork() to copy yourself → exec() to replace with another program." When bash runs ls, the shell:
- Calls `fork()` to make a copy of itself (a child process; same memory and FDs as the parent)
- The child calls `execve("/bin/ls", ...)` to replace its own image with ls (PID stays the same)
- The parent shell calls `waitpid()` to wait for the child to exit
This is why user-space processes form a single genealogy descending from PID 1 (a process tree). pstree reveals a chain like systemd (PID 1) → sshd → bash → vim, so you can walk back to your ancestors. When a parent dies before its children, the children become orphans and are adopted by PID 1 (systemd), or by a nearer ancestor that has registered as a subreaper. This is the classic init-reaping rule, still in effect.
Signals are asynchronous notifications delivered to a process by the kernel or by another process. SIGINT from Ctrl+C, SIGTERM from kill PID (default), SIGKILL from kill -9 PID (uncatchable, forced kill), SIGSEGV from a memory error: 31 standard signals are defined, plus a range of real-time signals. Only SIGKILL and SIGSTOP are uncatchable (so users always retain a way to stop a process).
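The fork → exec → wait sequence can be observed from a shell without writing C. This is a rough sketch: the variable names are illustrative, and the exit code 7 is arbitrary.

```shell
# 1) fork + exec + wait: '&' makes the shell fork a child, the child execs `sh`,
#    which exits with 7; the parent then collects that status with `wait`.
sh -c 'exit 7' &
child_pid=$!
child_status=0
wait "$child_pid" || child_status=$?

# 2) exec keeps the PID: the inner shell prints its PID, then replaces its own
#    process image with a fresh `sh`, which prints its PID again.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
pid_before_exec=$(printf '%s\n' "$pids" | sed -n 1p)
pid_after_exec=$(printf '%s\n' "$pids" | sed -n 2p)

echo "child exited with $child_status"
echo "PID before exec: $pid_before_exec, after exec: $pid_after_exec"
```

The two PIDs printed on the last line match, because exec swaps the program, not the process.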
```shell
# Process tree
pstree -s -p $$       # from my shell up to the ancestors

# State and resources
ps -ef                # all processes (UNIX style)
ps auxf               # tree (BSD style)
top                   # real time (or: htop)
ss -tnp               # TCP connections with owning processes
lsof -p PID           # files opened by a PID

# Sending signals
kill -SIGTERM PID     # polite request (app can clean up)
kill -9 PID           # force kill (can leak resources)
kill -SIGUSR1 PID     # app-defined notification (e.g. nginx log rotation)
```
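The difference between a catchable SIGTERM and an uncatchable SIGKILL shows up in the exit status the parent collects. The sketch below assumes a `sleep` that accepts fractional seconds (GNU coreutils and BusyBox do); the file and variable names are illustrative.

```shell
log=$(mktemp)

# A child that installs a SIGTERM handler before announcing that it is ready.
sh -c 'trap "echo cleaned-up; exit 0" TERM
       echo ready
       while :; do sleep 0.1; done' > "$log" &
polite_pid=$!

# Wait (up to about 5 s) until the handler is in place, then ask politely.
tries=0
until grep -q ready "$log" || [ "$tries" -ge 50 ]; do
  sleep 0.1
  tries=$((tries + 1))
done
kill -TERM "$polite_pid"
term_status=0
wait "$polite_pid" || term_status=$?

# SIGKILL cannot be trapped: the shell reports 128 + 9 = 137 for the victim.
sleep 30 &
victim_pid=$!
kill -KILL "$victim_pid"
kill_status=0
wait "$victim_pid" || kill_status=$?

handler_runs=$(grep -c cleaned-up "$log")
rm -f "$log"
echo "SIGTERM: status=$term_status handler_runs=$handler_runs; SIGKILL: status=$kill_status"
```

The first child gets to run its cleanup and exit 0; the second has no say in the matter.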
Zombies (Z state) are children that have exited but whose parent hasn't called wait() yet. Resources are released; only the PID and process-table entry remain. A zombie buildup means the parent has a bug (not waiting on children); leave it long enough and you exhaust the PID space and can't fork any more.
5. The permission model: UID/GID/rwx → setuid → capabilities → namespaces
The Unix permission model started simple: UID (user ID) + GID (group ID) + 9 rwx bits on each file. In the -rwxr-xr-x shown by ls -l, the first character is the file type and the remaining 9 are "owner rwx / group rwx / other rwx."
UID 0 = root is the all-powerful super-user, and doing day-to-day work as a non-root user + reaching for sudo only when necessary is the foundational rule of modern security.
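The 9 bits map directly onto the octal numbers passed to chmod (r = 4, w = 2, x = 1, summed per owner / group / other). A small sketch using a throwaway temp file and GNU-style `stat -c`:

```shell
f=$(mktemp)
chmod 754 "$f"   # owner rwx (4+2+1 = 7), group r-x (4+1 = 5), other r-- (4)

# %a prints the octal mode, %A the same bits in ls -l notation.
mode=$(stat -c '%a %A' "$f")
rm -f "$f"

echo "$mode"     # prints: 754 -rwxr-xr--
```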
But "rwx alone isn't fine-grained enough" came up many times, and the model expanded in stages:
| Feature | What it solved |
|---|---|
| setuid / setgid bits | passwd updating /etc/shadow requires root, but a regular user invokes it → marking the binary setuid runs it as the owner's identity at execution time. Common abuse vector for local privilege escalation, so post-compromise analysts always run find / -perm -4000 to enumerate them |
| POSIX ACL | The "owner / group / other" 3-tier model can't grant distinct rights to multiple users → setfacl adds per-file granular ACLs |
| Capabilities | Instead of "give me all of root," divide root into ~40 capabilities (CAP_NET_BIND_SERVICE = bind ports below 1024, CAP_NET_ADMIN = network configuration, CAP_SYS_ADMIN = catch-all) → grant a daemon only the minimum it needs |
| MAC (Mandatory Access Control) | rwx is DAC (Discretionary) — owners decide → SELinux / AppArmor enforce a system-wide policy like "this binary may only do these operations" |
| Namespaces + cgroups | "Show each process its own world (PID space / FS / network / UID mapping)" + "limit CPU / memory / I/O" → this is exactly what makes containers (§8) |
```shell
# Basic permissions
ls -l /etc/shadow             # `-rw-r-----` root:shadow on Debian/Ubuntu → invisible to regular users
chmod 600 ~/.ssh/id_ed25519   # SSH private key must be 600 (rw------- owner only)

# Find setuid binaries (a baseline survey of the system)
find / -perm -4000 -type f 2>/dev/null

# Capabilities (subdivide root)
sudo setcap cap_net_bind_service=+ep /usr/bin/python3.11
# → python can now bind 80/443 without being root (this applies to every script that interpreter runs)

# SELinux (RHEL family) status
sestatus
ls -lZ /var/www/html

# sudo configuration (use NOPASSWD sparingly)
sudo visudo                   # safe edit of /etc/sudoers
```
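Capabilities are stored as a bitmask, visible as the hex `CapEff` field in `/proc/PID/status`. The sketch below decodes such a mask by hand. `has_cap` is a hypothetical helper, the bit numbers come from `<linux/capability.h>`, and the sample mask is an illustrative value of the kind often seen for root inside a default Docker container, not something measured for this article.

```shell
# has_cap is a hypothetical helper: it succeeds if bit $2 is set in hex mask $1.
has_cap() { [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; }

CAP_NET_BIND_SERVICE=10   # bit positions as defined in <linux/capability.h>
CAP_SYS_ADMIN=21

sample_mask=00000000a80425fb   # illustrative sample, see the note above

if has_cap "$sample_mask" "$CAP_NET_BIND_SERVICE"; then bind_ok=yes; else bind_ok=no; fi
if has_cap "$sample_mask" "$CAP_SYS_ADMIN"; then admin_ok=yes; else admin_ok=no; fi
echo "sample mask: bind low ports=$bind_ok, sys_admin=$admin_ok"

# The same check against the current process, where /proc is available.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)
if [ -n "$cap_eff" ]; then
  if has_cap "$cap_eff" "$CAP_NET_BIND_SERVICE"; then
    echo "this shell may bind ports below 1024"
  else
    echo "this shell may not bind ports below 1024"
  fi
fi
```

Where the libcap tools are installed, `capsh --decode=<mask>` prints the full list of names for a mask.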
6. The shell ties everything together: pipes and standard I/O
The substance of Linux/Unix productivity is stitching small single-purpose programs together with | to build anything. What enables that is "everything is a file" + standard I/O (stdin / stdout / stderr) + pipes (|).
```
┌──────┐ stdout      stdin ┌──────┐ stdout      stdin ┌──────┐
│  ls  │ ────────────────→ │ grep │ ────────────────→ │  wc  │
└──────┘                   └──────┘                   └──────┘
```
ls | grep '\.log$' | wc -l starts three independent processes simultaneously and wires each one's stdout into the next one's stdin, computing "the count of .log files in the current directory" in one line. The kernel mediates via a pipe (FIFO buffer); no shared memory is needed.
| Form | Meaning |
|---|---|
| `cmd > file` | stdout overwrites the file |
| `cmd >> file` | stdout appends to the file |
| `cmd 2> err.log` | only stderr goes to the file |
| `cmd 2>&1` | merge stderr into stdout |
| `cmd < file` | stdin from the file |
| `cmd1 \| cmd2` | cmd1's stdout into cmd2's stdin |
| `cmd1 ; cmd2` | sequential (regardless of cmd1's exit) |
| `cmd1 && cmd2` | run cmd2 only if cmd1 succeeded (exit 0) |
| `cmd1 \|\| cmd2` | run cmd2 only if cmd1 failed |
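The redirection forms above can be tried against a command that writes to both streams. In this sketch `emit` is a hypothetical stand-in for any such command.

```shell
# emit is a hypothetical stand-in for a command that writes to both streams.
emit() { echo "to-stdout"; echo "to-stderr" >&2; }

stdout_only=$(emit 2>/dev/null)      # stderr discarded, stdout captured

# Redirections apply left to right: stderr is pointed at the current stdout
# (the capture) first, and only then is stdout sent to /dev/null.
stderr_only=$(emit 2>&1 >/dev/null)

merged_lines=$(emit 2>&1 | wc -l | tr -d ' ')   # both streams through one pipe

echo "stdout_only=$stdout_only stderr_only=$stderr_only merged_lines=$merged_lines"
```

Swapping the order to `>/dev/null 2>&1` would silence both streams, which is why the position of `2>&1` matters.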
Exit codes are the most important inter-program return value. 0 = success, anything else = some failure. Shell scripts branch on this; CI/CD pipelines decide "build passed or failed" by it.
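A short sketch of branching on exit codes; `describe` is a hypothetical helper and the exit code 3 is arbitrary.

```shell
# describe is a hypothetical helper: it runs a command and reports how it exited.
describe() {
  if "$@" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "failed($?)"   # $? still holds the status of the command tested by `if`
  fi
}

ok_result=$(describe true)               # exit code 0
fail_result=$(describe sh -c 'exit 3')   # any non-zero code counts as failure

# && and || branch on the same value without an explicit if.
and_result=$(true && echo "ran after success")
or_result=$(false || echo "ran after failure")

echo "$ok_result / $fail_result / $and_result / $or_result"
```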
Environment variables (PATH, HOME, LANG, LD_LIBRARY_PATH, …) propagate by copy from parent to child. export FOO=bar makes it visible to subsequently spawned children. bash's set -euo pipefail is the canonical safety setup for scripts: stop immediately on errors (-e), die on undefined variables (-u), and detect failures inside pipelines (pipefail).
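Both behaviors are easy to observe. The sketch below assumes `bash` is installed (pipefail is a bash option); `DEMO_VAR` is a hypothetical variable name used only for this illustration.

```shell
unset DEMO_VAR
DEMO_VAR=local-only                                  # plain shell variable, not exported
seen_before=$(sh -c 'echo "${DEMO_VAR:-unset}"')     # the child does not see it

export DEMO_VAR                                      # now part of the environment
seen_after=$(sh -c 'echo "${DEMO_VAR:-unset}"')      # the child receives a copy

sh -c 'DEMO_VAR=changed-in-child'                    # the child edits only its own copy
parent_value=$DEMO_VAR

# Without pipefail a pipeline reports only the last command's status.
plain_status=0
bash -c 'false | true' || plain_status=$?
strict_status=0
bash -c 'set -o pipefail; false | true' || strict_status=$?

echo "before=$seen_before after=$seen_after parent=$parent_value"
echo "pipeline status: plain=$plain_status pipefail=$strict_status"
```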
7. Distributions: picking one
"Installing Linux" = "picking a distribution." The kernel is the same; package manager / release cycle / default daemons / community character differ.
| Distro | Family | Packages | Character / where it shines |
|---|---|---|---|
| Debian | Original | `apt` (.deb) | Stability first / community-driven / strong long-term server use |
| Ubuntu | Debian-derived | `apt` | Most popular on desktop / LTS releases / Canonical commercial support |
| RHEL (Red Hat Enterprise Linux) | Commercial | `dnf` (.rpm) | Enterprise production standard / subscription / 10-year support |
| Fedora | RHEL upstream | `dnf` | Cutting edge / the testing ground before features descend into RHEL |
| CentOS Stream / Rocky / AlmaLinux | RHEL-compatible | `dnf` | Successors to old CentOS. CentOS Stream now tracks just ahead of RHEL; Rocky / Alma aim to be drop-in RHEL-compatible |
| Arch Linux | Independent | `pacman` | Rolling release / minimalist / DIY culture / ArchWiki is the world's best Linux documentation |
| Alpine Linux | Independent | `apk` | musl + BusyBox makes it tiny (~5 MB) / the de facto base for Docker images |
| Android | (custom) | | Linux kernel + Bionic libc + Java/ART; the most-deployed Linux on Earth |
| WSL2 (on Windows) | Microsoft | (depends on parent distro) | A Linux kernel running on Windows (in a Hyper-V VM); surging adoption as a dev environment |
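Scripts that must run across several of these families usually start by asking which distribution they are on. A minimal sketch, assuming the standard `/etc/os-release` file with its `ID` field; `pkg_manager_for` is a hypothetical helper that simply encodes the table above.

```shell
# pkg_manager_for is a hypothetical helper mapping an os-release ID to the
# package manager listed in the table above.
pkg_manager_for() {
  case "$1" in
    debian|ubuntu)                      echo apt ;;
    rhel|fedora|centos|rocky|almalinux) echo dnf ;;
    arch)                               echo pacman ;;
    alpine)                             echo apk ;;
    *)                                  echo unknown ;;
  esac
}

# Source the file in a subshell so its variables do not leak into this shell.
if [ -r /etc/os-release ]; then
  distro_id=$(. /etc/os-release && echo "${ID:-unknown}")
else
  distro_id=unknown
fi

echo "$distro_id uses $(pkg_manager_for "$distro_id")"
```

Derivatives not listed here fall through to `unknown`; a fuller version would also consult the `ID_LIKE` field.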
How to pick:
- Production servers → RHEL / Rocky / Ubuntu LTS (long support windows of up to about 10 years, enterprise track record)
- Workstation / desktop → Ubuntu / Fedora / Arch (a matter of taste)
- Container base → Alpine (small) or Debian slim / Ubuntu minimal (compatibility-first)
- Reviving old hardware → lightweight derivatives (Lubuntu, MX Linux, …)
- Learning → Arch (build it from `pacstrap`) or Debian (minimal by default)
8. The container revolution: falling naturally out of namespaces + cgroups
Docker / Kubernetes swept the late 2010s. What actually happened inside Linux? The answer is "two features the Linux kernel already had finally got combined and used."
- namespaces (PID / network / mount / UTS / IPC / user): show each process its own world
  - PID namespace: inside a container, `ps` only shows the container's own processes
  - network namespace: its own eth0 / routing table
  - mount namespace: its own filesystem as `/`
  - user namespace: container root looks like a non-root from the host (rootless containers)
- cgroups (control groups): limit and measure CPU / memory / I/O / pid count at the process-group level
Combine these with overlayfs (a stacking filesystem) and you get "a self-contained mini-OS running per process on top of the host OS" = a container. Docker first reached for these via lxc, later wrote runc, and the modern world standardizes on OCI runtimes.
"Containers are lighter than VMs" because they skip hardware virtualization (Hyper-V, KVM, VMware) — the kernel is shared with the host, with only process isolation, no hardware emulation. The cost: kernel vulnerabilities can break container isolation (CVE-2022-0185, CVE-2024-1086 are well-known examples).
```shell
# Look at "the container is just Linux processes"
docker inspect --format '{{.State.Pid}}' my-container          # PID on the host
sudo ls -l /proc/<PID>/ns                                      # that PID's namespaces
sudo nsenter -t <PID> -n -p ip addr                            # enter its network + PID namespaces and check IP
cat /sys/fs/cgroup/system.slice/docker-<ID>.scope/cpu.stat     # cgroup CPU stats (cgroup v2, systemd driver)
```
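Namespace membership can be inspected without Docker at all. Each entry under `/proc/PID/ns` is a symlink whose target names the namespace type and an inode number, and two processes share a namespace exactly when those targets match. The sketch below only reads those links and assumes a Linux `/proc`; the variable names are illustrative.

```shell
if [ -e /proc/self/ns/pid ]; then
  # Namespace identity of this shell, e.g. pid:[<inode>]
  shell_pid_ns=$(readlink "/proc/$$/ns/pid")
  shell_net_ns=$(readlink "/proc/$$/ns/net")

  # An ordinary child inherits its parent's namespaces, so the target matches.
  child_pid_ns=$(sh -c 'readlink /proc/$$/ns/pid')

  echo "shell: $shell_pid_ns $shell_net_ns"
  echo "child: $child_pid_ns"
else
  echo "no /proc/<PID>/ns here (not Linux, or /proc is not mounted)"
fi
```

Run the same `readlink` against a container's host PID and the inode numbers differ from the shell's: that difference is the isolation.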
Kubernetes layers on top another tier of "automatic placement, healing, and scaling for clusters of containers." Two layers: the Linux kernel runs the containers, Kubernetes manages them.
9. Where modern Linux actually runs
| Domain | Share / status |
|---|---|
| Cloud servers | 95%+. Default images on AWS EC2 / Azure VM / GCP CE are Linux. Managed databases, Kubernetes workers, Lambda, Fargate — all on Linux |
| Smartphones (Android) | 70%+ of the world is Android = the Linux kernel. The most-deployed OS on Earth |
| Embedded | Routers, TVs, refrigerators, cars, industrial gear, smart watches, IoT — uncountable, mostly invisible |
| Supercomputers | 100% of the TOP500 since 2017. HPC is effectively only Linux |
| Desktop | ~3-5%. Windows / macOS dominate, though the developer / scientist / engineer community is large |
| WSL2 (on Windows) | Surging. An era where Microsoft officially ships a Linux kernel inside Windows |
"Powering servers, mobile, and every embedded thing — but a minority on desktops" is Linux's modern position. Windows and macOS compete as OS products; Linux occupies a different layer as infrastructure OS.
Linux began as "the kernel Linus Torvalds wrote as a hobby in 1991" and grew into the name for the OS ecosystem powering more than half the world's computing. The keys to understanding it: the three-layer structure of "kernel = Linux proper / userland = GNU + distro / both physically separated by syscalls," and the fact that "everything is a file" / "fork/exec" / "pipe + exit codes" / "rwx → capabilities → namespaces" — design choices inherited from Unix — all remain alive today.
Cloud, containers, smartphones — all rest on the fact that "Linux's design philosophy continues to apply directly to modern infrastructure." Containers are just "using Linux features in a new combination." Kubernetes is "another tier on top of Linux containers." Serverless is "running a function inside a tiny VM (Firecracker etc.) on top of a Linux kernel." Wherever you dig, you land on Linux's syscalls, namespaces, and cgroups. Once you understand Linux properly, you can confront most of modern infrastructure with the same vocabulary.